linux - linux

	Commit message (Collapse)	Author	Age	Files	Lines
*	Merge tag 'pm+acpi-4.2-rc1' of ↵	Linus Torvalds	2015-06-23	13	-492/+649
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management and ACPI updates from Rafael Wysocki: "The rework of backlight interface selection API from Hans de Goede stands out from the number of commits and the number of affected places perspective. The cpufreq core fixes from Viresh Kumar are quite significant too as far as the number of commits goes and because they should reduce CPU online/offline overhead quite a bit in the majority of cases. From the new featues point of view, the ACPICA update (to upstream revision 20150515) adding support for new ACPI 6 material to ACPICA is the one that matters the most as some new significant features will be based on it going forward. Also included is an update of the ACPI device power management core to follow ACPI 6 (which in turn reflects the Windows' device PM implementation), a PM core extension to support wakeup interrupts in a more generic way and support for the ACPI _CCA device configuration object. The rest is mostly fixes and cleanups all over and some documentation updates, including new DT bindings for Operating Performance Points. There is one fix for a regression introduced in the 4.1 cycle, but it adds quite a number of lines of code, it wasn't really ready before Thursday and you were on vacation, so I refrained from pushing it on the last minute for 4.1. Specifics: - ACPICA update to upstream revision 20150515 including basic support for ACPI 6 features: new ACPI tables introduced by ACPI 6 (STAO, XENV, WPBT, NFIT, IORT), changes related to the other tables (DTRM, FADT, LPIT, MADT), new predefined names (_BTH, _CR3, _DSD, _LPI, _MTL, _PRR, _RDI, _RST, _TFP, _TSN), fixes and cleanups (Bob Moore, Lv Zheng). - ACPI device power management core code update to follow ACPI 6 which reflects the ACPI device power management implementation in Windows (Rafael J Wysocki). - rework of the backlight interface selection logic to reduce the number of kernel command line options and improve the handling of DMI quirks that may be involved in that and to make the code generally more straightforward (Hans de Goede). - fixes for the ACPI Embedded Controller (EC) driver related to the handling of EC transactions (Lv Zheng). - fix for a regression related to the ACPI resources management and resulting from a recent change of ACPI initialization code ordering (Rafael J Wysocki). - fix for a system initialization regression related to ACPI introduced during the 3.14 cycle and caused by running the code that switches the platform over to the ACPI mode too early in the initialization sequence (Rafael J Wysocki). - support for the ACPI _CCA device configuration object related to DMA cache coherence (Suravee Suthikulpanit). - ACPI/APEI fixes and cleanups (Jiri Kosina, Borislav Petkov). - ACPI battery driver cleanups (Luis Henriques, Mathias Krause). - ACPI processor driver cleanups (Hanjun Guo). - cleanups and documentation update related to the ACPI device properties interface based on _DSD (Rafael J Wysocki). - ACPI device power management fixes (Rafael J Wysocki). - assorted cleanups related to ACPI (Dominik Brodowski, Fabian Frederick, Lorenzo Pieralisi, Mathias Krause, Rafael J Wysocki). - fix for a long-standing issue causing General Protection Faults to be generated occasionally on return to user space after resume from ACPI-based suspend-to-RAM on 32-bit x86 (Ingo Molnar). - fix to make the suspend core code return -EBUSY consistently in all cases when system suspend is aborted due to wakeup detection (Ruchi Kandoi). - support for automated device wakeup IRQ handling allowing drivers to make their PM support more starightforward (Tony Lindgren). - new tracepoints for suspend-to-idle tracing and rework of the prepare/complete callbacks tracing in the PM core (Todd E Brandt, Rafael J Wysocki). - wakeup sources framework enhancements (Jin Qian). - new macro for noirq system PM callbacks (Grygorii Strashko). - assorted cleanups related to system suspend (Rafael J Wysocki). - cpuidle core cleanups to make the code more efficient (Rafael J Wysocki). - powernv/pseries cpuidle driver update (Shilpasri G Bhat). - cpufreq core fixes related to CPU online/offline that should reduce the overhead of these operations quite a bit, unless the CPU in question is physically going away (Viresh Kumar, Saravana Kannan). - serialization of cpufreq governor callbacks to avoid race conditions in some cases (Viresh Kumar). - intel_pstate driver fixes and cleanups (Doug Smythies, Prarit Bhargava, Joe Konno). - cpufreq driver (arm_big_little, cpufreq-dt, qoriq) updates (Sudeep Holla, Felipe Balbi, Tang Yuantian). - assorted cleanups in cpufreq drivers and core (Shailendra Verma, Fabian Frederick, Wang Long). - new Device Tree bindings for representing Operating Performance Points (Viresh Kumar). - updates for the common clock operations support code in the PM core (Rajendra Nayak, Geert Uytterhoeven). - PM domains core code update (Geert Uytterhoeven). - Intel Knights Landing support for the RAPL (Running Average Power Limit) power capping driver (Dasaratharaman Chandramouli). - fixes related to the floor frequency setting on Atom SoCs in the RAPL power capping driver (Ajay Thomas). - runtime PM framework documentation update (Ben Dooks). - cpupower tool fix (Herton R Krzesinski)" * tag 'pm+acpi-4.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (194 commits) cpuidle: powernv/pseries: Auto-promotion of snooze to deeper idle state x86: Load __USER_DS into DS/ES after resume PM / OPP: Add binding for 'opp-suspend' PM / OPP: Allow multiple OPP tables to be passed via DT PM / OPP: Add new bindings to address shortcomings of existing bindings ACPI: Constify ACPI device IDs in documentation ACPI / enumeration: Document the rules regarding the PRP0001 device ID ACPI / video: Make acpi_video_unregister_backlight() private acpi-video-detect: Remove old API toshiba-acpi: Port to new backlight interface selection API thinkpad-acpi: Port to new backlight interface selection API sony-laptop: Port to new backlight interface selection API samsung-laptop: Port to new backlight interface selection API msi-wmi: Port to new backlight interface selection API msi-laptop: Port to new backlight interface selection API intel-oaktrail: Port to new backlight interface selection API ideapad-laptop: Port to new backlight interface selection API fujitsu-laptop: Port to new backlight interface selection API eeepc-laptop: Port to new backlight interface selection API dell-wmi: Port to new backlight interface selection API ...
\| *	cpufreq: dt: allow driver to boot automatically	Felipe Balbi	2015-06-17	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	by adding the missing MODULE_ALIAS(), cpufreq-dt can be autoloaded by udev/systemd. Signed-off-by: Felipe Balbi <balbi@ti.com> Acked-by: Nishanth Menon <nm@ti.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
\| *	intel_pstate: Fix overflow in busy_scaled due to long delay	Prarit Bhargava	2015-06-16	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The kernel may delay interrupts for a long time which can result in timers being delayed. If this occurs the intel_pstate driver will crash with a divide by zero error: divide error: 0000 [#1] SMP Modules linked in: btrfs zlib_deflate raid6_pq xor msdos ext4 mbcache jbd2 binfmt_misc arc4 md4 nls_utf8 cifs dns_resolver tcp_lp bnep bluetooth rfkill fuse dm_service_time iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_ftp ip6t_rpfilter ip6t_REJECT ipt_REJECT xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw iptable_filter ip_tables intel_powerclamp coretemp vfat fat kvm_intel iTCO_wdt iTCO_vendor_support ipmi_devintf sr_mod kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel cdc_ether lrw usbnet cdrom mii gf128mul glue_helper ablk_helper cryptd lpc_ich mfd_core pcspkr sb_edac edac_core ipmi_si ipmi_msghandler ioatdma wmi shpchp acpi_pad nfsd auth_rpcgss nfs_acl lockd uinput dm_multipath sunrpc xfs libcrc32c usb_storage sd_mod crc_t10dif crct10dif_common ixgbe mgag200 syscopyarea sysfillrect sysimgblt mdio drm_kms_helper ttm igb drm ptp pps_core dca i2c_algo_bit megaraid_sas i2c_core dm_mirror dm_region_hash dm_log dm_mod CPU: 113 PID: 0 Comm: swapper/113 Tainted: G W -------------- 3.10.0-229.1.2.el7.x86_64 #1 Hardware name: IBM x3950 X6 -[3837AC2]-/00FN827, BIOS -[A8E112BUS-1.00]- 08/27/2014 task: ffff880fe8abe660 ti: ffff880fe8ae4000 task.ti: ffff880fe8ae4000 RIP: 0010:[<ffffffff814a9279>] [<ffffffff814a9279>] intel_pstate_timer_func+0x179/0x3d0 RSP: 0018:ffff883fff4e3db8 EFLAGS: 00010206 RAX: 0000000027100000 RBX: ffff883fe6965100 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000010 RDI: 000000002e53632d RBP: ffff883fff4e3e20 R08: 000e6f69a5a125c0 R09: ffff883fe84ec001 R10: 0000000000000002 R11: 0000000000000005 R12: 00000000000049f5 R13: 0000000000271000 R14: 00000000000049f5 R15: 0000000000000246 FS: 0000000000000000(0000) GS:ffff883fff4e0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f7668601000 CR3: 000000000190a000 CR4: 00000000001407e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Stack: ffff883fff4e3e58 ffffffff81099dc1 0000000000000086 0000000000000071 ffff883fff4f3680 0000000000000071 fbdc8a965e33afee ffffffff810b69dd ffff883fe84ec000 ffff883fe6965108 0000000000000100 ffffffff814a9100 Call Trace: <IRQ> [<ffffffff81099dc1>] ? run_posix_cpu_timers+0x51/0x840 [<ffffffff810b69dd>] ? trigger_load_balance+0x5d/0x200 [<ffffffff814a9100>] ? pid_param_set+0x130/0x130 [<ffffffff8107df56>] call_timer_fn+0x36/0x110 [<ffffffff814a9100>] ? pid_param_set+0x130/0x130 [<ffffffff8107fdcf>] run_timer_softirq+0x21f/0x320 [<ffffffff81077b2f>] __do_softirq+0xef/0x280 [<ffffffff816156dc>] call_softirq+0x1c/0x30 [<ffffffff81015d95>] do_softirq+0x65/0xa0 [<ffffffff81077ec5>] irq_exit+0x115/0x120 [<ffffffff81616355>] smp_apic_timer_interrupt+0x45/0x60 [<ffffffff81614a1d>] apic_timer_interrupt+0x6d/0x80 <EOI> [<ffffffff814a9c32>] ? cpuidle_enter_state+0x52/0xc0 [<ffffffff814a9c28>] ? cpuidle_enter_state+0x48/0xc0 [<ffffffff814a9d65>] cpuidle_idle_call+0xc5/0x200 [<ffffffff8101d14e>] arch_cpu_idle+0xe/0x30 [<ffffffff810c67c1>] cpu_startup_entry+0xf1/0x290 [<ffffffff8104228a>] start_secondary+0x1ba/0x230 Code: 42 0f 00 45 89 e6 48 01 c2 43 8d 44 6d 00 39 d0 73 26 49 c1 e5 08 89 d2 4d 63 f4 49 63 c5 48 c1 e2 08 48 c1 e0 08 48 63 ca 48 99 <48> f7 f9 48 98 4c 0f af f0 49 c1 ee 08 8b 43 78 c1 e0 08 44 29 RIP [<ffffffff814a9279>] intel_pstate_timer_func+0x179/0x3d0 RSP <ffff883fff4e3db8> The kernel values for cpudata for CPU 113 were: struct cpudata { cpu = 113, timer = { entry = { next = 0x0, prev = 0xdead000000200200 }, expires = 8357799745, base = 0xffff883fe84ec001, function = 0xffffffff814a9100 <intel_pstate_timer_func>, data = 18446612406765768960, <snip> i_gain = 0, d_gain = 0, deadband = 0, last_err = 22489 }, last_sample_time = { tv64 = 4063132438017305 }, prev_aperf = 287326796397463, prev_mperf = 251427432090198, sample = { core_pct_busy = 23081, aperf = 2937407, mperf = 3257884, freq = 2524484, time = { tv64 = 4063149215234118 } } } which results in the time between samples = last_sample_time - sample.time = 4063149215234118 - 4063132438017305 = 16777216813 which is 16.777 seconds. The duration between reads of the APERF and MPERF registers overflowed a s32 sized integer in intel_pstate_get_scaled_busy()'s call to div_fp(). The result is that int_tofp(duration_us) == 0, and the kernel attempts to divide by 0. While the kernel shouldn't be delaying for a long time, it can and does happen and the intel_pstate driver should not panic in this situation. This patch changes the div_fp() function to use div64_s64() to allow for "long" division. This will avoid the overflow condition on long delays. [v2]: use div64_s64() in div_fp() Signed-off-by: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
\| *	cpufreq: qoriq: optimize the CPU frequency switching time	Tang Yuantian	2015-06-15	1	-11/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Each time the CPU switches its frequency, the clock nodes in DTS are walked through to find proper clock source. This is very time-consuming, for example, it is up to 500+ us on T4240. Besides, switching time varies from clock to clock. To optimize this, each input clock of CPU is buffered, so that it can be picked up instantly when needed. Since for each CPU each input clock is stored in a pointer which takes 4 or 8 bytes memory and normally there are several input clocks per CPU, that will not take much memory as well. Signed-off-by: Tang Yuantian <Yuantian.Tang@freescale.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
\| *	cpufreq: gx-suspmod: Fix two typos in two comments	Shailendra Verma	2015-06-15	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Shailendra Verma <shailendra.capricorn@gmail.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
\| *	cpufreq: nforce2: Fix typo in comment to function nforce2_init()	Shailendra Verma	2015-06-15	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Shailendra Verma <shailendra.capricorn@gmail.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
\| *	cpufreq: governor: Serialize governor callbacks	Viresh Kumar	2015-06-15	4	-18/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There are several races reported in cpufreq core around governors (only ondemand and conservative) by different people. There are at least two race scenarios present in governor code: (a) Concurrent access/updates of governor internal structures. It is possible that fields such as 'dbs_data->usage_count', etc. are accessed simultaneously for different policies using same governor structure (i.e. CPUFREQ_HAVE_GOVERNOR_PER_POLICY flag unset). And because of this we can dereference bad pointers. For example consider a system with two CPUs with separate 'struct cpufreq_policy' instances. CPU0 governor: ondemand and CPU1: powersave. CPU0 switching to powersave and CPU1 to ondemand: CPU0 CPU1 store* store* cpufreq_governor_exit() cpufreq_governor_init() dbs_data = cdata->gdbs_data; if (!--dbs_data->usage_count) kfree(dbs_data); dbs_data->usage_count++; Bad pointer dereference There are other races possible between EXIT and START/STOP/LIMIT as well. Its really complicated. (b) Switching governor state in bad sequence: For example trying to switch a governor to START state, when the governor is in EXIT state. There are some checks present in __cpufreq_governor() but they aren't sufficient as they compare events against 'policy->governor_enabled', where as we need to take governor's state into account, which can be used by multiple policies. These two issues need to be solved separately and the responsibility should be properly divided between cpufreq and governor core. The first problem is more about the governor core, as it needs to protect its structures properly. And the second problem should be fixed in cpufreq core instead of governor, as its all about sequence of events. This patch is trying to solve only the first problem. There are two types of data we need to protect, - 'struct common_dbs_data': No matter what, there is going to be a single copy of this per governor. - 'struct dbs_data': With CPUFREQ_HAVE_GOVERNOR_PER_POLICY flag set, we will have per-policy copy of this data, otherwise a single copy. Because of such complexities, the mutex present in 'struct dbs_data' is insufficient to solve our problem. For example we need to protect fetching of 'dbs_data' from different structures at the beginning of cpufreq_governor_dbs(), to make sure it isn't currently being updated. This can be fixed if we can guarantee serialization of event parsing code for an individual governor. This is best solved with a mutex per governor, and the placeholder for that is 'struct common_dbs_data'. And so this patch moves the mutex from 'struct dbs_data' to 'struct common_dbs_data' and takes it at the beginning and drops it at the end of cpufreq_governor_dbs(). Tested with and without following configuration options: CONFIG_LOCKDEP_SUPPORT=y CONFIG_DEBUG_RT_MUTEXES=y CONFIG_DEBUG_PI_LIST=y CONFIG_DEBUG_SPINLOCK=y CONFIG_DEBUG_MUTEXES=y CONFIG_DEBUG_LOCK_ALLOC=y CONFIG_PROVE_LOCKING=y CONFIG_LOCKDEP=y CONFIG_DEBUG_ATOMIC_SLEEP=y Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
\| *	cpufreq: governor: split cpufreq_governor_dbs()	Viresh Kumar	2015-06-15	1	-140/+189
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	cpufreq_governor_dbs() is hardly readable, it is just too big and complicated. Lets make it more readable by splitting out event specific routines. Order of statements is changed at few places, but that shouldn't bring any functional change. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
\| *	cpufreq: governor: register notifier from cs_init()	Viresh Kumar	2015-06-15	4	-38/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Notifiers are required only for conservative governor and the common governor code is unnecessarily polluted with that. Handle that from cs_init/exit() instead of cpufreq_governor_dbs(). Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
\| *	cpufreq: Remove cpufreq_update_policy()	Viresh Kumar	2015-06-11	1	-19/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	cpufreq_update_policy() was kept as a separate routine earlier as it was handling migration of sysfs directories, which isn't the case anymore. It is only updating policy->cpu now and is called by a single caller. The WARN_ON() isn't really required anymore, as we are just updating the cpu now, not moving the sysfs directories. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
\| *	cpufreq: Restart governor as soon as possible	Viresh Kumar	2015-06-11	1	-33/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	__cpufreq_remove_dev_finish() is doing two things today: - Restarts the governor if some CPUs from concerned policy are still online. - Frees the policy if all CPUs are offline. The first task of restarting the governor can be moved to __cpufreq_remove_dev_prepare() to restart the governor early. There is no race between _prepare() and _finish() as they would be handling completely different cases. _finish() will only be required if we are going to free the policy and that has nothing to do with restarting the governor. Original-by: Saravana Kannan <skannan@codeaurora.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
\| *	cpufreq: Call cpufreq_policy_put_kobj() from cpufreq_policy_free()	Viresh Kumar	2015-06-11	1	-11/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	cpufreq_policy_put_kobj() is actually part of freeing the policy and can be called from cpufreq_policy_free() directly instead of a separate call. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
\| *	cpufreq: Initialize policy->kobj while allocating policy	Viresh Kumar	2015-06-11	1	-25/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	policy->kobj is required to be initialized once in the lifetime of a policy. Currently we are initializing it from __cpufreq_add_dev() and that doesn't look to be the best place for doing so as we have to do this on special cases (like: !recover_policy). We can initialize it from a more obvious place cpufreq_policy_alloc() and that will make code look cleaner, specially the error handling part. The error handling part of __cpufreq_add_dev() was doing almost the same thing while recover_policy is true or false. Fix that as well by always calling cpufreq_policy_put_kobj() with an additional parameter to skip notification part of it. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
\| *	cpufreq: Stop migrating sysfs files on hotplug	Viresh Kumar	2015-06-11	1	-85/+135
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When we hot-unplug a cpu, we remove its sysfs cpufreq directory and if the outgoing cpu was the owner of policy->kobj earlier then we migrate the sysfs directory to under another online cpu. There are few disadvantages this brings: - Code Complexity - Slower hotplug/suspend/resume - sysfs file permissions are reset after all policy->cpus are offlined - CPUFreq stats history lost after all policy->cpus are offlined - Special management of sysfs stuff during suspend/resume To overcome these, this patch modifies the way sysfs directories are managed: - Select sysfs kobjects owner while initializing policy and don't change it during hotplugs. Track it with kobj_cpu created earlier. - Create symlinks for all related CPUs (can be offline) instead of affected CPUs on policy initialization and remove them only when the policy is freed. - Free policy structure only on the removal of cpufreq-driver and not during hotplug/suspend/resume, detected by checking 'struct subsys_interface *' (Valid only when called from subsys_interface_unregister() while unregistering driver). Apart from this, special care is taken to handle physical hoplug of CPUs as we wouldn't remove sysfs links or remove policies on logical hotplugs. Physical hotplug happens in the following sequence. Hot removal: - CPU is offlined first, ~ 'echo 0 > /sys/devices/system/cpu/cpuX/online' - Then its device is removed along with all sysfs files, cpufreq core notified with cpufreq_remove_dev() callback from subsys-interface.. Hot addition: - First the device along with its sysfs files is added, cpufreq core notified with cpufreq_add_dev() callback from subsys-interface.. - CPU is onlined, ~ 'echo 1 > /sys/devices/system/cpu/cpuX/online' We call the same routines with both hotplug and subsys callbacks, and we sense physical hotplug with cpu_offline() check in subsys callback. We can handle most of the stuff with regular hotplug callback paths and add/remove cpufreq sysfs links or free policy from subsys callbacks. Original-by: Saravana Kannan <skannan@codeaurora.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
\| *	cpufreq: Don't allow updating inactive policies from sysfs	Viresh Kumar	2015-06-10	1	-0/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Later commits would change the way policies are managed today. Policies wouldn't be freed on cpu hotplug (currently they aren't freed only for suspend), and while the CPU is offline, the sysfs cpufreq files would still be present. User may accidentally try to update the sysfs files in following directory: '/sys/devices/system/cpu/cpuX/cpufreq/'. And that would result in undefined behavior as policy wouldn't be active then. Apart from updating the store() routine, we also update __cpufreq_get() which can call cpufreq_out_of_sync(). The later routine tries to update policy->cur and starts notifying kernel about it. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Saravana Kannan <skannan@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
\| *	intel_pstate: Force setting target pstate when required	Doug Smythies	2015-06-10	1	-10/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	During initialization and exit it is possible that the target pstate might not actually be set. Furthermore, the result can be that the driver and the processor are out of synch and, under some conditions, the driver might never send the processor the proper target pstate. This patch adds a bypass or do_checks flag to the call to intel_pstate_set_pstate. If bypass, then specifically bypass clamp checks and the do not send if it is the same as last time check. If do_checks, then, and as before, do the current policy clamp checks, and do not do actual send if the new target is the same as the old. Signed-off-by: Doug Smythies <dsmythies@telus.net> Reported-by: Marien Zwart <marien.zwart@gmail.com> Reported-by: Alex Lochmann <alexander.lochmann@tu-dortmund.de> Reported-by: Piotr Ko?aczkowski <pkolaczk@gmail.com> Reported-by: Clemens Eisserer <linuxhippy@gmail.com> Tested-by: Marien Zwart <marien.zwart@gmail.com> Tested-by: Doug Smythies <dsmythies@telus.net> [ rjw: Dropped pointless symbol definitions, rebased ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
\| *	intel_pstate: change some inconsistent debug information	Doug Smythies	2015-06-10	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit ce717613f3fb (intel_pstate: Turn per cpu printk into pr_debug) turned per cpu printk into pr_debug. However, only half of the change was done, introducing an inconsistency between entry and exit from driver pstate control. This patch changes the exit message to pr_debug also. The various messages are inconsistent with respect to any identifier text that can be used to help isolate the desired information from a huge log. This patch makes a consistent identifier portion of the string. Amends: ce717613f3fb (intel_pstate: Turn per cpu printk into pr_debug) Signed-off-by: Doug Smythies <dsmythies@telus.net> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
\| *	cpufreq: Track cpu managing sysfs kobjects separately	Saravana Kannan	2015-05-23	1	-4/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In order to prepare for the next few commits, that will stop migrating sysfs files on cpu hotplug, this patch starts managing sysfs-cpu separately. The behavior is still the same as we are still migrating sysfs files on hotplug, later commits would change that. Signed-off-by: Saravana Kannan <skannan@codeaurora.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
\| *	cpufreq: Fix for typos in two comments	Shailendra Verma	2015-05-22	1	-2/+2
\| \| \| \| \| \| \| \| \| \|	Signed-off-by: Shailendra Verma <shailendra.capricorn@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
\| *	cpufreq: Mark policy->governor = NULL for inactive policies	Viresh Kumar	2015-05-15	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Later commits would change the way policies are managed today. Policies wouldn't be freed on cpu hotplug (currently they aren't freed on suspend), and while the CPU is offline, the sysfs cpufreq files would still be present. Because we don't mark policy->governor as NULL, it still contains pointer of the last used governor. And if the governor is removed, while all the CPUs of a policy are hotplugged out, this pointer wouldn't be valid anymore. And if we try to read the 'scaling_governor', etc. from sysfs, it will result in kernel OOPs. To prevent this, mark policy->governor as NULL for all inactive policies while the governor is removed from kernel. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
\| *	cpufreq: Manage governor usage history with 'policy->last_governor'	Viresh Kumar	2015-05-15	1	-15/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	History of which governor was used last is common to all CPUs within a policy and maintaining it per-cpu isn't the best approach for sure. Apart from wasting memory, this also increases the complexity of managing this data structure as it has to be updated for all CPUs. To make that somewhat simpler, lets store this information in a new field 'last_governor' in struct cpufreq_policy and update it on removal of last cpu of a policy. As a side-effect it also solves an old problem, consider a system with two clusters 0 & 1. And there is one policy per cluster. Cluster 0: CPU0 and 1. Cluster 1: CPU2 and 3. - CPU2 is first brought online, and governor is set to performance (default as cpufreq_cpu_governor wasn't set). - Governor is changed to ondemand. - CPU2 is taken offline and cpufreq_cpu_governor is updated for CPU2. - CPU3 is brought online. - Because cpufreq_cpu_governor wasn't set for CPU3, the default governor performance is picked for CPU3. This patch fixes the bug as we now have a single variable to update for policy. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
\| *	cpufreq: Don't traverse all active policies to find policy for a cpu	Viresh Kumar	2015-05-15	1	-9/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We reach here while adding policy for a CPU and enter into the 'if' block only if a policy already exists for the CPU. As cpufreq_cpu_data is set for all policy->related_cpus now, when the policy is first added, we can use that to find the CPU's policy instead of traversing the list of all active policies. Acked-by: Saravana Kannan <skannan@codeaurora.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
\| *	cpufreq: Get rid of cpufreq_cpu_data_fallback	Viresh Kumar	2015-05-15	1	-19/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We can extract the same information from cpufreq_cpu_data as it is also available for inactive policies now. And so don't need cpufreq_cpu_data_fallback anymore. Also add a WARN_ON() for the case where we try to restore from an active policy. Acked-by: Saravana Kannan <skannan@codeaurora.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
\| *	cpufreq: Don't clear cpufreq_cpu_data and policy list for inactive policies	Viresh Kumar	2015-05-15	1	-45/+34
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Now that we can check policy->cpus to find if policy is active or not, we don't need to clean cpufreq_cpu_data and delete policy from the list on light weight tear down of policies (like in suspend). To make it consistent and clean, set cpufreq_cpu_data for all related CPUs when the policy is first created and clean it only while it is freed. Also update cpufreq_cpu_get_raw() to check if cpu is part of policy->cpus mask, so that we don't end up getting policies for offline CPUs. In order to make sure that no users of 'policy' are using an inactive policy, use cpufreq_cpu_get_raw() instead of directly accessing cpufreq_cpu_data. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
\| *	cpufreq: Create for_each_{in}active_policy()	Viresh Kumar	2015-05-15	1	-7/+59
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	policy->cpus is cleared unconditionally now on hotplug-out of a CPU and it can be checked to know if a policy is active or not. Create helper routines to iterate over all active/inactive policies, based on policy->cpus field. Replace all instances of for_each_policy() with for_each_active_policy() to make them iterate only for active policies. (We haven't made changes yet to keep inactive policies in the same list, but that will be followed in a later patch). Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
\| *	cpufreq: arm_big_little: remove compile-time dependency on BIG_LITTLE	Sudeep Holla	2015-05-15	2	-6/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With the addition of switcher code, there's compile-time dependency on BIG_LITTLE to get arm_big_little driver compiling on ARM64. Since ARM64 will never add support for bL switcher, it's better to remove the dependency so that the driver can be reused on ARM64 platforms. This patch adds stubs to remove BIG_LITTLE dependency in the driver. Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
\| *	intel_pstate: set BYT MSR with wrmsrl_on_cpu()	Joe Konno	2015-05-12	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 007bea098b86 (intel_pstate: Add setting voltage value for baytrail P states.) introduced byt_set_pstate() with the assumption that it would always be run by the CPU whose MSR is to be written by it. It turns out, however, that is not always the case in practice, so modify byt_set_pstate() to enforce the MSR write done by it to always happen on the right CPU. Fixes: 007bea098b86 (intel_pstate: Add setting voltage value for baytrail P states.) Signed-off-by: Joe Konno <joe.konno@intel.com> Acked-by: Kristen Carlson Accardi <kristen@linux.intel.com> Cc: 3.14+ <stable@vger.kernel.org> # 3.14+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
\| *	cpufreq: Clear policy->cpus even for the last CPU	Viresh Kumar	2015-05-07	1	-3/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We clear policy->cpus mask while CPUs are hotplugged out. We do it for all CPUs except the last CPU of the policy. I don't remember what the rationale behind that was, but I couldn't think of anything that will break if we remove this conditional clearing and always clear policy->cpus. The benefit we get out of it is, we can know if a policy is active or not by checking if this field is empty or not. That will be used by later commits. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Saravana Kannan <skannan@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
\| *	cpufreq: Keep a single path for adding managed CPUs	Viresh Kumar	2015-05-07	1	-7/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There are two cases when we may try to add CPUs we're already handling: - On boot, the first cpu has marked all policy->cpus managed and so we will find policy for all other policy->cpus later on. - When a managed cpu is hotplugged out and later brought back in. Currently, separate paths and checks take care of the two. While the first one is detected by testing cpu against 'policy->cpus', the other one is detected by testing cpu against 'policy->related_cpus'. We can handle them both via a single path and there is no need to do special checking for the first one. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Saravana Kannan <skannan@codeaurora.org> [ rjw: Changelog, comments ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
\| *	cpufreq: Throw warning when we try to get policy for an invalid CPU	Viresh Kumar	2015-05-07	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Simply returning here with an error is not enough. It shouldn't be allowed at all to try calling cpufreq_cpu_get() for an invalid CPU. Add a WARN here to make it clear that it wouldn't be acceptable at all. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Saravana Kannan <skannan@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
\| *	cpufreq: Merge __cpufreq_add_dev() and cpufreq_add_dev()	Viresh Kumar	2015-05-07	1	-17/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	cpufreq_add_dev() is an unnecessary wrapper over __cpufreq_add_dev(). Merge them. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Saravana Kannan <skannan@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
\| *	cpufreq: Add doc style comment about cpufreq_cpu_{get\|put}()	Viresh Kumar	2015-05-07	1	-0/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This clearly states what the code inside these routines is doing and how these must be used. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Saravana Kannan <skannan@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
\| *	intel_pstate: Add tsc collection and keep previous target pstate	Doug Smythies	2015-05-05	1	-10/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The intel_pstate driver is difficult to debug and investigate without tsc. Also, it is likely use of tsc, and some version of C0 percentage, will be re-introdcued in futute. There have also been occasions where it is desirebale to know, and confirm, the previous target pstate. This patch brings back tsc, adds previous target pstate, and adds both to the trace data collection. Signed-off-by: Doug Smythies <dsmythies@telus.net> Acked-by: Kristen Carlson Accardi <kristen@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
\| *	cpufreq: arm_big_little: remove unused cpu-cluster.<n> clock name	Sudeep Holla	2015-05-05	1	-3/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The "cpu-cluster.<n>" used to get the cluster clock is not used by any platform. Moreover __of_clk_get_by_name used in clk_get return error if the "clock-names" in the DT doesn't match this string. When using DT, it's not compulsory to specify the clock name unless there are multiple clock input entries in the consumer. This patch removes the unused clock string from the driver. Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
\| *	cpufreq: arm_big_little: check if the frequency is set correctly	Sudeep Holla	2015-05-05	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The actual frequency is set through "clk_change_rate" which is void function. If the underlying hardware fails and returns error, the error is lost in the clk layer. In order to track such failures, we need to read back the frequency(just the cached value as clk_recalc called after clk->ops->set_rate gets the frequency) This patch adds check to see if the frequency is set correctly or if they were any hardware failures and sends the appropriate errors to the cpufreq core. Reviewed-by: Michael Turquette <mike.turquette@linaro.org> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
\| *	cpufreq: pxa: make pxa_freqs arrays const	Fabian Frederick	2015-05-05	1	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	pxa255_run_freqs and pxa255_turbo_freqs are only read. This patch updates arrays declaration, find_freq_tables() and its callsites. Suggested-by: Joe Perches <joe@perches.com> Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
\| *	cpufreq: pxa: replace typedef pxa_freqs_t by structure	Fabian Frederick	2015-05-05	1	-10/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	typedef is not really useful here. Replace it by structure to improve readability. typedef should only be used in some cases. (See Documentation/CodingStyle Chapter 5 for details). Signed-off-by: Fabian Frederick <fabf@skynet.be> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* \|	Merge branch 'x86-core-for-linus' of ↵	Linus Torvalds	2015-06-23	1	-0/+1
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 core updates from Ingo Molnar: "There were so many changes in the x86/asm, x86/apic and x86/mm topics in this cycle that the topical separation of -tip broke down somewhat - so the result is a more traditional architecture pull request, collected into the 'x86/core' topic. The topics were still maintained separately as far as possible, so bisectability and conceptual separation should still be pretty good - but there were a handful of merge points to avoid excessive dependencies (and conflicts) that would have been poorly tested in the end. The next cycle will hopefully be much more quiet (or at least will have fewer dependencies). The main changes in this cycle were: * x86/apic changes, with related IRQ core changes: (Jiang Liu, Thomas Gleixner) - This is the second and most intrusive part of changes to the x86 interrupt handling - full conversion to hierarchical interrupt domains: [IOAPIC domain] ----- \| [MSI domain] --------[Remapping domain] ----- [ Vector domain ] \| (optional) \| [HPET MSI domain] ----- \| \| [DMAR domain] ----------------------------- \| [Legacy domain] ----------------------------- This now reflects the actual hardware and allowed us to distangle the domain specific code from the underlying parent domain, which can be optional in the case of interrupt remapping. It's a clear separation of functionality and removes quite some duct tape constructs which plugged the remap code between ioapic/msi/hpet and the vector management. - Intel IOMMU IRQ remapping enhancements, to allow direct interrupt injection into guests (Feng Wu) * x86/asm changes: - Tons of cleanups and small speedups, micro-optimizations. This is in preparation to move a good chunk of the low level entry code from assembly to C code (Denys Vlasenko, Andy Lutomirski, Brian Gerst) - Moved all system entry related code to a new home under arch/x86/entry/ (Ingo Molnar) - Removal of the fragile and ugly CFI dwarf debuginfo annotations. Conversion to C will reintroduce many of them - but meanwhile they are only getting in the way, and the upstream kernel does not rely on them (Ingo Molnar) - NOP handling refinements. (Borislav Petkov) * x86/mm changes: - Big PAT and MTRR rework: making the code more robust and preparing to phase out exposing direct MTRR interfaces to drivers - in favor of using PAT driven interfaces (Toshi Kani, Luis R Rodriguez, Borislav Petkov) - New ioremap_wt()/set_memory_wt() interfaces to support Write-Through cached memory mappings. This is especially important for good performance on NVDIMM hardware (Toshi Kani) * x86/ras changes: - Add support for deferred errors on AMD (Aravind Gopalakrishnan) This is an important RAS feature which adds hardware support for poisoned data. That means roughly that the hardware marks data which it has detected as corrupted but wasn't able to correct, as poisoned data and raises an APIC interrupt to signal that in the form of a deferred error. It is the OS's responsibility then to take proper recovery action and thus prolonge system lifetime as far as possible. - Add support for Intel "Local MCE"s: upcoming CPUs will support CPU-local MCE interrupts, as opposed to the traditional system- wide broadcasted MCE interrupts (Ashok Raj) - Misc cleanups (Borislav Petkov) * x86/platform changes: - Intel Atom SoC updates ... and lots of other cleanups, fixlets and other changes - see the shortlog and the Git log for details" * 'x86-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (222 commits) x86/hpet: Use proper hpet device number for MSI allocation x86/hpet: Check for irq==0 when allocating hpet MSI interrupts x86/mm/pat, drivers/infiniband/ipath: Use arch_phys_wc_add() and require PAT disabled x86/mm/pat, drivers/media/ivtv: Use arch_phys_wc_add() and require PAT disabled x86/platform/intel/baytrail: Add comments about why we disabled HPET on Baytrail genirq: Prevent crash in irq_move_irq() genirq: Enhance irq_data_to_desc() to support hierarchy irqdomain iommu, x86: Properly handle posted interrupts for IOMMU hotplug iommu, x86: Provide irq_remapping_cap() interface iommu, x86: Setup Posted-Interrupts capability for Intel iommu iommu, x86: Add cap_pi_support() to detect VT-d PI capability iommu, x86: Avoid migrating VT-d posted interrupts iommu, x86: Save the mode (posted or remapped) of an IRTE iommu, x86: Implement irq_set_vcpu_affinity for intel_ir_chip iommu: dmar: Provide helper to copy shared irte fields iommu: dmar: Extend struct irte for VT-d Posted-Interrupts iommu: Add new member capability to struct irq_remap_ops x86/asm/entry/64: Disentangle error_entry/exit gsbase/ebx/usermode code x86/asm/entry/32: Shorten __audit_syscall_entry() args preparation x86/asm/entry/32: Explain reloading of registers after __audit_syscall_entry() ...
\| * \|	x86/mm: Decouple <linux/vmalloc.h> from <asm/io.h>	Stephen Rothwell	2015-06-03	1	-0/+1
\| \|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Nothing in <asm/io.h> uses anything from <linux/vmalloc.h>, so remove it from there and fix up the resulting build problems triggered on x86 {64\|32}-bit {def\|allmod\|allno}configs. The breakages were triggering in places where x86 builds relied on vmalloc() facilities but did not include <linux/vmalloc.h> explicitly and relied on the implicit inclusion via <asm/io.h>. Also add: - <linux/init.h> to <linux/io.h> - <asm/pgtable_types> to <asm/io.h> ... which were two other implicit header file dependencies. Suggested-by: David Miller <davem@davemloft.net> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> [ Tidied up the changelog. ] Acked-by: David Miller <davem@davemloft.net> Acked-by: Takashi Iwai <tiwai@suse.de> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Vinod Koul <vinod.koul@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Anton Vorontsov <anton@enomsg.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Colin Cross <ccross@android.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: James E.J. Bottomley <JBottomley@odin.com> Cc: Jaroslav Kysela <perex@perex.cz> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Kees Cook <keescook@chromium.org> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Kristen Carlson Accardi <kristen@linux.intel.com> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Suma Ramars <sramars@cisco.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
* \|	speedstep-ich: Replace cpu_sibling_mask() with topology_sibling_cpumask()	Bartosz Golaszewski	2015-05-27	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The former duplicates the functionality of the latter but is neither documented nor arch-independent. Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Benoit Cousson <bcousson@baylibre.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Jean Delvare <jdelvare@suse.de> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Drokin <oleg.drokin@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1432645896-12588-8-git-send-email-bgolaszewski@baylibre.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* \|	acpi-cpufreq: Replace cpu__mask() with topology__cpumask()	Bartosz Golaszewski	2015-05-27	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The former duplicate the functionalities of the latter but are neither documented nor arch-independent. Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Benoit Cousson <bcousson@baylibre.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Jean Delvare <jdelvare@suse.de> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Drokin <oleg.drokin@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1432645896-12588-7-git-send-email-bgolaszewski@baylibre.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* \|	p4-clockmod: Replace cpu_sibling_mask() with topology_sibling_cpumask()	Bartosz Golaszewski	2015-05-27	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The former duplicates the functionality of the latter but is neither documented nor arch-independent. Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Benoit Cousson <bcousson@baylibre.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Jean Delvare <jdelvare@suse.de> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Drokin <oleg.drokin@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1432645896-12588-6-git-send-email-bgolaszewski@baylibre.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* \|	powernow-k8: Replace cpu_core_mask() with topology_core_cpumask()	Bartosz Golaszewski	2015-05-27	1	-10/+3
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The former duplicates the functionality of the latter but is neither documented nor arch-independent. Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Benoit Cousson <bcousson@baylibre.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Jean Delvare <jdelvare@suse.de> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Drokin <oleg.drokin@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1432645896-12588-5-git-send-email-bgolaszewski@baylibre.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
*	cpufreq: intel_pstate: Fix an annoying !CONFIG_SMP warning	Borislav Petkov	2015-04-15	1	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I keep seeing drivers/cpufreq/intel_pstate.c: In function ‘intel_pstate_init’: drivers/cpufreq/intel_pstate.c:1187:26: warning: initialization from incompatible pointer type struct cpuinfo_x86 c = &boot_cpu_data; when doing randconfig builds. This is caused by the fact that when !CONFIG_SMP, asm/processor.h defines cpu_info to boot_cpu_data and the local variable struct cpu_defaults cpu_info overshadows it leading to this unfortunate assignment in the preprocessed source: struct cpu_defaults boot_cpu_data; struct cpuinfo_x86 c = &boot_cpu_data; Rename the local variable and use static_cpu_has_safe() which alleviates the need for defining a local cpuinfo_x86 pointer. Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Kristen Carlson Accardi <kristen@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
*	intel_pstate: Change the setpoint for Atom params	Kristen Carlson Accardi	2015-04-15	1	-1/+1
\| \| \| \| \| \| \| \| \|	Change the setpoint for the Baytrail and Cherrytrail CPUs. This will cause more aggressive pstate selection and improves performance on a variety of workloads with little power penalty. Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
*	intel_pstate: Knights Landing support	Dasaratharaman Chandramouli	2015-04-11	1	-0/+31
\| \| \| \| \| \| \| \| \| \| \| \| \|	1. Add Knights Landing (KNL) CPUID to the list of CPUIDs supported by the intel_pstate driver. 2. Add a new cpu_default structure for KNL since KNL has a slightly different mechanism to get turbo pstates from MSRs. Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com> [ rjw: Subject and changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
*	intel_pstate: remove MSR test	Kristen Carlson Accardi	2015-04-11	1	-14/+0
\| \| \| \| \| \| \| \|	x86_match_cpu will not match our cpuid unless APERF/MPERF flag is set, so there is no need to do the manual check for this MSR. Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
*	cpufreq: fix qoriq uniprocessor build	Arnd Bergmann	2015-04-11	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The qoriq-cpufreq driver contains a hack for powerpc to include asm/smp.h on uniprocessor builds so it can get the hardware CPU number. On ARM, it does not require this hack, but instead gets a compile error: In file included from drivers/cpufreq/qoriq-cpufreq.c:24:0: arch/arm/include/asm/smp.h:18:3: error: #error "<asm/smp.h> included in non-SMP build" arch/arm/include/asm/smp.h:21:0: warning: "raw_smp_processor_id" redefined This adds an #ifdef to mirror the one in its get_cpu_physical_id() function. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: 2f249358eedaf ("cpufreq: qoriq: rename the driver") Cc: Tang Yuantian <Yuantian.Tang@freescale.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
*	Merge back earlier cpufreq material for v4.1.	Rafael J. Wysocki	2015-04-10	7	-73/+206
\|\
\| *	cpufreq: hisilicon: add acpu driver	Leo Yan	2015-04-02	3	-0/+52
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add acpu driver for hisilicon SoC, acpu is application processor subsystem. Currently the acpu has the coupled clock domain for two clusters, so this driver will directly use cpufreq-dt driver as backend. Signed-off-by: Leo Yan <leo.yan@linaro.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>