summaryrefslogtreecommitdiffstats
path: root/arch/x86/kernel (follow)
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'x86-timers-for-linus' of ↵Linus Torvalds2016-12-185-23/+331
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer updates from Thomas Gleixner: "This is the last functional update from the tip tree for 4.10. It got delayed due to a newly reported and anlyzed variant of BIOS bug and the resulting wreckage: - Seperation of TSC being marked realiable and the fact that the platform provides the TSC frequency via CPUID/MSRs and making use for it for GOLDMONT. - TSC adjust MSR validation and sanitizing: The TSC adjust MSR contains the offset to the hardware counter. The sum of the adjust MSR and the counter is the TSC value which is read via RDTSC. On at least two machines from different vendors the BIOS sets the TSC adjust MSR to negative values. This happens on cold and warm boot. While on cold boot the offset is a few milliseconds, on warm boot it basically compensates the power on time of the system. The BIOSes are not even using the adjust MSR to set all CPUs in the package to the same offset. The offsets are different which renders the TSC unusable, What's worse is that the TSC deadline timer has a HW feature^Wbug. It malfunctions when the TSC adjust value is negative or greater equal 0x80000000 resulting in silent boot failures, hard lockups or non firing timers. This looks like some hardware internal 32/64bit issue with a sign extension problem. Intel has been silent so far on the issue. The update contains sanity checks and keeps the adjust register within working limits and in sync on the package. As it looks like this disease is spreading via BIOS crapware, we need to address this urgently as the boot failures are hard to debug for users" * 'x86-timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/tsc: Limit the adjust value further x86/tsc: Annotate printouts as firmware bug x86/tsc: Force TSC_ADJUST register to value >= zero x86/tsc: Validate TSC_ADJUST after resume x86/tsc: Validate cpumask pointer before accessing it x86/tsc: Fix broken CONFIG_X86_TSC=n build x86/tsc: Try to adjust TSC if sync test fails x86/tsc: Prepare warp test for TSC adjustment x86/tsc: Move sync cleanup to a safe place x86/tsc: Sync test only for the first cpu in a package x86/tsc: Verify TSC_ADJUST from idle x86/tsc: Store and check TSC ADJUST MSR x86/tsc: Detect random warps x86/tsc: Use X86_FEATURE_TSC_ADJUST in detect_art() x86/tsc: Finalize the split of the TSC_RELIABLE flag x86/tsc: Set TSC_KNOWN_FREQ and TSC_RELIABLE flags on Intel Atom SoCs x86/tsc: Mark Intel ATOM_GOLDMONT TSC reliable x86/tsc: Mark TSC frequency determined by CPUID as known x86/tsc: Add X86_FEATURE_TSC_KNOWN_FREQ flag
| * x86/tsc: Limit the adjust value furtherThomas Gleixner2016-12-181-5/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Adjust value 0x80000000 and other values larger than that render the TSC deadline timer disfunctional. We have not yet any information about this from Intel, but experimentation clearly proves that this is a 32/64 bit and sign extension issue. If adjust values larger than that are actually required, which might be the case for physical CPU hotplug, then we need to disable the deadline timer on the affected package/CPUs and use the local APIC timer instead. That requires some surgery in the APIC setup code, so we just limit the ADJUST register value into the known to work range for now and revisit this when Intel comes forth with proper information. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Roland Scheidegger <rscheidegger_lists@hispeed.ch> Cc: Bruce Schlobohm <bruce.schlobohm@intel.com> Cc: Kevin Stanton <kevin.b.stanton@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de>
| * x86/tsc: Annotate printouts as firmware bugThomas Gleixner2016-12-181-2/+3
| | | | | | | | | | | | | | | | | | | | | | Make it more obvious that the BIOS is screwed up. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Roland Scheidegger <rscheidegger_lists@hispeed.ch> Cc: Bruce Schlobohm <bruce.schlobohm@intel.com> Cc: Kevin Stanton <kevin.b.stanton@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de>
| * x86/tsc: Force TSC_ADJUST register to value >= zeroThomas Gleixner2016-12-152-17/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Roland reported that his DELL T5810 sports a value add BIOS which completely wreckages the TSC. The squirmware [(TM) Ingo Molnar] boots with random negative TSC_ADJUST values, different on all CPUs. That renders the TSC useless because the sycnchronization check fails. Roland tested the new TSC_ADJUST mechanism. While it manages to readjust the TSCs he needs to disable the TSC deadline timer, otherwise the machine just stops booting. Deeper investigation unearthed that the TSC deadline timer is sensitive to the TSC_ADJUST value. Writing TSC_ADJUST to a negative value results in an interrupt storm caused by the TSC deadline timer. This does not make any sense and it's hard to imagine what kind of hardware wreckage is behind that misfeature, but it's reliably reproducible on other systems which have TSC_ADJUST and TSC deadline timer. While it would be understandable that a big enough negative value which moves the resulting TSC readout into the negative space could have the described effect, this happens even with a adjust value of -1, which keeps the TSC readout definitely in the positive space. The compare register for the TSC deadline timer is set to a positive value larger than the TSC, but despite not having reached the deadline the interrupt is raised immediately. If this happens on the boot CPU, then the machine dies silently because this setup happens before the NMI watchdog is armed. Further experiments showed that any other adjustment of TSC_ADJUST works as expected as long as it stays in the positive range. The direction of the adjustment has no influence either. See the lkml link for further analysis. Yet another proof for the theory that timers are designed by janitors and the underlying (obviously undocumented) mechanisms which allow BIOSes to wreckage them are considered a feature. Well done Intel - NOT! To address this wreckage add the following sanity measures: - If the TSC_ADJUST value on the boot cpu is not 0, set it to 0 - If the TSC_ADJUST value on any cpu is negative, set it to 0 - Prevent the cross package synchronization mechanism from setting negative TSC_ADJUST values. Reported-and-tested-by: Roland Scheidegger <rscheidegger_lists@hispeed.ch> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Bruce Schlobohm <bruce.schlobohm@intel.com> Cc: Kevin Stanton <kevin.b.stanton@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Allen Hung <allen_hung@dell.com> Cc: Borislav Petkov <bp@alien8.de> Link: http://lkml.kernel.org/r/20161213131211.397588033@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * x86/tsc: Validate TSC_ADJUST after resumeThomas Gleixner2016-12-153-4/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some 'feature' BIOSes fiddle with the TSC_ADJUST register during suspend/resume which renders the TSC unusable. Add sanity checks into the resume path and restore the original value if it was adjusted. Reported-and-tested-by: Roland Scheidegger <rscheidegger_lists@hispeed.ch> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Bruce Schlobohm <bruce.schlobohm@intel.com> Cc: Kevin Stanton <kevin.b.stanton@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Allen Hung <allen_hung@dell.com> Cc: Borislav Petkov <bp@alien8.de> Link: http://lkml.kernel.org/r/20161213131211.317654500@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * x86/tsc: Validate cpumask pointer before accessing itThomas Gleixner2016-12-011-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 0-day testing encountered a NULL pointer dereference in a cpumask access from tsc_store_and_check_tsc_adjust(). This happens when the function is called on the boot CPU and the topology masks are not yet available due to CPUMASK_OFFSTACK=y. Add a NULL pointer check for the mask pointer. If NULL it's safe to assume that the CPU is the boot CPU and the first one in the package. Fixes: 8b223bc7abe0 ("x86/tsc: Store and check TSC ADJUST MSR") Reported-by: kernel test robot <xiaolong.ye@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * x86/tsc: Fix broken CONFIG_X86_TSC=n buildThomas Gleixner2016-11-301-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | Add the missing return statement to the inline stub tsc_store_and_check_tsc_adjust() and add the other stubs to make a SMP=y,TSC=n build happy. While at it, remove the unused variable from the UP variant of tsc_store_and_check_tsc_adjust(). Fixes: commit ba75fb646931 ("x86/tsc: Sync test only for the first cpu in a package") Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * x86/tsc: Try to adjust TSC if sync test failsThomas Gleixner2016-11-291-5/+78
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the first CPU of a package comes online, it is necessary to test whether the TSC is in sync with a CPU on some other package. When a deviation is observed (time going backwards between the two CPUs) the TSC is marked unstable, which is a problem on large machines as they have to fall back to the HPET clocksource, which is insanely slow. It has been attempted to compensate the TSC by adding the offset to the TSC and writing it back some time ago, but this never was merged because it did not turn out to be stable, especially not on older systems. Modern systems have become more stable in that regard and the TSC_ADJUST MSR allows us to compensate for the time deviation in a sane way. If it's available allow up to three synchronization runs and if a time warp is detected the starting CPU can compensate the time warp via the TSC_ADJUST MSR and retry. If the third run still shows a deviation or when random time warps are detected the test terminally fails. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Link: http://lkml.kernel.org/r/20161119134018.048237517@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * x86/tsc: Prepare warp test for TSC adjustmentThomas Gleixner2016-11-291-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To allow TSC compensation cross nodes its necessary to know in which direction the TSC warp was observed. Return the maximum observed value on the calling CPU so the caller can determine the direction later. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Link: http://lkml.kernel.org/r/20161119134017.970859287@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * x86/tsc: Move sync cleanup to a safe placeThomas Gleixner2016-11-291-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Cleaning up the stop marker on the control CPU is wrong when we want to add retry support. Move the cleanup to the starting CPU. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Link: http://lkml.kernel.org/r/20161119134017.892095627@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * x86/tsc: Sync test only for the first cpu in a packageThomas Gleixner2016-11-291-9/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the TSC_ADJUST MSR is available all CPUs in a package are forced to the same value. So TSCs cannot be out of sync when the first CPU in the package was in sync. That allows to skip the sync test for all CPUs except the first starting CPU in a package. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Link: http://lkml.kernel.org/r/20161119134017.809901363@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * x86/tsc: Verify TSC_ADJUST from idleThomas Gleixner2016-11-292-2/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When entering idle, it's a good oportunity to verify that the TSC_ADJUST MSR has not been tampered with (BIOS hiding SMM cycles). If tampering is detected, emit a warning and restore it to the previous value. This is especially important for machines, which mark the TSC reliable because there is no watchdog clocksource available (SoCs). This is not sufficient for HPC (NOHZ_FULL) situations where a CPU never goes idle, but adding a timer to do the check periodically is not an option either. On a machine, which has this issue, the check triggeres right during boot, so there is a decent chance that the sysadmin will notice. Rate limit the check to once per second and warn only once per cpu. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Link: http://lkml.kernel.org/r/20161119134017.732180441@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * x86/tsc: Store and check TSC ADJUST MSRThomas Gleixner2016-11-293-1/+91
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The TSC_ADJUST MSR shows whether the TSC has been modified. This is helpful in a two aspects: 1) It allows to detect BIOS wreckage, where SMM code tries to 'hide' the cycles spent by storing the TSC value at SMM entry and restoring it at SMM exit. On affected machines the TSCs run slowly out of sync up to the point where the clocksource watchdog (if available) detects it. The TSC_ADJUST MSR allows to detect the TSC modification before that and eventually restore it. This is also important for SoCs which have no watchdog clocksource and therefore TSC wreckage cannot be detected and acted upon. 2) All threads in a package are required to have the same TSC_ADJUST value. Broken BIOSes break that and as a result the TSC synchronization check fails. The TSC_ADJUST MSR allows to detect the deviation when a CPU comes online. If detected set it to the value of an already online CPU in the same package. This also allows to reduce the number of sync tests because with that in place the test is only required for the first CPU in a package. In principle all CPUs in a system should have the same TSC_ADJUST value even across packages, but with physical CPU hotplug this assumption is not true because the TSC starts with power on, so physical hotplug has to do some trickery to bring the TSC into sync with already running packages, which requires to use an TSC_ADJUST value different from CPUs which got powered earlier. A final enhancement is the opportunity to compensate for unsynced TSCs accross nodes at boot time and make the TSC usable that way. It won't help for TSCs which run apart due to frequency skew between packages, but this gets detected by the clocksource watchdog later. The first step toward this is to store the TSC_ADJUST value of a starting CPU and compare it with the value of an already online CPU in the same package. If they differ, emit a warning and adjust it to the reference value. The !SMP version just stores the boot value for later verification. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Link: http://lkml.kernel.org/r/20161119134017.655323776@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * x86/tsc: Detect random warpsThomas Gleixner2016-11-291-1/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If time warps can be observed then they should only ever be observed on one CPU. If they are observed on both CPUs then the system is completely hosed. Add a check for this condition and notify if it happens. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Link: http://lkml.kernel.org/r/20161119134017.574838461@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * x86/tsc: Use X86_FEATURE_TSC_ADJUST in detect_art()Thomas Gleixner2016-11-291-6/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The art detection uses rdmsrl_safe() to detect the availablity of the TSC_ADJUST MSR. That's pointless because we have a feature bit for this. Use it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Link: http://lkml.kernel.org/r/20161119134017.483561692@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * x86/tsc: Finalize the split of the TSC_RELIABLE flagThomas Gleixner2016-11-181-6/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | All places which used the TSC_RELIABLE to skip the delayed calibration have been converted to use the TSC_KNOWN_FREQ flag. Make the immeditate clocksource registration, which skips the long term calibration, solely depend on TSC_KNOWN_FREQ. The TSC_RELIABLE now merily removes the requirement for a watchdog clocksource. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Bin Gao <bin.gao@intel.com> Cc: Peter Zijlstra <peterz@infradead.org>
| * x86/tsc: Set TSC_KNOWN_FREQ and TSC_RELIABLE flags on Intel Atom SoCsBin Gao2016-11-181-0/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | TSC on Intel Atom SoCs capable of determining TSC frequency by MSR is reliable and the frequency is known (provided by HW). On these platforms PIT/HPET is generally not available so calibration won't work at all and there is no other clocksource to act as a watchdog for the TSC, so we have no other choice than to trust it. Set both X86_FEATURE_TSC_KNOWN_FREQ and X86_FEATURE_TSC_RELIABLE flags to make sure the calibration is skipped and no watchdog is required. Signed-off-by: Bin Gao <bin.gao@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1479241644-234277-5-git-send-email-bin.gao@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * x86/tsc: Mark Intel ATOM_GOLDMONT TSC reliableBin Gao2016-11-181-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On Intel GOLDMONT Atom SoC TSC is the only available clocksource, so there is no way to do software calibration or have a watchdog clocksource for it. Software calibration is already disabled via the TSC_KNOWN_FREQ flag, but the watchdog requirement still persists, so such systems cannot switch to high resolution/nohz mode. Mark it reliable, so it becomes usable. Hardware teams confirmed that this is safe on that SoC. Signed-off-by: Bin Gao <bin.gao@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1479241644-234277-4-git-send-email-bin.gao@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * x86/tsc: Mark TSC frequency determined by CPUID as knownBin Gao2016-11-181-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | CPUs/SoCs with CPUID leaf 0x15 come with a known frequency and will report the frequency to software via CPUID instruction. This hardware provided frequency is the "real" frequency of TSC. Set the X86_FEATURE_TSC_KNOWN_FREQ flag for such systems to skip the software calibration process. A 24 hours test on one of the CPUID 0x15 capable platforms was conducted. PIT calibrated frequency resulted in more than 3 seconds drift whereas the CPUID determined frequency showed less than 0.5 second drift. Signed-off-by: Bin Gao <bin.gao@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1479241644-234277-3-git-send-email-bin.gao@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * x86/tsc: Add X86_FEATURE_TSC_KNOWN_FREQ flagBin Gao2016-11-181-3/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The X86_FEATURE_TSC_RELIABLE flag in Linux kernel implies both reliable (at runtime) and trustable (at calibration). But reliable running and trustable calibration independent of each other. Add a new flag X86_FEATURE_TSC_KNOWN_FREQ, which denotes that the frequency is known (via MSR/CPUID). This flag is only meant to skip the long term calibration on systems which have a known frequency. Add X86_FEATURE_TSC_KNOWN_FREQ to the skip the delayed calibration and leave X86_FEATURE_TSC_RELIABLE in place. After converting the existing users of X86_FEATURE_TSC_RELIABLE to use either both flags or just X86_FEATURE_TSC_KNOWN_FREQ we can seperate the functionality. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Bin Gao <bin.gao@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1479241644-234277-2-git-send-email-bin.gao@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* | Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds2016-12-185-70/+35
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes and cleanups from Thomas Gleixner: "This set of updates contains: - Robustification for the logical package managment. Cures the AMD and virtualization issues. - Put the correct start_cpu() return address on the stack of the idle task. - Fixups for the fallout of the nodeid <-> cpuid persistent mapping modifciations - Move the x86/MPX specific mm_struct member to the arch specific mm_context where it belongs - Cleanups for C89 struct initializers and useless function arguments" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/floppy: Use designated initializers x86/mpx: Move bd_addr to mm_context_t x86/mm: Drop unused argument 'removed' from sync_global_pgds() ACPI/NUMA: Do not map pxm to node when NUMA is turned off x86/acpi: Use proper macro for invalid node x86/smpboot: Prevent false positive out of bounds cpumask access warning x86/boot/64: Push correct start_cpu() return address x86/boot/64: Use 'push' instead of 'call' in start_cpu() x86/smpboot: Make logical package management more robust
| * | x86/acpi: Use proper macro for invalid nodeBoris Ostrovsky2016-12-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use NUMA_NO_NODE instead of -1. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: len.brown@intel.com Cc: rjw@rjwysocki.net Cc: linux-acpi@vger.kernel.org Cc: pavel@ucw.cz Link: http://lkml.kernel.org/r/1481570993-13941-1-git-send-email-boris.ostrovsky@oracle.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | x86/smpboot: Prevent false positive out of bounds cpumask access warningThomas Gleixner2016-12-151-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | prefill_possible_map() reinitializes the cpu_possible_map by setting the possible cpu bits and clearing all other bits up to NR_CPUS. This is technically always correct because cpu_possible_map is statically allocated and sized NR_CPUS. With CPUMASK_OFFSTACK and DEBUG_PER_CPU_MAPS enabled the bounds check of cpu masks happens on nr_cpu_ids. nr_cpu_ids is initialized to NR_CPUS and only limited after the set/clear bit loops have been executed. But if the system was booted with "nr_cpus=N" on the command line, where N is < NR_CPUS then nr_cpu_ids is limited in the parameter parsing function before prefill_possible_map() is invoked. As a consequence the cpumask bounds check triggers when clearing the bits past nr_cpu_ids. Add a helper which allows to reset cpu_possible_map w/o the bounds check and then set only the possible bits which are well inside bounds. Reported-by: Dmitry Safonov <dsafonov@virtuozzo.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: 0x7f454c46@gmail.com Cc: Jan Beulich <JBeulich@novell.com> Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1612131836050.3415@nanos Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | x86/boot/64: Push correct start_cpu() return addressJosh Poimboeuf2016-12-141-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | start_cpu() pushes a text address on the stack so that stack traces from idle tasks will show start_cpu() at the end. But it currently shows the wrong function offset. It's more correct to show the address immediately after the 'lretq' instruction. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/2cadd9f16c77da7ee7957bfc5e1c67928c23ca48.1481685203.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | x86/boot/64: Use 'push' instead of 'call' in start_cpu()Josh Poimboeuf2016-12-141-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | start_cpu() pushes a text address on the stack so that stack traces from idle tasks will show start_cpu() at the end. But it uses a call instruction to do that, which is rather obtuse. Use a straightforward push instead. Suggested-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/4d8a1952759721d42d1e62ba9e4a7e3ac5df8574.1481685203.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | x86/smpboot: Make logical package management more robustThomas Gleixner2016-12-133-63/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The logical package management has several issues: - The APIC ids provided by ACPI are not required to be the same as the initial APIC id which can be retrieved by CPUID. The APIC ids provided by ACPI are those which are written by the BIOS into the APIC. The initial id is set by hardware and can not be changed. The hardware provided ids contain the real hardware package information. Especially AMD sets the effective APIC id different from the hardware id as they need to reserve space for the IOAPIC ids starting at id 0. As a consequence those machines trigger the currently active firmware bug printouts in dmesg, These are obviously wrong. - Virtual machines have their own interesting of enumerating APICs and packages which are not reliably covered by the current implementation. The sizing of the mapping array has been tweaked to be generously large to handle systems which provide a wrong core count when HT is disabled so the whole magic which checks for space in the physical hotplug case is not needed anymore. Simplify the whole machinery and do the mapping when the CPU starts and the CPUID derived physical package information is available. This solves the observed problems on AMD machines and works for the virtualization issues as well. Remove the extra call from XEN cpu bringup code as it is not longer required. Fixes: d49597fd3bc7 ("x86/cpu: Deal with broken firmware (VMWare/XEN)") Reported-and-tested-by: Borislav Petkov <bp@suse.de> Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Juergen Gross <jgross@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: M. Vefa Bicakci <m.v.b@runbox.com> Cc: xen-devel <xen-devel@lists.xen.org> Cc: Charles (Chas) Williams <ciwillia@brocade.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Alok Kataria <akataria@vmware.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1612121102260.3429@nanos Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* | | Merge tag 'powerpc-4.10-1' of ↵Linus Torvalds2016-12-162-40/+45
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: "Highlights include: - Support for the kexec_file_load() syscall, which is a prereq for secure and trusted boot. - Prevent kernel execution of userspace on P9 Radix (similar to SMEP/PXN). - Sort the exception tables at build time, to save time at boot, and store them as relative offsets to save space in the kernel image & memory. - Allow building the kernel with thin archives, which should allow us to build an allyesconfig once some other fixes land. - Build fixes to allow us to correctly rebuild when changing the kernel endian from big to little or vice versa. - Plumbing so that we can avoid doing a full mm TLB flush on P9 Radix. - Initial stack protector support (-fstack-protector). - Support for dumping the radix (aka. Linux) and hash page tables via debugfs. - Fix an oops in cxl coredump generation when cxl_get_fd() is used. - Freescale updates from Scott: "Highlights include 8xx hugepage support, qbman fixes/cleanup, device tree updates, and some misc cleanup." - Many and varied fixes and minor enhancements as always. Thanks to: Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V, Anshuman Khandual, Anton Blanchard, Balbir Singh, Bartlomiej Zolnierkiewicz, Christophe Jaillet, Christophe Leroy, Denis Kirjanov, Elimar Riesebieter, Frederic Barrat, Gautham R. Shenoy, Geliang Tang, Geoff Levand, Jack Miller, Johan Hovold, Lars-Peter Clausen, Libin, Madhavan Srinivasan, Michael Neuling, Nathan Fontenot, Naveen N. Rao, Nicholas Piggin, Pan Xinhui, Peter Senna Tschudin, Rashmica Gupta, Rui Teng, Russell Currey, Scott Wood, Simon Guo, Suraj Jitindar Singh, Thiago Jung Bauermann, Tobias Klauser, Vaibhav Jain" [ And thanks to Michael, who took time off from a new baby to get this pull request done. - Linus ] * tag 'powerpc-4.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (174 commits) powerpc/fsl/dts: add FMan node for t1042d4rdb powerpc/fsl/dts: add sg_2500_aqr105_phy4 alias on t1024rdb powerpc/fsl/dts: add QMan and BMan nodes on t1024 powerpc/fsl/dts: add QMan and BMan nodes on t1023 soc/fsl/qman: test: use DEFINE_SPINLOCK() powerpc/fsl-lbc: use DEFINE_SPINLOCK() powerpc/8xx: Implement support of hugepages powerpc: get hugetlbpage handling more generic powerpc: port 64 bits pgtable_cache to 32 bits powerpc/boot: Request no dynamic linker for boot wrapper soc/fsl/bman: Use resource_size instead of computation soc/fsl/qe: use builtin_platform_driver powerpc/fsl_pmc: use builtin_platform_driver powerpc/83xx/suspend: use builtin_platform_driver powerpc/ftrace: Fix the comments for ftrace_modify_code powerpc/perf: macros for power9 format encoding powerpc/perf: power9 raw event format encoding powerpc/perf: update attribute_group data structure powerpc/perf: factor out the event format field powerpc/mm/iommu, vfio/spapr: Put pages on VFIO container shutdown ...
| * | | kexec_file: Change kexec_add_buffer to take kexec_buf as argument.Thiago Jung Bauermann2016-11-302-40/+45
| | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is done to simplify the kexec_add_buffer argument list. Adapt all callers to set up a kexec_buf to pass to kexec_add_buffer. In addition, change the type of kexec_buf.buffer from char * to void *. There is no particular reason for it to be a char *, and the change allows us to get rid of 3 existing casts to char * in the code. Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Acked-by: Dave Young <dyoung@redhat.com> Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | | Merge tag 'trace-v4.10' of ↵Linus Torvalds2016-12-151-1/+2
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: "This release has a few updates: - STM can hook into the function tracer - Function filtering now supports more advance glob matching - Ftrace selftests updates and added tests - Softirq tag in traces now show only softirqs - ARM nop added to non traced locations at compile time - New trace_marker_raw file that allows for binary input - Optimizations to the ring buffer - Removal of kmap in trace_marker - Wakeup and irqsoff tracers now adhere to the set_graph_notrace file - Other various fixes and clean ups" * tag 'trace-v4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (42 commits) selftests: ftrace: Shift down default message verbosity kprobes/trace: Fix kprobe selftest for newer gcc tracing/kprobes: Add a helper method to return number of probe hits tracing/rb: Init the CPU mask on allocation tracing: Use SOFTIRQ_OFFSET for softirq dectection for more accurate results tracing/fgraph: Have wakeup and irqsoff tracers ignore graph functions too fgraph: Handle a case where a tracer ignores set_graph_notrace tracing: Replace kmap with copy_from_user() in trace_marker writing ftrace/x86_32: Set ftrace_stub to weak to prevent gcc from using short jumps to it tracing: Allow benchmark to be enabled at early_initcall() tracing: Have system enable return error if one of the events fail tracing: Do not start benchmark on boot up tracing: Have the reg function allow to fail ring-buffer: Force rb_end_commit() and rb_set_commit_to_write() inline ring-buffer: Froce rb_update_write_stamp() to be inlined ring-buffer: Force inline of hotpath helper functions tracing: Make __buffer_unlock_commit() always_inline tracing: Make tracepoint_printk a static_key ring-buffer: Always inline rb_event_data() ring-buffer: Make rb_reserve_next_event() always inlined ...
| * | | tracing: Have the reg function allow to failSteven Rostedt (Red Hat)2016-12-091-1/+2
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some tracepoints have a registration function that gets enabled when the tracepoint is enabled. There may be cases that the registraction function must fail (for example, can't allocate enough memory). In this case, the tracepoint should also fail to register, otherwise the user would not know why the tracepoint is not working. Cc: David Howells <dhowells@redhat.com> Cc: Seiji Aguchi <seiji.aguchi@hds.com> Cc: Anton Blanchard <anton@samba.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* | | kexec: export the value of phys_base instead of symbol addressBaoquan He2016-12-151-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently in x86_64, the symbol address of phys_base is exported to vmcoreinfo. Dave Anderson complained this is really useless for his Crash implementation. Because in user-space utility Crash and Makedumpfile which exported vmcore information is mainly used for, value of phys_base is needed to covert virtual address of exported kernel symbol to physical address. Especially init_level4_pgt, if we want to access and go over the page table to look up a PA corresponding to VA, firstly we need calculate page_dir = SYMBOL(init_level4_pgt) - __START_KERNEL_map + phys_base; Now in Crash and Makedumpfile, we have to analyze the vmcore elf program header to get value of phys_base. As Dave said, it would be preferable if it were readily availabl in vmcoreinfo rather than depending upon the PT_LOAD semantics. Hence in this patch change to export the value of phys_base instead of its virtual address. And people also complained that KERNEL_IMAGE_SIZE exporting is x86_64 only, should be moved into arch dependent function arch_crash_save_vmcoreinfo. Do the moving in this patch. Link: http://lkml.kernel.org/r/1478568596-30060-2-git-send-email-bhe@redhat.com Signed-off-by: Baoquan He <bhe@redhat.com> Cc: Thomas Garnier <thgarnie@google.com> Cc: Baoquan He <bhe@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Xunlei Pang <xlpang@redhat.com> Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com> Cc: Kees Cook <keescook@chromium.org> Cc: Eugene Surovegin <surovegin@google.com> Cc: Dave Young <dyoung@redhat.com> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Atsushi Kumagai <ats-kumagai@wm.jp.nec.com> Cc: Dave Anderson <anderson@redhat.com> Cc: Pratyush Anand <panand@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | Revert "kdump, vmcoreinfo: report memory sections virtual addresses"Baoquan He2016-12-151-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit 0549a3c02efb ("kdump, vmcoreinfo: report memory sections virtual addresses"). Commit 0549a3c02efb tells the userspace utility makedumpfile the randomized base address of these memmory sections when mm kaslr is enabled. However the following patch "kexec: export the value of phys_base instead of symbol address" makes makedumpfile not need these addresses any more. Besides we should use VMCOREINFO_NUMBER to export the value of the variable so that we can use the existing number_table mechanism of Makedumpfile to fetch it. So revert it now. If needed we can add it later. http://lists.infradead.org/pipermail/kexec/2016-October/017540.html Link: http://lkml.kernel.org/r/1478568596-30060-1-git-send-email-bhe@redhat.com Signed-off-by: Baoquan He <bhe@redhat.com> Cc: Thomas Garnier <thgarnie@google.com> Cc: Baoquan He <bhe@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Xunlei Pang <xlpang@redhat.com> Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com> Cc: Kees Cook <keescook@chromium.org> Cc: Eugene Surovegin <surovegin@google.com> Cc: Dave Young <dyoung@redhat.com> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Atsushi Kumagai <ats-kumagai@wm.jp.nec.com> Cc: Dave Anderson <anderson@redhat.com> Cc: Pratyush Anand <panand@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | Merge branch 'for-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wqLinus Torvalds2016-12-131-1/+1
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull workqueue updates from Tejun Heo: "Mostly patches to initialize workqueue subsystem earlier and get rid of keventd_up(). The patches were headed for the last merge cycle but got delayed due to a bug found late minute, which is fixed now. Also, to help debugging, destroy_workqueue() is more chatty now on a sanity check failure." * 'for-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue: move wq_numa_init() to workqueue_init() workqueue: remove keventd_up() debugobj, workqueue: remove keventd_up() usage slab, workqueue: remove keventd_up() usage power, workqueue: remove keventd_up() usage tty, workqueue: remove keventd_up() usage mce, workqueue: remove keventd_up() usage workqueue: make workqueue available early during boot workqueue: dump workqueue state on sanity check failures in destroy_workqueue()
| * \ \ Merge branch 'for-4.9' into for-4.10Tejun Heo2016-10-191-1/+1
| |\ \ \
| | * | | mce, workqueue: remove keventd_up() usageTejun Heo2016-09-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that workqueue can handle work item queueing from very early during boot, there is no need to gate schedule_work() with keventd_up(). Remove it. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: linux-edac@vger.kernel.org
* | | | | Merge tag 'driver-core-4.10-rc1' of ↵Linus Torvalds2016-12-131-0/+2
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here's the new driver core patches for 4.10-rc1. Big thing here is the nice addition of "functional dependencies" to the driver core. The idea has been talked about for a very long time, great job to Rafael for stepping up and implementing it. It's been tested for longer than the 4.9-rc1 date, we held off on merging it earlier in order to feel more comfortable about it. Other than that, it's just a handful of small other patches, some good cleanups to the mess that is the firmware class code, and we have a test driver for the deferred probe logic. All of these have been in linux-next for a while with no reported issues" * tag 'driver-core-4.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (30 commits) firmware: Correct handling of fw_state_wait() return value driver core: Silence device links sphinx warning firmware: remove warning at documentation generation time drivers: base: dma-mapping: Fix typo in dmam_alloc_non_coherent comments driver core: test_async: fix up typo found by 0-day firmware: move fw_state_is_done() into UHM section firmware: do not use fw_lock for fw_state protection firmware: drop bit ops in favor of simple state machine firmware: refactor loading status firmware: fix usermode helper fallback loading driver core: firmware_class: convert to use class_groups driver core: devcoredump: convert to use class_groups driver core: class: add class_groups support kernfs: Declare two local data structures static driver-core: fix platform_no_drv_owner.cocci warnings drivers/base/memory.c: Remove unused 'first_page' variable driver core: add CLASS_ATTR_WO() drivers: base: cacheinfo: support DT overrides for cache properties drivers: base: cacheinfo: add pr_fmt logging drivers: base: cacheinfo: fix boot error message when acpi is enabled ...
| * | | | | drivers: base: cacheinfo: fix x86 with CONFIG_OF enabledSudeep Holla2016-11-101-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With CONFIG_OF enabled on x86, we get the following error on boot: " Failed to find cpu0 device node Unable to detect cache hierarchy from DT for CPU 0 " and the cacheinfo fails to get populated in the corresponding sysfs entries. This is because cache_setup_of_node looks for of_node for setting up the shared cpu_map without checking that it's already populated in the architecture specific callback. In order to indicate that the shared cpu_map is already populated, this patch introduces a boolean `cpu_map_populated` in struct cpu_cacheinfo that can be used by the generic code to skip cache_shared_cpu_map_setup. This patch also sets that boolean for x86. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | | | | | Merge tag 'acpi-4.10-rc1' of ↵Linus Torvalds2016-12-131-3/+0
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI updates from Rafael Wysocki: "The ACPICA code in the kernel gets updated as usual (included is upstream revision 20160930 and a few commits from the next one, with the rest waiting for an issue discovered in linux-next to be addressed) which brings in a couple of fixes and cleanups On top of that initial support for APEI on ARM64 is added, two new pieces of documentation are introduced, the properties-parsing code is updated to follow changes in the (external) documentation it is based on and there are a few updates of SoC drivers, some new blacklist entries, plus some assorted fixes and cleanups Specifics: - ACPICA update including upstream revision 20160930 and several commits beyond it (Bob Moore, Lv Zheng) - Initial support for ACPI APEI on ARM64 (Tomasz Nowicki) - New document describing the handling of _OSI and _REV in Linux (Len Brown) - New document describing the usage rules for _DSD properties (Rafael Wysocki) - Update of the ACPI properties-parsing code to reflect recent changes in the (external) documentation it is based on (Rafael Wysocki) - Updates of the ACPI LPSS and ACPI APD SoC drivers for additional hardware support (Andy Shevchenko, Nehal Shah) - New blacklist entries for _REV and video handling (Alex Hung, Hans de Goede, Michael Pobega) - ACPI battery driver fix to fall back to _BIF if _BIX fails (Dave Lambley) - NMI notifications handling fix for APEI (Prarit Bhargava) - Error code path fix for the ACPI CPPC library (Dan Carpenter) - Assorted cleanups (Andy Shevchenko, Longpeng Mike)" * tag 'acpi-4.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (31 commits) ACPICA: Utilities: Add new decode function for parser values ACPI / osl: Refactor acpi_os_get_root_pointer() to drop 'else':s ACPI / osl: Propagate actual error code for kstrtoul() ACPI / property: Document usage rules for _DSD properties ACPI: Document _OSI and _REV for Linux BIOS writers ACPI / APEI / ARM64: APEI initial support for ARM64 ACPI / APEI: Fix NMI notification handling ACPICA: Tables: Add an error message complaining driver bugs ACPICA: Tables: Add acpi_tb_unload_table() ACPICA: Tables: Cleanup acpi_tb_install_and_load_table() ACPICA: Events: Fix acpi_ev_initialize_region() return value ACPICA: Back port of "ACPICA: Dispatcher: Tune interpreter lock around AcpiEvInitializeRegion()" ACPICA: Namespace: Add acpi_ns_handle_to_name() ACPI / CPPC: set an error code on probe error path ACPI / video: Add force_native quirk for HP Pavilion dv6 ACPI / video: Add force_native quirk for Dell XPS 17 L702X ACPI / property: Hierarchical properties support update ACPI / LPSS: enable hard LLP for DMA ACPI / battery: If _BIX fails, retry with _BIF ACPI / video: Move ACPI_VIDEO_NOTIFY_* defines to acpi/video.h ..
| | \ \ \ \ \
| | \ \ \ \ \
| | \ \ \ \ \
| | \ \ \ \ \
| | \ \ \ \ \
| | \ \ \ \ \
| *-----. \ \ \ \ \ Merge branches 'acpi-soc', 'acpi-battery', 'acpi-video', 'acpi-cppc' and ↵Rafael J. Wysocki2016-12-121-3/+0
| |\ \ \ \ \ \ \ \ \ | | | | |_|_|_|_|/ / | | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 'acpi-apei' * acpi-soc: ACPI / LPSS: enable hard LLP for DMA ACPI / APD: Add clock frequency for future AMD I2C controller * acpi-battery: ACPI / battery: If _BIX fails, retry with _BIF * acpi-video: ACPI / video: Add force_native quirk for HP Pavilion dv6 ACPI / video: Add force_native quirk for Dell XPS 17 L702X ACPI / video: Move ACPI_VIDEO_NOTIFY_* defines to acpi/video.h * acpi-cppc: ACPI / CPPC: set an error code on probe error path * acpi-apei: ACPI / APEI / ARM64: APEI initial support for ARM64 ACPI / APEI: Fix NMI notification handling
| | | | | * | | | | ACPI / APEI / ARM64: APEI initial support for ARM64Tomasz Nowicki2016-12-021-3/+0
| | | | |/ / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch provides APEI arch-specific bits for ARM64 Meanwhile, (1) Move HEST type (ACPI_HEST_TYPE_IA32_CORRECTED_CHECK) checking to a generic place. (2) Select HAVE_ACPI_APEI when EFI and ACPI is set on ARM64, because arch_apei_get_mem_attribute is using efi_mem_attributes() on ARM64. Signed-off-by: Tomasz Nowicki <tomasz.nowicki@linaro.org> Tested-by: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org> Signed-off-by: Fu Wei <fu.wei@linaro.org> [ Fu Wei: improve && upstream ] Acked-by: Hanjun Guo <hanjun.guo@linaro.org> Tested-by: Tyler Baicar <tbaicar@codeaurora.org> Acked-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Borislav Petkov <bp@suse.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* | | | | | | | | Merge tag 'pm-4.10-rc1' of ↵Linus Torvalds2016-12-131-0/+9
|\ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "Again, cpufreq gets more changes than the other parts this time (one new driver, one old driver less, a bunch of enhancements of the existing code, new CPU IDs, fixes, cleanups) There also are some changes in cpuidle (idle injection rework, a couple of new CPU IDs, online/offline rework in intel_idle, fixes and cleanups), in the generic power domains framework (mostly related to supporting power domains containing CPUs), and in the Operating Performance Points (OPP) library (mostly related to supporting devices with multiple voltage regulators) In addition to that, the system sleep state selection interface is modified to make it easier for distributions with unchanged user space to support suspend-to-idle as the default system suspend method, some issues are fixed in the PM core, the latency tolerance PM QoS framework is improved a bit, the Intel RAPL power capping driver is cleaned up and there are some fixes and cleanups in the devfreq subsystem Specifics: - New cpufreq driver for Broadcom STB SoCs and a Device Tree binding for it (Markus Mayer) - Support for ARM Integrator/AP and Integrator/CP in the generic DT cpufreq driver and elimination of the old Integrator cpufreq driver (Linus Walleij) - Support for the zx296718, r8a7743 and r8a7745, Socionext UniPhier, and PXA SoCs in the the generic DT cpufreq driver (Baoyou Xie, Geert Uytterhoeven, Masahiro Yamada, Robert Jarzmik) - cpufreq core fix to eliminate races that may lead to using inactive policy objects and related cleanups (Rafael Wysocki) - cpufreq schedutil governor update to make it use SCHED_FIFO kernel threads (instead of regular workqueues) for doing delayed work (to reduce the response latency in some cases) and related cleanups (Viresh Kumar) - New cpufreq sysfs attribute for resetting statistics (Markus Mayer) - cpufreq governors fixes and cleanups (Chen Yu, Stratos Karafotis, Viresh Kumar) - Support for using generic cpufreq governors in the intel_pstate driver (Rafael Wysocki) - Support for per-logical-CPU P-state limits and the EPP/EPB (Energy Performance Preference/Energy Performance Bias) knobs in the intel_pstate driver (Srinivas Pandruvada) - New CPU ID for Knights Mill in intel_pstate (Piotr Luc) - intel_pstate driver modification to use the P-state selection algorithm based on CPU load on platforms with the system profile in the ACPI tables set to "mobile" (Srinivas Pandruvada) - intel_pstate driver cleanups (Arnd Bergmann, Rafael Wysocki, Srinivas Pandruvada) - cpufreq powernv driver updates including fast switching support (for the schedutil governor), fixes and cleanus (Akshay Adiga, Andrew Donnellan, Denis Kirjanov) - acpi-cpufreq driver rework to switch it over to the new CPU offline/online state machine (Sebastian Andrzej Siewior) - Assorted cleanups in cpufreq drivers (Wei Yongjun, Prashanth Prakash) - Idle injection rework (to make it use the regular idle path instead of a home-grown custom one) and related powerclamp thermal driver updates (Peter Zijlstra, Jacob Pan, Petr Mladek, Sebastian Andrzej Siewior) - New CPU IDs for Atom Z34xx and Knights Mill in intel_idle (Andy Shevchenko, Piotr Luc) - intel_idle driver cleanups and switch over to using the new CPU offline/online state machine (Anna-Maria Gleixner, Sebastian Andrzej Siewior) - cpuidle DT driver update to support suspend-to-idle properly (Sudeep Holla) - cpuidle core cleanups and misc updates (Daniel Lezcano, Pan Bian, Rafael Wysocki) - Preliminary support for power domains including CPUs in the generic power domains (genpd) framework and related DT bindings (Lina Iyer) - Assorted fixes and cleanups in the generic power domains (genpd) framework (Colin Ian King, Dan Carpenter, Geert Uytterhoeven) - Preliminary support for devices with multiple voltage regulators and related fixes and cleanups in the Operating Performance Points (OPP) library (Viresh Kumar, Masahiro Yamada, Stephen Boyd) - System sleep state selection interface rework to make it easier to support suspend-to-idle as the default system suspend method (Rafael Wysocki) - PM core fixes and cleanups, mostly related to the interactions between the system suspend and runtime PM frameworks (Ulf Hansson, Sahitya Tummala, Tony Lindgren) - Latency tolerance PM QoS framework imorovements (Andrew Lutomirski) - New Knights Mill CPU ID for the Intel RAPL power capping driver (Piotr Luc) - Intel RAPL power capping driver fixes, cleanups and switch over to using the new CPU offline/online state machine (Jacob Pan, Thomas Gleixner, Sebastian Andrzej Siewior) - Fixes and cleanups in the exynos-ppmu, exynos-nocp, rk3399_dmc, rockchip-dfi devfreq drivers and the devfreq core (Axel Lin, Chanwoo Choi, Javier Martinez Canillas, MyungJoo Ham, Viresh Kumar) - Fix for false-positive KASAN warnings during resume from ACPI S3 (suspend-to-RAM) on x86 (Josh Poimboeuf) - Memory map verification during resume from hibernation on x86 to ensure a consistent address space layout (Chen Yu) - Wakeup sources debugging enhancement (Xing Wei) - rockchip-io AVS driver cleanup (Shawn Lin)" * tag 'pm-4.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (127 commits) devfreq: rk3399_dmc: Don't use OPP structures outside of RCU locks devfreq: rk3399_dmc: Remove dangling rcu_read_unlock() devfreq: exynos: Don't use OPP structures outside of RCU locks Documentation: intel_pstate: Document HWP energy/performance hints cpufreq: intel_pstate: Support for energy performance hints with HWP cpufreq: intel_pstate: Add locking around HWP requests PM / sleep: Print active wakeup sources when blocking on wakeup_count reads PM / core: Fix bug in the error handling of async suspend PM / wakeirq: Fix dedicated wakeirq for drivers not using autosuspend PM / Domains: Fix compatible for domain idle state PM / OPP: Don't WARN on multiple calls to dev_pm_opp_set_regulators() PM / OPP: Allow platform specific custom set_opp() callbacks PM / OPP: Separate out _generic_set_opp() PM / OPP: Add infrastructure to manage multiple regulators PM / OPP: Pass struct dev_pm_opp_supply to _set_opp_voltage() PM / OPP: Manage supply's voltage/current in a separate structure PM / OPP: Don't use OPP structure outside of rcu protected section PM / OPP: Reword binding supporting multiple regulators per device PM / OPP: Fix incorrect cpu-supply property in binding cpuidle: Add a kerneldoc comment to cpuidle_use_deepest_state() ..
| | \ \ \ \ \ \ \ \
| | \ \ \ \ \ \ \ \
| *-. \ \ \ \ \ \ \ \ Merge branches 'pm-sleep' and 'powercap'Rafael J. Wysocki2016-12-121-0/+9
| |\ \ \ \ \ \ \ \ \ \ | | |_|/ / / / / / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * pm-sleep: PM / sleep: Print active wakeup sources when blocking on wakeup_count reads x86/suspend: fix false positive KASAN warning on suspend/resume PM / sleep / ACPI: Use the ACPI_FADT_LOW_POWER_S0 flag PM / sleep: System sleep state selection interface rework PM / hibernate: Verify the consistent of e820 memory map by md5 digest * powercap: powercap / RAPL: Add Knights Mill CPUID powercap/intel_rapl: fix and tidy up error handling powercap/intel_rapl: Track active CPUs internally powercap/intel_rapl: Cleanup duplicated init code powercap/intel rapl: Convert to hotplug state machine powercap/intel_rapl: Propagate error code when registration fails powercap/intel_rapl: Add missing domain data update on hotplug
| | * | | | | | | | | x86/suspend: fix false positive KASAN warning on suspend/resumeJosh Poimboeuf2016-12-061-0/+9
| | | |_|_|_|/ / / / | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Resuming from a suspend operation is showing a KASAN false positive warning: BUG: KASAN: stack-out-of-bounds in unwind_get_return_address+0x11d/0x130 at addr ffff8803867d7878 Read of size 8 by task pm-suspend/7774 page:ffffea000e19f5c0 count:0 mapcount:0 mapping: (null) index:0x0 flags: 0x2ffff0000000000() page dumped because: kasan: bad access detected CPU: 0 PID: 7774 Comm: pm-suspend Tainted: G B 4.9.0-rc7+ #8 Hardware name: Gigabyte Technology Co., Ltd. Z170X-UD5/Z170X-UD5-CF, BIOS F5 03/07/2016 Call Trace: dump_stack+0x63/0x82 kasan_report_error+0x4b4/0x4e0 ? acpi_hw_read_port+0xd0/0x1ea ? kfree_const+0x22/0x30 ? acpi_hw_validate_io_request+0x1a6/0x1a6 __asan_report_load8_noabort+0x61/0x70 ? unwind_get_return_address+0x11d/0x130 unwind_get_return_address+0x11d/0x130 ? unwind_next_frame+0x97/0xf0 __save_stack_trace+0x92/0x100 save_stack_trace+0x1b/0x20 save_stack+0x46/0xd0 ? save_stack_trace+0x1b/0x20 ? save_stack+0x46/0xd0 ? kasan_kmalloc+0xad/0xe0 ? kasan_slab_alloc+0x12/0x20 ? acpi_hw_read+0x2b6/0x3aa ? acpi_hw_validate_register+0x20b/0x20b ? acpi_hw_write_port+0x72/0xc7 ? acpi_hw_write+0x11f/0x15f ? acpi_hw_read_multiple+0x19f/0x19f ? memcpy+0x45/0x50 ? acpi_hw_write_port+0x72/0xc7 ? acpi_hw_write+0x11f/0x15f ? acpi_hw_read_multiple+0x19f/0x19f ? kasan_unpoison_shadow+0x36/0x50 kasan_kmalloc+0xad/0xe0 kasan_slab_alloc+0x12/0x20 kmem_cache_alloc_trace+0xbc/0x1e0 ? acpi_get_sleep_type_data+0x9a/0x578 acpi_get_sleep_type_data+0x9a/0x578 acpi_hw_legacy_wake_prep+0x88/0x22c ? acpi_hw_legacy_sleep+0x3c7/0x3c7 ? acpi_write_bit_register+0x28d/0x2d3 ? acpi_read_bit_register+0x19b/0x19b acpi_hw_sleep_dispatch+0xb5/0xba acpi_leave_sleep_state_prep+0x17/0x19 acpi_suspend_enter+0x154/0x1e0 ? trace_suspend_resume+0xe8/0xe8 suspend_devices_and_enter+0xb09/0xdb0 ? printk+0xa8/0xd8 ? arch_suspend_enable_irqs+0x20/0x20 ? try_to_freeze_tasks+0x295/0x600 pm_suspend+0x6c9/0x780 ? finish_wait+0x1f0/0x1f0 ? suspend_devices_and_enter+0xdb0/0xdb0 state_store+0xa2/0x120 ? kobj_attr_show+0x60/0x60 kobj_attr_store+0x36/0x70 sysfs_kf_write+0x131/0x200 kernfs_fop_write+0x295/0x3f0 __vfs_write+0xef/0x760 ? handle_mm_fault+0x1346/0x35e0 ? do_iter_readv_writev+0x660/0x660 ? __pmd_alloc+0x310/0x310 ? do_lock_file_wait+0x1e0/0x1e0 ? apparmor_file_permission+0x18/0x20 ? security_file_permission+0x73/0x1c0 ? rw_verify_area+0xbd/0x2b0 vfs_write+0x149/0x4a0 SyS_write+0xd9/0x1c0 ? SyS_read+0x1c0/0x1c0 entry_SYSCALL_64_fastpath+0x1e/0xad Memory state around the buggy address: ffff8803867d7700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff8803867d7780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >ffff8803867d7800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f4 ^ ffff8803867d7880: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 ffff8803867d7900: 00 00 00 f1 f1 f1 f1 04 f4 f4 f4 f3 f3 f3 f3 00 KASAN instrumentation poisons the stack when entering a function and unpoisons it when exiting the function. However, in the suspend path, some functions never return, so their stack never gets unpoisoned, resulting in stale KASAN shadow data which can cause later false positive warnings like the one above. Reported-by: Scott Bauer <scott.bauer@intel.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* | | | | | | | | | Merge branch 'akpm' (patches from Andrew)Linus Torvalds2016-12-132-1/+25
|\ \ \ \ \ \ \ \ \ \ | |_|_|_|_|_|_|_|_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge updates from Andrew Morton: - various misc bits - most of MM (quite a lot of MM material is awaiting the merge of linux-next dependencies) - kasan - printk updates - procfs updates - MAINTAINERS - /lib updates - checkpatch updates * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (123 commits) init: reduce rootwait polling interval time to 5ms binfmt_elf: use vmalloc() for allocation of vma_filesz checkpatch: don't emit unified-diff error for rename-only patches checkpatch: don't check c99 types like uint8_t under tools checkpatch: avoid multiple line dereferences checkpatch: don't check .pl files, improve absolute path commit log test scripts/checkpatch.pl: fix spelling checkpatch: don't try to get maintained status when --no-tree is given lib/ida: document locking requirements a bit better lib/rbtree.c: fix typo in comment of ____rb_erase_color lib/Kconfig.debug: make CONFIG_STRICT_DEVMEM depend on CONFIG_DEVMEM MAINTAINERS: add drm and drm/i915 irc channels MAINTAINERS: add "C:" for URI for chat where developers hang out MAINTAINERS: add drm and drm/i915 bug filing info MAINTAINERS: add "B:" for URI where to file bugs get_maintainer: look for arbitrary letter prefixes in sections printk: add Kconfig option to set default console loglevel printk/sound: handle more message headers printk/btrfs: handle more message headers printk/kdb: handle more message headers ...
| * | | | | | | | | x86/ldt: use vfree_atomic() to free ldt entriesAndrey Ryabinin2016-12-131-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | vfree() is going to use sleeping lock. free_ldt_struct() may be called with disabled preemption, therefore we must use vfree_atomic() here. E.g. call trace: vfree() free_ldt_struct() destroy_context_ldt() __mmdrop() finish_task_switch() schedule_tail() ret_from_fork() Link: http://lkml.kernel.org/r/1479474236-4139-7-git-send-email-hch@lst.de Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Joel Fernandes <joelaf@google.com> Cc: Jisheng Zhang <jszhang@marvell.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: John Dias <joaodias@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | | | | | | | mm: remove x86-only restriction of movable_nodeReza Arbab2016-12-131-0/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In commit c5320926e370 ("mem-hotplug: introduce movable_node boot option"), the memblock allocation direction is changed to bottom-up and then back to top-down like this: 1. memblock_set_bottom_up(true), called by cmdline_parse_movable_node(). 2. memblock_set_bottom_up(false), called by x86's numa_init(). Even though (1) occurs in generic mm code, it is wrapped by #ifdef CONFIG_MOVABLE_NODE, which depends on X86_64. This means that when we extend CONFIG_MOVABLE_NODE to non-x86 arches, things will be unbalanced. (1) will happen for them, but (2) will not. This toggle was added in the first place because x86 has a delay between adding memblocks and marking them as hotpluggable. Since other arches do this marking either immediately or not at all, they do not require the bottom-up toggle. So, resolve things by moving (1) from cmdline_parse_movable_node() to x86's setup_arch(), immediately after the movable_node parameter has been parsed. Link: http://lkml.kernel.org/r/1479160961-25840-3-git-send-email-arbab@linux.vnet.ibm.com Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com> Acked-by: Balbir Singh <bsingharora@gmail.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Alistair Popple <apopple@au1.ibm.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Bharata B Rao <bharata@linux.vnet.ibm.com> Cc: Frank Rowand <frowand.list@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Rob Herring <robh+dt@kernel.org> Cc: Stewart Smith <stewart@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | | | | | Merge branch 'timers-core-for-linus' of ↵Linus Torvalds2016-12-131-0/+9
|\ \ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer updates from Thomas Gleixner: "The time/timekeeping/timer folks deliver with this update: - Fix a reintroduced signed/unsigned issue and cleanup the whole signed/unsigned mess in the timekeeping core so this wont happen accidentaly again. - Add a new trace clock based on boot time - Prevent injection of random sleep times when PM tracing abuses the RTC for storage - Make posix timers configurable for real tiny systems - Add tracepoints for the alarm timer subsystem so timer based suspend wakeups can be instrumented - The usual pile of fixes and updates to core and drivers" * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits) timekeeping: Use mul_u64_u32_shr() instead of open coding it timekeeping: Get rid of pointless typecasts timekeeping: Make the conversion call chain consistently unsigned timekeeping_Force_unsigned_clocksource_to_nanoseconds_conversion alarmtimer: Add tracepoints for alarm timers trace: Update documentation for mono, mono_raw and boot clock trace: Add an option for boot clock as trace clock timekeeping: Add a fast and NMI safe boot clock timekeeping/clocksource_cyc2ns: Document intended range limitation timekeeping: Ignore the bogus sleep time if pm_trace is enabled selftests/timers: Fix spelling mistake "Asyncrhonous" -> "Asynchronous" clocksource/drivers/bcm2835_timer: Unmap region obtained by of_iomap clocksource/drivers/arm_arch_timer: Map frame with of_io_request_and_map() arm64: dts: rockchip: Arch counter doesn't tick in system suspend clocksource/drivers/arm_arch_timer: Don't assume clock runs in suspend posix-timers: Make them configurable posix_cpu_timers: Move the add_device_randomness() call to a proper place timer: Move sys_alarm from timer.c to itimer.c ptp_clock: Allow for it to be optional Kconfig: Regenerate *.c_shipped files after previous changes ...
| * | | | | | | | | | timekeeping: Ignore the bogus sleep time if pm_trace is enabledChen Yu2016-11-291-0/+9
| | |_|_|_|/ / / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Power management suspend/resume tracing (ab)uses the RTC to store suspend/resume information persistently. As a consequence the RTC value is clobbered when timekeeping is resumed and tries to inject the sleep time. Commit a4f8f6667f09 ("timekeeping: Cap array access in timekeeping_debug") plugged a out of bounds array access in the timekeeping debug code which was caused by the clobbered RTC value, but we still use the clobbered RTC value for sleep time injection into kernel timekeeping, which will result in random adjustments depending on the stored "hash" value. To prevent this keep track of the RTC clobbering and ignore the invalid RTC timestamp at resume. If the system resumed successfully clear the flag, which marks the RTC as unusable, warn the user about the RTC clobber and recommend to adjust the RTC with 'ntpdate' or 'rdate'. [jstultz: Fixed up pr_warn formating, and implemented suggestions from Ingo] [ tglx: Rewrote changelog ] Originally-from: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Chen Yu <yu.c.chen@intel.com> Signed-off-by: John Stultz <john.stultz@linaro.org> Acked-by: Pavel Machek <pavel@ucw.cz> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Prarit Bhargava <prarit@redhat.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Xunlei Pang <xlpang@redhat.com> Cc: Len Brown <lenb@kernel.org> Link: http://lkml.kernel.org/r/1480372524-15181-3-git-send-email-john.stultz@linaro.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* | | | | | | | | | Merge branch 'smp-hotplug-for-linus' of ↵Linus Torvalds2016-12-135-260/+146
|\ \ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull smp hotplug updates from Thomas Gleixner: "This is the final round of converting the notifier mess to the state machine. The removal of the notifiers and the related infrastructure will happen around rc1, as there are conversions outstanding in other trees. The whole exercise removed about 2000 lines of code in total and in course of the conversion several dozen bugs got fixed. The new mechanism allows to test almost every hotplug step standalone, so usage sites can exercise all transitions extensively. There is more room for improvement, like integrating all the pointlessly different architecture mechanisms of synchronizing, setting cpus online etc into the core code" * 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (60 commits) tracing/rb: Init the CPU mask on allocation soc/fsl/qbman: Convert to hotplug state machine soc/fsl/qbman: Convert to hotplug state machine zram: Convert to hotplug state machine KVM/PPC/Book3S HV: Convert to hotplug state machine arm64/cpuinfo: Convert to hotplug state machine arm64/cpuinfo: Make hotplug notifier symmetric mm/compaction: Convert to hotplug state machine iommu/vt-d: Convert to hotplug state machine mm/zswap: Convert pool to hotplug state machine mm/zswap: Convert dst-mem to hotplug state machine mm/zsmalloc: Convert to hotplug state machine mm/vmstat: Convert to hotplug state machine mm/vmstat: Avoid on each online CPU loops mm/vmstat: Drop get_online_cpus() from init_cpu_node_state/vmstat_cpu_dead() tracing/rb: Convert to hotplug state machine oprofile/nmi timer: Convert to hotplug state machine net/iucv: Use explicit clean up labels in iucv_init() x86/pci/amd-bus: Convert to hotplug state machine x86/oprofile/nmi: Convert to hotplug state machine ...
| * | | | | | | | | | x86/msr: Convert to hotplug state machineSebastian Andrzej Siewior2016-11-221-54/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Install the callbacks via the state machine and let the core invoke the callbacks on the already online CPUs. Move the callbacks to online/offline as there is no point in having the files around before the cpu is online and until its completely gone. [ tglx: Move the callbacks to online/offline ] Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: rt@linuxtronix.de Link: http://lkml.kernel.org/r/20161117183541.8588-4-bigeasy@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>