summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'x86-smap-for-linus' of ↵Linus Torvalds2013-09-046-11/+46
|\ | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 SMAP fixes from Ingo Molnar: "Fixes for Intel SMAP support, to fix SIGSEGVs during bootup" * 'x86-smap-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: Introduce [compat_]save_altstack_ex() to unbreak x86 SMAP x86, smap: Handle csum_partial_copy_*_user()
| * Introduce [compat_]save_altstack_ex() to unbreak x86 SMAPAl Viro2013-09-014-4/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For performance reasons, when SMAP is in use, SMAP is left open for an entire put_user_try { ... } put_user_catch(); block, however, calling __put_user() in the middle of that block will close SMAP as the STAC..CLAC constructs intentionally do not nest. Furthermore, using __put_user() rather than put_user_ex() here is bad for performance. Thus, introduce new [compat_]save_altstack_ex() helpers that replace __[compat_]save_altstack() for x86, being currently the only architecture which supports put_user_try { ... } put_user_catch(). Reported-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: <stable@vger.kernel.org> # v3.8+ Link: http://lkml.kernel.org/n/tip-es5p6y64if71k8p5u08agv9n@git.kernel.org
| * x86, smap: Handle csum_partial_copy_*_user()H. Peter Anvin2013-09-012-7/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | Add SMAP annotations to csum_partial_copy_to/from_user(). These functions legitimately access user space and thus need to set the AC flag. TODO: add explicit checks that the side with the kernel space pointer really points into kernel space. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Link: http://lkml.kernel.org/n/tip-2aps0u00eer658fd5xyanan7@git.kernel.org Cc: <stable@vger.kernel.org> # v3.7+
* | Merge branch 'x86-ras-for-linus' of ↵Linus Torvalds2013-09-0420-135/+523
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 RAS changes from Ingo Molnar: "[ The reason for drivers/ updates is that Boris asked for the drivers/edac/ changes to go via x86/ras in this cycle ] Main changes: - AMD CPUs: . Add ECC event decoding support for new F15h models . Various erratum fixes . Fix single-channel on dual-channel-controllers bug. - Intel CPUs: . UC uncorrectable memory error parsing fix . Add support for CMC (Corrected Machine Check) 'FF' (Firmware First) flag in the APEI HEST - Various cleanups and fixes" * 'x86-ras-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: amd64_edac: Fix incorrect wraparounds amd64_edac: Correct erratum 505 range cpc925_edac: Use proper array termination x86/mce, acpi/apei: Only disable banks listed in HEST if mce is configured amd64_edac: Get rid of boot_cpu_data accesses amd64_edac: Add ECC decoding support for newer F15h models x86, amd_nb: Clarify F15h, model 30h GART and L3 support pci_ids: Add PCI device ID functions 3 and 4 for newer F15h models. x38_edac: Make a local function static i3200_edac: Make a local function static x86/mce: Pay no attention to 'F' bit in MCACOD when parsing 'UC' errors APEI/ERST: Fix error message formatting amd64_edac: Fix single-channel setups EDAC: Replace strict_strtol() with kstrtol() mce: acpi/apei: Soft-offline a page on firmware GHES notification mce: acpi/apei: Add a boot option to disable ff mode for corrected errors mce: acpi/apei: Honour Firmware First for MCA banks listed in APEI HEST CMC
| * \ Merge tag 'edac_fixes_for_3.12' of ↵Ingo Molnar2013-08-281-9/+10
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp into x86/ras Pull RAS fixes from Boris Petkov: "Two fixlets for Erratum 505 ranges and overflowing variables." Signed-off-by: Ingo Molnar <mingo@kernel.org>
| | * | amd64_edac: Fix incorrect wraparoundsAravind Gopalakrishnan2013-08-271-5/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | dct_base and dct_limit obtain 32 bit register values when they read their respective pci config space registers. A left shift beyond 32 bits will cause them to wrap around. Similar case for chan_addr as can be seen from the bug report (link below). In the patch, we rectify this by casting chan_addr to u64 and by comparing dct_base and dct_limit against properly shifted sys_addr in order to compare the correct bits. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Link: http://lkml.kernel.org/r/20130819132302.GA12171@elgon.mountain Signed-off-by: Borislav Petkov <bp@suse.de>
| | * | amd64_edac: Correct erratum 505 rangeBorislav Petkov2013-08-271-4/+4
| |/ / | | | | | | | | | | | | | | | | | | | | | Basically we want to cover all 0x0-0xf models, i.e. Orochi and later. Cc: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Link: http://lkml.kernel.org/r/20130819192321.GF4165@pd.tnic Signed-off-by: Borislav Petkov <bp@suse.de>
| * | Merge tag 'edac_for_3.12' of ↵Ingo Molnar2013-08-155-8/+15
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp into x86/ras Pull RAS/EDAC updates from Boris Petkov: "An amd64_edac fix for single channel configurations + trivial cleanups courtesy of Jingoo Han." Signed-off-by: Ingo Molnar <mingo@kernel.org>
| | * | cpc925_edac: Use proper array terminationJingoo Han2013-08-141-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The struct should be terminated by using empty braces in order to fix the following sparse warning. drivers/edac/cpc925_edac.c:792:10: warning: Using plain integer as NULL pointer Signed-off-by: Jingoo Han <jg1.han@samsung.com> [ drop obvious comment ] Signed-off-by: Borislav Petkov <bp@suse.de>
| | * | x38_edac: Make a local function staticJingoo Han2013-08-091-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make a local function static in order to fix the following sparse warning: drivers/edac/x38_edac.c:252:14: warning: symbol 'x38_map_mchbar' was not declared. Should it be static? Signed-off-by: Jingoo Han <jg1.han@samsung.com> [ Boris: Correct commit message ] Signed-off-by: Borislav Petkov <bp@suse.de>
| | * | i3200_edac: Make a local function staticJingoo Han2013-08-091-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This local symbol is used only in this file. Fix the following sparse warnings: drivers/edac/i3200_edac.c:264:14: warning: symbol 'i3200_map_mchbar' was not declared. Should it be static? Signed-off-by: Jingoo Han <jg1.han@samsung.com> Signed-off-by: Borislav Petkov <bp@suse.de>
| | * | amd64_edac: Fix single-channel setupsBorislav Petkov2013-07-291-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It can happen that configurations are running in a single-channel mode even with a dual-channel memory controller, by, say, putting the DIMMs only on the one channel and leaving the other empty. This causes a problem in init_csrows which implicitly assumes that when the second channel is enabled, i.e. channel 1, the struct dimm hierarchy will be present. Which is not. So always allocate two channels unconditionally. This provides for the nice side effect that the data structures are initialized so some day, when memory hotplug is supported, it should just work out of the box when all of a sudden a second channel appears. Reported-and-tested-by: Roger Leigh <rleigh@debian.org> Signed-off-by: Borislav Petkov <bp@suse.de>
| | * | EDAC: Replace strict_strtol() with kstrtol()Jingoo Han2013-07-241-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The usage of strict_strtol() is not preferred, because strict_strtol() is obsolete. Thus, kstrtol() should be used. Signed-off-by: Jingoo Han <jg1.han@samsung.com> Signed-off-by: Borislav Petkov <bp@suse.de>
| * | | Merge tag 'amd_f15_m30' of ↵Ingo Molnar2013-08-14192-1220/+2172
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp into x86/ras Pull AMD F15h, model 0x30 and later enablement stuff, more specifically EDAC support, from Borislav Petkov. Signed-off-by: Ingo Molnar <mingo@kernel.org>
| | * | | amd64_edac: Get rid of boot_cpu_data accessesBorislav Petkov2013-08-122-48/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that we cache (family, model, stepping) locally, use them instead of boot_cpu_data. No functionality change. Signed-off-by: Borislav Petkov <bp@suse.de>
| | * | | amd64_edac: Add ECC decoding support for newer F15h modelsAravind Gopalakrishnan2013-08-122-34/+270
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On newer models, support has been included for upto 4 DCT's, however, only DCT0 and DCT3 are currently configured (cf BKDG Section 2.10). Also, the routing DRAM Requests algorithm is different for F15h M30h. Thus it is cleaner to use a brand new function rather than adding quirks to the more generic f1x_match_to_this_node(). Refer to "2.10.5 DRAM Routing Requests" in the BKDG for further info. Tested on Fam15h M30h with ECC turned on using mce_amd_inj facility and verified to be functionally correct. While at it, verify if erratum workarounds for E505 and E637 still hold. From email conversations within AMD, the current status of the errata is: * Erratum 505: fixed in model 0x1, stepping 0x1 and later. * Erratum 637: not fixed. Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> [ Cleanups, corrections ] Signed-off-by: Borislav Petkov <bp@suse.de>
| | * | | x86, amd_nb: Clarify F15h, model 30h GART and L3 supportAravind Gopalakrishnan2013-08-121-2/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | F15h, models 0x30 and later don't have a GART. Note that. Also check CPUID leaf 0x80000006 for L3 prescence because there are models which don't sport an L3 cache. Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> [ Boris: rewrite commit message, cleanup comments. ] Signed-off-by: Borislav Petkov <bp@suse.de>
| | * | | pci_ids: Add PCI device ID functions 3 and 4 for newer F15h models.Aravind Gopalakrishnan2013-08-121-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add PCI device IDs for AMD F15h, model 30h. They will be used in amd_nb.c and amd64_edac.c Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de>
| * | | | Merge tag 'please-pull-mce-f-bit' of ↵Ingo Molnar2013-08-12386-2191/+3851
| |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras into x86/ras Pull MCE-uncorrected-error fix from Tony Luck: "Bit 12 may or may not be set in MCi_STATUS.MCACOD when an uncorrected error is reported. Ignore it when checking error signatures." Signed-off-by: Ingo Molnar <mingo@kernel.org>
| | * | | | x86/mce: Pay no attention to 'F' bit in MCACOD when parsing 'UC' errorsTony Luck2013-08-051-2/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The 0x1000 bit of the MCACOD field of machine check MCi_STATUS registers is only defined for corrected errors (where it means that hardware may be filtering errors see SDM section 15.9.2.1). For uncorrected errors it may, or may not be set - so we should mask it out when checking for the architecturaly defined recoverable error signatures (see SDM 15.9.3.1 and 15.9.3.2) Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
| * | | | | x86/mce, acpi/apei: Only disable banks listed in HEST if mce is configuredNaveen N. Rao2013-08-121-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Randconfig testing found this build error: >> hest.c(.init.text+0x6004): undefined reference to 'mce_disable_bank' Fix by wrapping body of hest_parse_cmc() inside #ifdef CONFIG_X86_MCE Reported-by: "Wu, Fengguang" <fengguang.wu@intel.com> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Acked-by: Borislav Petkov <bp@alien8.de> Link: http://lkml.kernel.org/r/0129220@agluck-desk.sc.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | Merge branch 'x86/mce' into x86/rasIngo Molnar2013-08-1211-20/+150
| |\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pursue a single RAS/MCE topic branch on x86. Signed-off-by: Ingo Molnar <mingo@kernel.org>
| | * | | | | mce: acpi/apei: Soft-offline a page on firmware GHES notificationNaveen N. Rao2013-07-103-10/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the firmware indicates in GHES error data entry that the error threshold has exceeded for a corrected error event, then we try to soft-offline the page. This could be called in interrupt context, so we queue this up similar to how we handle memory failure scenarios. Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: Borislav Petkov <bp@suse.de> Signed-off-by: Tony Luck <tony.luck@intel.com>
| | * | | | | mce: acpi/apei: Add a boot option to disable ff mode for corrected errorsNaveen N. Rao2013-07-084-1/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a boot option to disable firmware first mode for corrected errors. Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: Borislav Petkov <bp@suse.de> Signed-off-by: Tony Luck <tony.luck@intel.com>
| | * | | | | mce: acpi/apei: Honour Firmware First for MCA banks listed in APEI HEST CMCNaveen N. Rao2013-07-085-10/+103
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The Corrected Machine Check structure (CMC) in HEST has a flag which can be set by the firmware to indicate to the OS that it prefers to process the corrected error events first. In this scenario, the OS is expected to not monitor for corrected errors (through CMCI/polling). Instead, the firmware notifies the OS on corrected error events through GHES. Linux already has support for GHES. This patch adds support for parsing CMC structure and to disable CMCI/polling if the firmware first flag is set. Further, the list of machine check bank structures at the end of CMC is used to determine which MCA banks function in FF mode, so that we continue to monitor error events on the other banks. Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: Borislav Petkov <bp@suse.de> Signed-off-by: Tony Luck <tony.luck@intel.com>
| * | | | | | Merge tag 'ras_for_3.12' into x86/rasH. Peter Anvin2013-08-021-28/+23
| |\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ERST error message formatting cleanup. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| | * | | | | | APEI/ERST: Fix error message formattingBorislav Petkov2013-07-311-28/+23
| |/ / / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ... according to acpi/apei/ conventions. Use standard pr_fmt prefix while at it. Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Bjorn Helgaas <bhelgaas@google.com>
* | | | | | | Merge branch 'x86-platform-for-linus' of ↵Linus Torvalds2013-09-041-10/+3
|\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 platform documentation fix from Ingo Molnar. * 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/acpi: Correct out-of-date comment of __acpi_map_table()
| * | | | | | | x86/acpi: Correct out-of-date comment of __acpi_map_table()Zhang Yanfei2013-07-261-10/+3
| | |_|_|_|/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The implementation of function __acpi_map_table() has been changed long time ago, and now it directly invokes early_ioremap() to setup the temporarily acpi table mappings. So correct its out-of-date comment. Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: len.brown@intel.com Cc: pavel@ucw.cz Cc: rjw@sisk.pl Link: http://lkml.kernel.org/r/51EE7F1C.9020506@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | | | | | | Merge branch 'x86-paravirt-for-linus' of ↵Linus Torvalds2013-09-049-58/+50
|\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 paravirt changes from Ingo Molnar: "Hypervisor signature detection cleanup and fixes - the goal is to make KVM guests run better on MS/Hyperv and to generalize and factor out the code a bit" * 'x86-paravirt-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: Correctly detect hypervisor x86, kvm: Switch to use hypervisor_cpuid_base() xen: Switch to use hypervisor_cpuid_base() x86: Introduce hypervisor_cpuid_base()
| * | | | | | | x86: Correctly detect hypervisorJason Wang2013-08-056-28/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We try to handle the hypervisor compatibility mode by detecting hypervisor through a specific order. This is not robust, since hypervisors may implement each others features. This patch tries to handle this situation by always choosing the last one in the CPUID leaves. This is done by letting .detect() return a priority instead of true/false and just re-using the CPUID leaf where the signature were found as the priority (or 1 if it was found by DMI). Then we can just pick hypervisor who has the highest priority. Other sophisticated detection method could also be implemented on top. Suggested by H. Peter Anvin and Paolo Bonzini. Acked-by: K. Y. Srinivasan <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Doug Covelli <dcovelli@vmware.com> Cc: Borislav Petkov <bp@suse.de> Cc: Dan Hecht <dhecht@vmware.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Gleb Natapov <gleb@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Link: http://lkml.kernel.org/r/1374742475-2485-4-git-send-email-jasowang@redhat.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| * | | | | | | x86, kvm: Switch to use hypervisor_cpuid_base()Jason Wang2013-08-051-15/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Switch to use hypervisor_cpuid_base() to detect KVM. Cc: Gleb Natapov <gleb@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Link: http://lkml.kernel.org/r/1374742475-2485-3-git-send-email-jasowang@redhat.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| * | | | | | | xen: Switch to use hypervisor_cpuid_base()Jason Wang2013-08-051-15/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Switch to use hypervisor_cpuid_base() to detect Xen. Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Link: http://lkml.kernel.org/r/1374742475-2485-2-git-send-email-jasowang@redhat.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| * | | | | | | x86: Introduce hypervisor_cpuid_base()Jason Wang2013-08-051-0/+15
| | |_|_|/ / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduce hypervisor_cpuid_base() which loop test the hypervisor existence function until the signature match and check the number of leaves if required. This could be used by Xen/KVM guest to detect the existence of hypervisor. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Gleb Natapov <gleb@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Link: http://lkml.kernel.org/r/1374742475-2485-1-git-send-email-jasowang@redhat.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
* | | | | | | Merge branch 'x86-mm-for-linus' of ↵Linus Torvalds2013-09-046-26/+31
|\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 mm changes from Ingo Molnar: "Misc smaller fixes: - a parse_setup_data() boot crash fix - a memblock and an __early_ioremap cleanup - turn the always-on CONFIG_ARCH_MEMORY_PROBE=y into a configurable option and turn it off - it's an unrobust debug facility, it shouldn't be enabled by default" * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: avoid remapping data in parse_setup_data() x86: Use memblock_set_current_limit() to set limit for memblock. mm: Remove unused variable idx0 in __early_ioremap() mm/hotplug, x86: Disable ARCH_MEMORY_PROBE by default
| * | | | | | | x86: avoid remapping data in parse_setup_data()Linn Crosetto2013-08-143-13/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Type SETUP_PCI, added by setup_efi_pci(), may advertise a ROM size larger than early_memremap() is able to handle, which is currently limited to 256kB. If this occurs it leads to a NULL dereference in parse_setup_data(). To avoid this, remap the setup_data header and allow parsing functions for individual types to handle their own data remapping. Signed-off-by: Linn Crosetto <linn@hp.com> Link: http://lkml.kernel.org/r/1376430401-67445-1-git-send-email-linn@hp.com Acked-by: Yinghai Lu <yinghai@kernel.org> Reviewed-by: Pekka Enberg <penberg@kernel.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| * | | | | | | x86: Use memblock_set_current_limit() to set limit for memblock.Tang Chen2013-08-141-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In setup_arch() of x86, it set memblock.current_limit directly. We should use memblock_set_current_limit(). If the implementation is changed, it is easy to maintain. Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com> Link: http://lkml.kernel.org/r/1376451844-15682-1-git-send-email-tangchen@cn.fujitsu.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| * | | | | | | mm: Remove unused variable idx0 in __early_ioremap()Jianguo Wu2013-08-131-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After commit: 8827247ffcc ("x86: don't define __this_fixmap_does_not_exist()") variable idx0 is no longer needed, so just remove it. Signed-off-by: Jianguo Wu <wujianguo@huawei.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: <linux-mm@kvack.org> Cc: <wangchen@cn.fujitsu.com> Cc: Hanjun Guo <guohanjun@huawei.com> Link: http://lkml.kernel.org/r/5209A173.3090600@huawei.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | | | mm/hotplug, x86: Disable ARCH_MEMORY_PROBE by defaultToshi Kani2013-07-222-8/+14
| | |/ / / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | CONFIG_ARCH_MEMORY_PROBE enables the /sys/devices/system/memory/probe interface, which allows a given memory address to be hot-added as follows: # echo start_address_of_new_memory > /sys/devices/system/memory/probe (See Documentation/memory-hotplug.txt for more details.) This probe interface is required on powerpc. On x86, however, ACPI notifies a memory hotplug event to the kernel, which performs its hotplug operation as the result. Therefore, regular users do not need this interface on x86. This probe interface is also error-prone and misleading that the kernel blindly adds a given memory address without checking if the memory is present on the system; no probing is done despite of its name. The kernel crashes when a user requests to online a memory block that is not present on the system. This interface is currently used for testing as it can fake a hotplug event. This patch disables CONFIG_ARCH_MEMORY_PROBE by default on x86, adds its Kconfig menu entry on x86, and clarifies its use in Documentation/ memory-hotplug.txt. Signed-off-by: Toshi Kani <toshi.kani@hp.com> Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: linux-mm@kvack.org Cc: dave@sr71.net Cc: isimatu.yasuaki@jp.fujitsu.com Cc: tangchen@cn.fujitsu.com Cc: vasilis.liaskovitis@profitbricks.com Link: http://lkml.kernel.org/r/1374256068-26016-1-git-send-email-toshi.kani@hp.com [ Edited it slightly. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | | | | | | Merge branch 'x86-kaslr-for-linus' of ↵Linus Torvalds2013-09-048-40/+97
|\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 relocation changes from Ingo Molnar: "This tree contains a single change, ELF relocation handling in C - one of the kernel randomization patches that makes sense even without randomization present upstream" * 'x86-kaslr-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, relocs: Move ELF relocation handling to C
| * | | | | | | x86, relocs: Move ELF relocation handling to CKees Cook2013-08-088-40/+97
| | |/ / / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Moves the relocation handling into C, after decompression. This requires that the decompressed size is passed to the decompression routine as well so that relocations can be found. Only kernels that need relocation support will use the code (currently just x86_32), but this is laying the ground work for 64-bit using it in support of KASLR. Based on work by Neill Clift and Michael Davidson. Signed-off-by: Kees Cook <keescook@chromium.org> Link: http://lkml.kernel.org/r/20130708161517.GA4832@www.outflux.net Acked-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
* | | | | | | Merge branch 'timers-nohz-for-linus' of ↵Linus Torvalds2013-09-0421-327/+545
|\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timers/nohz changes from Ingo Molnar: "It mostly contains fixes and full dynticks off-case optimizations, by Frederic Weisbecker" * 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits) nohz: Include local CPU in full dynticks global kick nohz: Optimize full dynticks's sched hooks with static keys nohz: Optimize full dynticks state checks with static keys nohz: Rename a few state variables vtime: Always debug check snapshot source _before_ updating it vtime: Always scale generic vtime accounting results vtime: Optimize full dynticks accounting off case with static keys vtime: Describe overriden functions in dedicated arch headers m68k: hardirq_count() only need preempt_mask.h hardirq: Split preempt count mask definitions context_tracking: Split low level state headers vtime: Fix racy cputime delta update vtime: Remove a few unneeded generic vtime state checks context_tracking: User/kernel broundary cross trace events context_tracking: Optimize context switch off case with static keys context_tracking: Optimize guest APIs off case with static key context_tracking: Optimize main APIs off case with static key context_tracking: Ground setup for static key use context_tracking: Remove full dynticks' hacky dependency on wide context tracking nohz: Only enable context tracking on full dynticks CPUs ...
| * | | | | | | nohz: Include local CPU in full dynticks global kickFrederic Weisbecker2013-08-161-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | tick_nohz_full_kick_all() is useful to notify all full dynticks CPUs that there is a system state change to checkout before re-evaluating the need for the tick. Unfortunately this is implemented using smp_call_function_many() that ignores the local CPU. This CPU also needs to re-evaluate the tick. on_each_cpu_mask() is not useful either because we don't want to re-evaluate the tick state in place but asynchronously from an IPI to avoid messing up with any random locking scenario. So lets call tick_nohz_full_kick() from tick_nohz_full_kick_all() so that the usual irq work takes care of it. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Kevin Hilman <khilman@linaro.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1375460996-16329-4-git-send-email-fweisbec@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | | | Merge branch 'timers/nohz-v3' of ↵Ingo Molnar2013-08-1422-328/+544
| |\ \ \ \ \ \ \ | | |_|_|_|_|/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks into timers/nohz Pull nohz improvements from Frederic Weisbecker: " It mostly contains fixes and full dynticks off-case optimizations. I believe that distros want to enable this feature so it seems important to optimize the case where the "nohz_full=" parameter is empty. ie: I'm trying to remove any performance regression that comes with NO_HZ_FULL=y when the feature is not used. This patchset improves the current situation a lot (off-case appears to be around 11% faster with hackbench, although I guess it may vary depending on the configuration but it should be significantly faster in any case) now there is still some work to do: I can still observe a remaining loss of 1.6% throughput seen with hackbench compared to CONFIG_NO_HZ_FULL=n. " Signed-off-by: Ingo Molnar <mingo@kernel.org>
| | * | | | | | nohz: Optimize full dynticks's sched hooks with static keysFrederic Weisbecker2013-08-142-8/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Scheduler IPIs and task context switches are serious fast path. Let's try to hide as much as we can the impact of full dynticks APIs' off case that are called on these sites through the use of static keys. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Kevin Hilman <khilman@linaro.org>
| | * | | | | | nohz: Optimize full dynticks state checks with static keysFrederic Weisbecker2013-08-142-14/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These APIs are frequenctly accessed and priority is given to optimize the full dynticks off-case in order to let distros enable this feature without suffering from significant performance regressions. Let's inline these APIs and optimize them with static keys. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Kevin Hilman <khilman@linaro.org>
| | * | | | | | nohz: Rename a few state variablesFrederic Weisbecker2013-08-141-22/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rename the full dynticks's cpumask and cpumask state variables to some more exportable names. These will be used later from global headers to optimize the main full dynticks APIs in conjunction with static keys. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Kevin Hilman <khilman@linaro.org>
| | * | | | | | vtime: Always debug check snapshot source _before_ updating itFrederic Weisbecker2013-08-141-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The vtime delta update performed by get_vtime_delta() always check that the source of the snapshot is valid. Meanhile the snapshot updaters that rely on get_vtime_delta() also set the new snapshot origin. But some of them do this right before the call to get_vtime_delta(), making its debug check useless. This is easily fixable by moving the snapshot origin update after the call to get_vtime_delta(). The order doesn't matter there. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Kevin Hilman <khilman@linaro.org>
| | * | | | | | vtime: Always scale generic vtime accounting resultsFrederic Weisbecker2013-08-141-6/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The cputime accounting in full dynticks can be a subtle mixup of CPUs using tick based accounting and others using generic vtime. As long as the tick can have a share on producing these stats, we want to scale the result against CFS precise accounting as the tick can miss some task hiding between the periodic interrupt. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Kevin Hilman <khilman@linaro.org>
| | * | | | | | vtime: Optimize full dynticks accounting off case with static keysFrederic Weisbecker2013-08-143-31/+67
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If no CPU is in the full dynticks range, we can avoid the full dynticks cputime accounting through generic vtime along with its overhead and use the traditional tick based accounting instead. Let's do this and nope the off case with static keys. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Kevin Hilman <khilman@linaro.org>