diff options
author | Andi Kleen <andi@firstfloor.org> | 2009-02-12 13:49:36 +0100 |
---|---|---|
committer | H. Peter Anvin <hpa@zytor.com> | 2009-02-24 22:41:00 +0100 |
commit | 88ccbedd9ca85d1aca6a6f99df48dce87b7c02d4 (patch) | |
tree | 9951e6f3554789523006f187e69286f5ed541b50 /arch/x86/include/asm/mce.h | |
parent | x86, mce, cmci: define MSR names and fields for new CMCI registers (diff) | |
download | linux-88ccbedd9ca85d1aca6a6f99df48dce87b7c02d4.tar.xz linux-88ccbedd9ca85d1aca6a6f99df48dce87b7c02d4.zip |
x86, mce, cmci: add CMCI support
Impact: Major new feature
Intel CMCI (Corrected Machine Check Interrupt) is a new
feature on Nehalem CPUs. It allows the CPU to trigger
interrupts on corrected events, which allows faster
reaction to them instead of with the traditional
polling timer.
Also use CMCI to discover shared banks. Machine check banks
can be shared by CPU threads or even cores. Using the CMCI enable
bit it is possible to detect the fact that another CPU already
saw a specific bank. Use this to assign shared banks only
to one CPU to avoid reporting duplicated events.
On CPU hot unplug bank sharing is re discovered. This is done
using a thread that cycles through all the CPUs.
To avoid races between the poller and CMCI we only poll
for banks that are not CMCI capable and only check CMCI
owned banks on a interrupt.
The shared banks ownership information is currently only used for
CMCI interrupts, not polled banks.
The sharing discovery code follows the algorithm recommended in the
IA32 SDM Vol3a 14.5.2.1
The CMCI interrupt handler just calls the machine check poller to
pick up the machine check event that caused the interrupt.
I decided not to implement a separate threshold event like
the AMD version has, because the threshold is always one currently
and adding another event didn't seem to add any value.
Some code inspired by Yunhong Jiang's Xen implementation,
which was in term inspired by a earlier CMCI implementation
by me.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Diffstat (limited to 'arch/x86/include/asm/mce.h')
-rw-r--r-- | arch/x86/include/asm/mce.h | 10 |
1 files changed, 10 insertions, 0 deletions
diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h index 6fc5e07eca4f..563933e06a35 100644 --- a/arch/x86/include/asm/mce.h +++ b/arch/x86/include/asm/mce.h @@ -105,8 +105,16 @@ extern void (*threshold_cpu_callback)(unsigned long action, unsigned int cpu); #ifdef CONFIG_X86_MCE_INTEL void mce_intel_feature_init(struct cpuinfo_x86 *c); +void cmci_clear(void); +void cmci_reenable(void); +void cmci_rediscover(int dying); +void cmci_recheck(void); #else static inline void mce_intel_feature_init(struct cpuinfo_x86 *c) { } +static inline void cmci_clear(void) {} +static inline void cmci_reenable(void) {} +static inline void cmci_rediscover(int dying) {} +static inline void cmci_recheck(void) {} #endif #ifdef CONFIG_X86_MCE_AMD @@ -115,6 +123,8 @@ void mce_amd_feature_init(struct cpuinfo_x86 *c); static inline void mce_amd_feature_init(struct cpuinfo_x86 *c) { } #endif +extern int mce_available(struct cpuinfo_x86 *c); + void mce_log_therm_throt_event(__u64 status); extern atomic_t mce_entry; |