mm/memblock: add extra "flags" to memblock to allow selection of memory based on attribute

Some high end Intel Xeon systems report uncorrectable memory errors as a
recoverable machine check.  Linux has included code for some time to
process these and just signal the affected processes (or even recover
completely if the error was in a read only page that can be replaced by
reading from disk).

But we have no recovery path for errors encountered during kernel code
execution.  Except for some very specific cases we are unlikely to ever
be able to recover.

Enter memory mirroring.  Actually the 3rd generation of memory mirroring.

Gen1: All memory is mirrored
    Pro: No s/w enabling - h/w just gets good data from other side of
         the mirror
    Con: Halves effective memory capacity available to OS/applications

Gen2: Partial memory mirror - just mirror memory behind some memory
      controllers
    Pro: Keep more of the capacity
    Con: Nightmare to enable.  Have to choose between allocating from
         mirrored memory for safety vs. NUMA local memory for performance

Gen3: Address range partial memory mirror - some mirror on each memory
      controller
    Pro: Can tune the amount of mirror and keep NUMA performance
    Con: I have to write memory management code to implement

The current plan is just to use mirrored memory for kernel allocations.
This has been broken into two phases:

1) This patch series - find the mirrored memory, use it for boot time
   allocations

2) Wade into mm/page_alloc.c and define a ZONE_MIRROR to pick up the
   unused mirrored memory from mm/memblock.c and only give it out to
   select kernel allocations (this is still being scoped because
   page_alloc.c is scary).

This patch (of 3):

Add extra "flags" to memblock to allow selection of memory based on
attribute.  No functional changes.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Cc: Xishi Qiu <qiuxishi@huawei.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Xiexiuqi <xiexiuqi@huawei.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
author: Tony Luck <tony.luck@intel.com> 2015-06-25 01:58:09 +0200
committer: Linus Torvalds <torvalds@linux-foundation.org> 2015-06-25 02:49:44 +0200
commit: fc6daaf93151877748f8096af6b3fddb147f22d6 (patch)
tree: 1892f34cca08d40af6598bccae87c42037c5ea80 /mm/nobootmem.c
parent: mm: do not ignore mapping_gfp_mask in page cache allocation paths (diff)
1 files changed, 4 insertions, 2 deletions
diff --git a/mm/nobootmem.c b/mm/nobootmem.c
index 90b50468333e..ad3641dcdbe7 100644
--- a/mm/nobootmem.c
+++ b/mm/nobootmem.c
@@ -41,7 +41,8 @@ static void * __init __alloc_memory_core_early(int nid, u64 size, u64 align,
 	if (limit > memblock.current_limit)
 		limit = memblock.current_limit;
 
-	addr = memblock_find_in_range_node(size, align, goal, limit, nid);
+	addr = memblock_find_in_range_node(size, align, goal, limit, nid,
+					   MEMBLOCK_NONE);
 	if (!addr)
 		return NULL;
 
@@ -121,7 +122,8 @@ static unsigned long __init free_low_memory_core_early(void)
 
 	memblock_clear_hotplug(0, -1);
 
-	for_each_free_mem_range(i, NUMA_NO_NODE, &start, &end, NULL)
+	for_each_free_mem_range(i, NUMA_NO_NODE, MEMBLOCK_NONE, &start, &end,
+				NULL)
 		count += __free_memory_core(start, end);
 
 #ifdef CONFIG_ARCH_DISCARD_MEMBLOCK
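
To illustrate what "selection of memory based on attribute" can look like, here is a minimal userspace sketch. It is not the kernel's memblock code: struct demo_region, demo_find(), demo_map and the DEMO_* flag values are all hypothetical names made up for this example. It assumes a region matches when it carries every requested attribute bit, so a request with no bits set matches any region.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical attribute flags for this sketch; not the kernel's values. */
enum demo_flags {
	DEMO_NONE   = 0x0,	/* no attribute requested, any region matches */
	DEMO_MIRROR = 0x2,	/* region is backed by mirrored memory */
};

/* Hypothetical stand-in for a memory region descriptor. */
struct demo_region {
	uint64_t base;
	uint64_t size;
	unsigned long flags;
};

/* Example memory map: only the middle region is marked as mirrored. */
static const struct demo_region demo_map[] = {
	{ 0x00100000ULL, 0x3ff00000ULL, DEMO_NONE   },
	{ 0x40000000ULL, 0x10000000ULL, DEMO_MIRROR },
	{ 0x50000000ULL, 0x30000000ULL, DEMO_NONE   },
};

/*
 * Return the base of the first region that is at least 'size' bytes long
 * and carries every attribute bit set in 'flags'.  Returns 0 when no
 * region qualifies.
 */
static uint64_t demo_find(const struct demo_region *r, size_t n,
			  uint64_t size, unsigned long flags)
{
	for (size_t i = 0; i < n; i++) {
		/* Skip regions missing any requested attribute. */
		if ((r[i].flags & flags) != flags)
			continue;
		if (r[i].size >= size)
			return r[i].base;
	}
	return 0;
}
```

In this sketch, passing DEMO_NONE gives the same result as an unfiltered search. That mirrors why the patch above can be described as having no functional changes: both converted call sites pass MEMBLOCK_NONE.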