NUMA: optimize detection of memory with no node id assigned by firmware

Sanity check that makes sure the nodes cover all memory loops over numa_meminfo to count the pages that have node id assigned by the firmware, then loops again over memblock.memory to find the total amount of memory and in the end checks that the difference between the total memory and memory that covered by nodes is less than some threshold. Worse, the loop over numa_meminfo calls __absent_pages_in_range() that also partially traverses memblock.memory. It's much simpler and more efficient to have a single traversal of memblock.memory that verifies that amount of memory not covered by nodes is less than a threshold. Introduce memblock_validate_numa_coverage() that does exactly that and use it instead of numa_meminfo_cover_memory(). Link: https://lkml.kernel.org/r/20231026020329.327329-1-zhiguangni01@gmail.com Signed-off-by: Liam Ni <zhiguangni01@gmail.com> Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Bibo Mao <maobibo@loongson.cn> Cc: Binbin Zhou <zhoubinbin@loongson.cn> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Feiyang Chen <chenfeiyang@loongson.cn> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: WANG Xuerui <kernel@xen0n.name> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
author: Liam Ni <zhiguangni01@gmail.com> 2023-10-26 04:03:29 +0200
committer: Andrew Morton <akpm@linux-foundation.org> 2023-12-11 01:51:34 +0100
commit: ff6c3d81f2e86b63a3a530683f89ef393882782a (patch)
tree: 2c2638f84588f6a313b8cb77aceaaefb4b9f14ed /arch/x86/mm
parent: mm: huge_memory: batch tlb flush when splitting a pte-mapped THP (diff)
download: linux-ff6c3d81f2e86b63a3a530683f89ef393882782a.tar.xz
linux-ff6c3d81f2e86b63a3a530683f89ef393882782a.zip
1 files changed, 2 insertions, 32 deletions
diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
index b29ceb19e46e..adc497b93f03 100644
--- a/arch/x86/mm/numa.c
+++ b/arch/x86/mm/numa.c
@@ -450,37 +450,6 @@ int __node_distance(int from, int to)
 EXPORT_SYMBOL(__node_distance);
 
 /*
- * Sanity check to catch more bad NUMA configurations (they are amazingly
- * common).  Make sure the nodes cover all memory.
- */
-static bool __init numa_meminfo_cover_memory(const struct numa_meminfo *mi)
-{
-	u64 numaram, e820ram;
-	int i;
-
-	numaram = 0;
-	for (i = 0; i < mi->nr_blks; i++) {
-		u64 s = mi->blk[i].start >> PAGE_SHIFT;
-		u64 e = mi->blk[i].end >> PAGE_SHIFT;
-		numaram += e - s;
-		numaram -= __absent_pages_in_range(mi->blk[i].nid, s, e);
-		if ((s64)numaram < 0)
-			numaram = 0;
-	}
-
-	e820ram = max_pfn - absent_pages_in_range(0, max_pfn);
-
-	/* We seem to lose 3 pages somewhere. Allow 1M of slack. */
-	if ((s64)(e820ram - numaram) >= (1 << (20 - PAGE_SHIFT))) {
-		printk(KERN_ERR "NUMA: nodes only cover %LuMB of your %LuMB e820 RAM. Not used.\n",
-		       (numaram << PAGE_SHIFT) >> 20,
-		       (e820ram << PAGE_SHIFT) >> 20);
-		return false;
-	}
-	return true;
-}
-
-/*
  * Mark all currently memblock-reserved physical memory (which covers the
  * kernel's own memory ranges) as hot-unswappable.
  */
@@ -585,7 +554,8 @@ static int __init numa_register_memblks(struct numa_meminfo *mi)
 			return -EINVAL;
 		}
 	}
-	if (!numa_meminfo_cover_memory(mi))
+
+	if (!memblock_validate_numa_coverage(SZ_1M))
 		return -EINVAL;
 
 	/* Finally register nodes. */
author	Liam Ni <zhiguangni01@gmail.com>	2023-10-26 04:03:29 +0200
committer	Andrew Morton <akpm@linux-foundation.org>	2023-12-11 01:51:34 +0100
commit	ff6c3d81f2e86b63a3a530683f89ef393882782a (patch)
tree	2c2638f84588f6a313b8cb77aceaaefb4b9f14ed /arch/x86/mm
parent	mm: huge_memory: batch tlb flush when splitting a pte-mapped THP (diff)
download	linux-ff6c3d81f2e86b63a3a530683f89ef393882782a.tar.xz linux-ff6c3d81f2e86b63a3a530683f89ef393882782a.zip