[PATCH] .text page fault SMP scalability optimization

We had a problem on ppc64 where, with more than 4 threads, a large system wouldn't scale well while faulting in the .text (most of the time was spent in the kernel even though it was a compute-intensive userland app). The reason is the useless overwrite of the same pte from all CPUs.

I fixed it this way (verified on an older kernel, but the forward port is almost identical). This will benefit all archs, not just ppc64.

Signed-off-by: Andrea Arcangeli <andrea@suse.de>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
author: Andrea Arcangeli <andrea@suse.de> 2005-10-30 02:16:48 +0100
committer: Linus Torvalds <torvalds@g5.osdl.org> 2005-10-30 05:40:43 +0100
commit: 1a44e149084d772a1bcf4cdbdde8a013a8a1cfde (patch)
tree: b3f682ce8df89edb9740fdd5c178df5accc49736 /mm/memory.c
parent: [PATCH] hugetlb: overcommit accounting check (diff)
1 files changed, 16 insertions, 4 deletions
diff --git a/mm/memory.c b/mm/memory.c
index d68421dd64ef..0f60baf6f69b 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1980,9 +1980,10 @@ static inline int handle_pte_fault(struct mm_struct *mm,
 		pte_t *pte, pmd_t *pmd, int write_access)
 {
 	pte_t entry;
+	pte_t old_entry;
 	spinlock_t *ptl;
 
-	entry = *pte;
+	old_entry = entry = *pte;
 	if (!pte_present(entry)) {
 		if (pte_none(entry)) {
 			if (!vma->vm_ops || !vma->vm_ops->nopage)
@@ -2009,9 +2010,20 @@ static inline int handle_pte_fault(struct mm_struct *mm,
 		entry = pte_mkdirty(entry);
 	}
 	entry = pte_mkyoung(entry);
-	ptep_set_access_flags(vma, address, pte, entry, write_access);
-	update_mmu_cache(vma, address, entry);
-	lazy_mmu_prot_update(entry);
+	if (!pte_same(old_entry, entry)) {
+		ptep_set_access_flags(vma, address, pte, entry, write_access);
+		update_mmu_cache(vma, address, entry);
+		lazy_mmu_prot_update(entry);
+	} else {
+		/*
+		 * This is needed only for protection faults but the arch code
+		 * is not yet telling us if this is a protection fault or not.
+		 * This still avoids useless tlb flushes for .text page faults
+		 * with threads.
+		 */
+		if (write_access)
+			flush_tlb_page(vma, address);
+	}
 unlock:
 	pte_unmap_unlock(pte, ptl);
 	return VM_FAULT_MINOR;
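
The effect of the pte_same() check above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not kernel code: the model_* names, the flag values and the two counters are hypothetical, a pte is reduced to a plain word of flag bits, and the write-protect/COW handling that the real handle_pte_fault() performs on write faults is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: a pte is just a word of flag bits. */
typedef unsigned long model_pte_t;

#define MODEL_PTE_PRESENT 0x1UL
#define MODEL_PTE_YOUNG   0x2UL
#define MODEL_PTE_DIRTY   0x4UL

/* Counters standing in for the expensive operations (assumptions). */
static unsigned long pte_writes;   /* stores to the shared pte */
static unsigned long tlb_flushes;  /* stand-in for flush_tlb_page() */

static bool model_pte_same(model_pte_t a, model_pte_t b)
{
	return a == b;
}

/*
 * Mirrors the tail of the patched handle_pte_fault(): compute the
 * desired entry, and touch the shared pte only if it would change.
 */
static void model_handle_fault(model_pte_t *pte, bool write_access)
{
	model_pte_t old_entry, entry;

	assert(pte != NULL);
	old_entry = entry = *pte;

	if (write_access)
		entry |= MODEL_PTE_DIRTY;   /* like pte_mkdirty() */
	entry |= MODEL_PTE_YOUNG;           /* like pte_mkyoung() */

	if (!model_pte_same(old_entry, entry)) {
		*pte = entry;               /* the store the patch avoids repeating */
		pte_writes++;
	} else if (write_access) {
		tlb_flushes++;              /* possibly a stale TLB entry */
	}
}
```

If eight threads take a read fault on the same present .text pte, only the first call in this model stores to the pte; the other seven find it already young and leave it alone, which is the redundant work the commit message describes.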