summaryrefslogtreecommitdiffstats
diff options
context:
space:
mode:
authorWaiman Long <Waiman.Long@hp.com>2014-08-07 01:05:38 +0200
committerLinus Torvalds <torvalds@linux-foundation.org>2014-08-07 03:01:17 +0200
commit3a79d52aa3c63c939f5a1f86e80e634f84e987c4 (patch)
treeea352191eb8bc6afbc98ac3b54d573c1db615c78
parentmm, thp: move invariant bug check out of loop in __split_huge_page_map (diff)
downloadlinux-3a79d52aa3c63c939f5a1f86e80e634f84e987c4.tar.xz
linux-3a79d52aa3c63c939f5a1f86e80e634f84e987c4.zip
mm, thp: replace smp_mb after atomic_add by smp_mb__after_atomic
In some architectures like x86, atomic_add() is a full memory barrier. In that case, an additional smp_mb() is just a waste of time. This patch replaces that smp_mb() by smp_mb__after_atomic() which will avoid the redundant memory barrier in some architectures. With a 3.16-rc1 based kernel, this patch reduced the execution time of breaking 1000 transparent huge pages from 38,245us to 30,964us. A reduction of 19% which is quite sizeable. It also reduces the %cpu time of the __split_huge_page_refcount function in the perf profile from 2.18% to 1.15%. Signed-off-by: Waiman Long <Waiman.Long@hp.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Rik van Riel <riel@redhat.com> Cc: Scott J Norton <scott.norton@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-rw-r--r--mm/huge_memory.c2
1 files changed, 1 insertions, 1 deletions
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 2161490526f0..4b95ff4120f5 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1681,7 +1681,7 @@ static void __split_huge_page_refcount(struct page *page,
&page_tail->_count);
/* after clearing PageTail the gup refcount can be released */
- smp_mb();
+ smp_mb__after_atomic();
/*
* retain hwpoison flag of the poisoned tail page: