author     Hugh Dickins <hughd@google.com>                2022-02-15 03:37:29 +0100
committer  Matthew Wilcox (Oracle) <willy@infradead.org>  2022-02-17 17:59:22 +0100
commit     2fbb0c10d1e8222604132b3a3f81bfd8345a44b6 (patch)
tree       0ae993e92b32cd206e33b54abafac6902c69d4ee /mm/internal.h
parent     mm/munlock: delete smp_mb() from __pagevec_lru_add_fn() (diff)
mm/munlock: mlock_page() munlock_page() batch by pagevec
A weakness of the page->mlock_count approach is the need to take the lruvec
lock while holding the page table lock. That is not an overhead we would
allow on normal pages, but I think it is acceptable just for pages in an
mlocked area. Even so, let's try to amortize the extra cost by gathering
pages on a per-cpu pagevec before acquiring the lruvec lock.
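To make the batching idea concrete, here is a minimal userspace sketch of the
pattern described above: gather pages in a fixed-size per-thread batch and
take the contended lock only once per batch. The names, the batch size of 15
and the pthread mutex standing in for the lruvec lock are all illustrative
assumptions, not the mm/mlock.c code.

```c
/* Userspace model of the batching idea: names, batch size and the mutex
 * standing in for the lruvec lock are illustrative assumptions only. */
#include <pthread.h>
#include <stdio.h>

#define BATCH 15			/* hypothetical pagevec capacity */

struct page { int id; };

static pthread_mutex_t lruvec_lock = PTHREAD_MUTEX_INITIALIZER;

/* One batch per thread, standing in for the per-cpu pagevec. */
static __thread struct page *batch[BATCH];
static __thread int batch_count;

/* Take the "lruvec lock" once and process the whole batch under it. */
static void drain_batch(void)
{
	pthread_mutex_lock(&lruvec_lock);
	for (int i = 0; i < batch_count; i++)
		printf("moving page %d to the unevictable list\n", batch[i]->id);
	batch_count = 0;
	pthread_mutex_unlock(&lruvec_lock);
}

/* Called per page: cheap append; the lock is only taken when the batch fills. */
static void mlock_page_batched(struct page *page)
{
	batch[batch_count++] = page;
	if (batch_count == BATCH)
		drain_batch();
}

int main(void)
{
	struct page pages[40];

	for (int i = 0; i < 40; i++) {
		pages[i].id = i;
		mlock_page_batched(&pages[i]);
	}
	drain_batch();		/* flush the partial batch */
	return 0;
}
```

In this model 40 pages cost three lock acquisitions instead of 40, which is
the whole point of gathering on the pagevec first.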
I have an unverified conjecture that the mlock pagevec might work out
well for delaying the mlock processing of new file pages until they have
got off lru_cache_add()'s pagevec and on to LRU.
The initialization of page->mlock_count is subject to races and awkward:
0 or !!PageMlocked or 1? Was it wrong even in the implementation before
this commit, which just widens the window? I haven't gone back to think
it through. Maybe someone can point out a better way to initialize it.
Bringing lru_cache_add_inactive_or_unevictable()'s mlock initialization
into mm/mlock.c has helped: mlock_new_page(), using the mlock pagevec,
rather than lru_cache_add()'s pagevec.
Experimented with various orderings: the right thing seems to be for
mlock_page() and mlock_new_page() to TestSetPageMlocked before adding to
pagevec, but munlock_page() to leave TestClearPageMlocked to the later
pagevec processing.
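A small model of that ordering (again an illustrative userspace sketch with
invented names, not the kernel implementation): the mlock side test-and-sets
the flag before queuing, so an already-mlocked page is not queued a second
time, while the munlock side queues unconditionally and leaves the
test-and-clear to the deferred batch processing.

```c
/* Userspace sketch of the described flag ordering; names are illustrative
 * and this is not the mm/mlock.c code. */
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

struct page {
	int id;
	atomic_bool mlocked;		/* stands in for PG_mlocked */
};

enum op { OP_MLOCK, OP_MUNLOCK };

#define BATCH 15
static struct { struct page *page; enum op op; } pending[BATCH];
static int pending_count;

static void process_pending(void);

static void queue(struct page *page, enum op op)
{
	pending[pending_count].page = page;
	pending[pending_count].op = op;
	if (++pending_count == BATCH)
		process_pending();
}

/* mlock side: test-and-set *before* queuing, so an already-mlocked page
 * is not queued (and not counted) a second time. */
static void mlock_page_model(struct page *page)
{
	if (!atomic_exchange(&page->mlocked, true))
		queue(page, OP_MLOCK);
}

/* munlock side: queue unconditionally; the test-and-clear is left to the
 * deferred batch processing. */
static void munlock_page_model(struct page *page)
{
	queue(page, OP_MUNLOCK);
}

static void process_pending(void)
{
	for (int i = 0; i < pending_count; i++) {
		struct page *page = pending[i].page;

		if (pending[i].op == OP_MLOCK)
			printf("page %d: move to unevictable\n", page->id);
		else if (atomic_exchange(&page->mlocked, false))
			printf("page %d: cleared mlocked, rescue from unevictable\n",
			       page->id);
	}
	pending_count = 0;
}

int main(void)
{
	struct page p = { .id = 0 };

	mlock_page_model(&p);
	mlock_page_model(&p);		/* flag already set: not queued again */
	munlock_page_model(&p);
	process_pending();		/* flush the partial batch */
	return 0;
}
```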
Dropped the VM_BUG_ON_PAGE(PageTail)s this time around: they have made
their point, and the thp_nr_pages() calls already contain a
VM_BUG_ON_PGFLAGS() for that.
This still leaves acquiring lruvec locks under page table lock each time
the pagevec fills (or a THP is added): which I suppose is rather silly,
since the pages sit on the pagevec waiting to be processed long after the
page table lock has been dropped; but I'm disinclined to uglify the calling
sequence until some load shows an actual problem with it (nothing wrong with
taking the lruvec lock under page table lock, just "nicer" to do it less).
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Diffstat (limited to 'mm/internal.h')
-rw-r--r--  mm/internal.h | 9
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/mm/internal.h b/mm/internal.h
index b3f0dd3ffba2..18af980bb1b8 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -402,7 +402,8 @@ extern int mlock_future_check(struct mm_struct *mm, unsigned long flags,
  *
  * mlock is usually called at the end of page_add_*_rmap(),
  * munlock at the end of page_remove_rmap(); but new anon
- * pages are managed in lru_cache_add_inactive_or_unevictable().
+ * pages are managed by lru_cache_add_inactive_or_unevictable()
+ * calling mlock_new_page().
  *
  * @compound is used to include pmd mappings of THPs, but filter out
  * pte mappings of THPs, which cannot be consistently counted: a pte
@@ -425,6 +426,9 @@ static inline void munlock_vma_page(struct page *page,
 	    (compound || !PageTransCompound(page)))
 		munlock_page(page);
 }
+void mlock_new_page(struct page *page);
+bool need_mlock_page_drain(int cpu);
+void mlock_page_drain(int cpu);
 
 extern pmd_t maybe_pmd_mkwrite(pmd_t pmd, struct vm_area_struct *vma);
@@ -503,6 +507,9 @@ static inline void mlock_vma_page(struct page *page,
 					struct vm_area_struct *vma, bool compound) { }
 static inline void munlock_vma_page(struct page *page,
 					struct vm_area_struct *vma, bool compound) { }
+static inline void mlock_new_page(struct page *page) { }
+static inline bool need_mlock_page_drain(int cpu) { return false; }
+static inline void mlock_page_drain(int cpu) { }
 static inline void vunmap_range_noflush(unsigned long start, unsigned long end) { }
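The three declarations added above (mlock_new_page(), need_mlock_page_drain()
and mlock_page_drain()) suggest the usual per-cpu drain pattern: queue work on
the local CPU's pagevec, report whether a given CPU has work pending, and
flush it on demand. The sketch below models only that pattern in userspace;
the per-"CPU" array, batch size and flush logic are assumptions, not the
actual mm/mlock.c implementation.

```c
/* Userspace model of the drain pattern behind the three declarations above;
 * everything here is an illustrative assumption. */
#include <stdbool.h>
#include <stdio.h>

#define NR_CPUS 4
#define BATCH   15

struct page { int id; };

static struct page *pvec[NR_CPUS][BATCH];	/* stands in for the per-cpu pagevec */
static int pvec_count[NR_CPUS];

/* New pages destined for an mlocked VMA go onto the owning CPU's batch. */
static void mlock_new_page_model(int cpu, struct page *page)
{
	pvec[cpu][pvec_count[cpu]++] = page;
	/* a full batch would be flushed here, as in the earlier sketch */
}

/* Tell callers whether this CPU has pending mlock work to flush. */
static bool need_mlock_page_drain_model(int cpu)
{
	return pvec_count[cpu] != 0;
}

/* Flush one CPU's pending batch, taking the expensive lock once. */
static void mlock_page_drain_model(int cpu)
{
	for (int i = 0; i < pvec_count[cpu]; i++)
		printf("cpu %d: processing deferred page %d\n",
		       cpu, pvec[cpu][i]->id);
	pvec_count[cpu] = 0;
}

int main(void)
{
	struct page p = { .id = 42 };

	mlock_new_page_model(1, &p);
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (need_mlock_page_drain_model(cpu))
			mlock_page_drain_model(cpu);
	return 0;
}
```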