author: Dave Chinner <dchinner@redhat.com> 2016-07-27 00:21:50 +0200
committer: Linus Torvalds <torvalds@linux-foundation.org> 2016-07-27 01:19:19 +0200
commit: 6c60d2b5746cf23025ffe71bd7ff9075048fc90c
tree: 6794888cf362aa86c079ed5697c1ef7a6c117a1a /mm
parent: ocfs2/cluster: clean up unnecessary assignment for 'ret'
fs/fs-writeback.c: add a new writeback list for sync
wait_sb_inodes() currently walks all inodes in the filesystem to find dirty ones to wait on during sync. This is highly inefficient and wastes a lot of CPU when there are lots of clean cached inodes that we don't need to wait on.

To avoid this "all inode" walk, we need to track inodes that are currently under writeback that we need to wait for. We do this by adding inodes to a writeback list on the sb when the mapping is first tagged as having pages under writeback. wait_sb_inodes() can then walk this list of "inodes under IO" and wait specifically just for the inodes that the current sync(2) needs to wait for.

Define a couple of helpers to add/remove an inode from the writeback list and call them when the overall mapping is tagged for or cleared from writeback. Update wait_sb_inodes() to walk only the inodes under writeback due to the sync.

With this change, filesystem sync times are significantly reduced for filesystems with heavily populated inode caches and no other work to do. For example, on a 16-CPU 2GHz x86-64 server with a 10TB XFS filesystem and a ~10 million entry inode cache, sync times are reduced from ~7.3s to less than 0.1s when the filesystem is fully clean.

Link: http://lkml.kernel.org/r/1466594593-6757-2-git-send-email-bfoster@redhat.com
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Tested-by: Holger Hoffstätte <holger.hoffstaette@applied-asynchrony.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
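
The tracking scheme described in the message can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the kernel implementation: every `toy_` name is hypothetical, a per-inode page counter stands in for the PAGECACHE_TAG_WRITEBACK tag on the mapping, and all locking is omitted. It shows an inode joining the per-sb list only when its first page goes under writeback, leaving when its last page completes, and a sync-side walk that visits tracked inodes only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel structures; every name here is hypothetical. */
struct toy_inode {
	struct toy_inode *prev, *next;	/* linkage on the per-sb writeback list */
	bool on_wb_list;
	unsigned int pages_under_wb;	/* models PAGECACHE_TAG_WRITEBACK being set */
};

struct toy_sb {
	struct toy_inode head;		/* sentinel of a circular doubly linked list */
};

static void toy_sb_init(struct toy_sb *sb)
{
	sb->head.prev = sb->head.next = &sb->head;
}

/* Writeback starts on one page of the inode's mapping. */
static void toy_set_page_writeback(struct toy_sb *sb, struct toy_inode *inode)
{
	/* Sample the "tag" before setting it, as the patch does with on_wblist. */
	bool was_tagged = inode->pages_under_wb > 0;

	inode->pages_under_wb++;
	if (!was_tagged) {
		/* First page under writeback: append the inode to the list. */
		inode->next = &sb->head;
		inode->prev = sb->head.prev;
		sb->head.prev->next = inode;
		sb->head.prev = inode;
		inode->on_wb_list = true;
	}
}

/* Writeback completes on one page of the inode's mapping. */
static void toy_clear_page_writeback(struct toy_inode *inode)
{
	assert(inode->pages_under_wb > 0);
	if (--inode->pages_under_wb == 0) {
		/* Last page done: the mapping is no longer tagged, so unlink. */
		inode->prev->next = inode->next;
		inode->next->prev = inode->prev;
		inode->prev = inode->next = NULL;
		inode->on_wb_list = false;
	}
}

/* Sync-side walk: touches only tracked inodes and returns how many it saw. */
static size_t toy_wait_sb_inodes(const struct toy_sb *sb)
{
	size_t visited = 0;
	const struct toy_inode *i;

	for (i = sb->head.next; i != &sb->head; i = i->next)
		visited++;	/* a real implementation would wait for IO here */
	return visited;
}
```

With this shape, the cost of the sync-side walk depends on the number of inodes with IO in flight rather than on the size of the inode cache, which is the effect the message attributes to the change.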
Diffstat (limited to 'mm')
-rw-r--r-- mm/page-writeback.c 18
1 file changed, 18 insertions, 0 deletions
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index e2481949494c..8195eb454411 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2747,6 +2747,11 @@ int test_clear_page_writeback(struct page *page)
 				__wb_writeout_inc(wb);
 			}
 		}
+
+		if (mapping->host && !mapping_tagged(mapping,
+						     PAGECACHE_TAG_WRITEBACK))
+			sb_clear_inode_writeback(mapping->host);
+
 		spin_unlock_irqrestore(&mapping->tree_lock, flags);
 	} else {
 		ret = TestClearPageWriteback(page);
@@ -2774,11 +2779,24 @@ int __test_set_page_writeback(struct page *page, bool keep_write)
 		spin_lock_irqsave(&mapping->tree_lock, flags);
 		ret = TestSetPageWriteback(page);
 		if (!ret) {
+			bool on_wblist;
+
+			on_wblist = mapping_tagged(mapping,
+						   PAGECACHE_TAG_WRITEBACK);
+
 			radix_tree_tag_set(&mapping->page_tree,
 						page_index(page),
 						PAGECACHE_TAG_WRITEBACK);
 			if (bdi_cap_account_writeback(bdi))
 				__inc_wb_stat(inode_to_wb(inode), WB_WRITEBACK);
+
+			/*
+			 * We can come through here when swapping anonymous
+			 * pages, so we don't necessarily have an inode to track
+			 * for sync.
+			 */
+			if (mapping->host && !on_wblist)
+				sb_mark_inode_writeback(mapping->host);
 		}
 		if (!PageDirty(page))
 			radix_tree_tag_clear(&mapping->page_tree,