diff options
author | Matthew Wilcox <mawilcox@microsoft.com> | 2016-12-15 00:08:49 +0100 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2016-12-15 01:04:10 +0100 |
commit | 148deab223b23734069abcacb5c7118b0e7deadc (patch) | |
tree | 56aae9c91802e9262312f7fcbb3571eb9c8ec0e9 /mm/shmem.c | |
parent | btrfs: fix race in btrfs_free_dummy_fs_info() (diff) | |
download | linux-148deab223b23734069abcacb5c7118b0e7deadc.tar.xz linux-148deab223b23734069abcacb5c7118b0e7deadc.zip |
radix-tree: improve multiorder iterators
This fixes several interlinked problems with the iterators in the
presence of multiorder entries.
1. radix_tree_iter_next() would only advance by one slot, which would
result in the iterators returning the same entry more than once if
there were sibling entries.
2. radix_tree_next_slot() could return an internal pointer instead of
a user pointer if a tagged multiorder entry was immediately followed by
an entry of lower order.
3. radix_tree_next_slot() expanded to a lot more code than it used to
when multiorder support was compiled in. And I wasn't comfortable with
entry_to_node() being in a header file.
Fixing radix_tree_iter_next() for the presence of sibling entries
necessarily involves examining the contents of the radix tree, so we now
need to pass 'slot' to radix_tree_iter_next(), and we need to change the
calling convention so it is called *before* dropping the lock which
protects the tree. Also rename it to radix_tree_iter_resume(), as some
people thought it was necessary to call radix_tree_iter_next() each time
around the loop.
radix_tree_next_slot() becomes closer to how it looked before multiorder
support was introduced. It only checks to see if the next entry in the
chunk is a sibling entry or a pointer to a node; this should be rare
enough that handling this case out of line is not a performance impact
(and such impact is amortised by the fact that the entry we just
processed was a multiorder entry). Also, radix_tree_next_slot() used to
force a new chunk lookup for untagged entries, which is more expensive
than the out of line sibling entry skipping.
Link: http://lkml.kernel.org/r/1480369871-5271-55-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Diffstat (limited to 'mm/shmem.c')
-rw-r--r-- | mm/shmem.c | 6 |
1 files changed, 3 insertions, 3 deletions
diff --git a/mm/shmem.c b/mm/shmem.c index abd7403aba41..be11c6dd414a 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -661,8 +661,8 @@ unsigned long shmem_partial_swap_usage(struct address_space *mapping, swapped++; if (need_resched()) { + slot = radix_tree_iter_resume(slot, &iter); cond_resched_rcu(); - slot = radix_tree_iter_next(&iter); } } @@ -2447,8 +2447,8 @@ static void shmem_tag_pins(struct address_space *mapping) } if (need_resched()) { + slot = radix_tree_iter_resume(slot, &iter); cond_resched_rcu(); - slot = radix_tree_iter_next(&iter); } } rcu_read_unlock(); @@ -2517,8 +2517,8 @@ static int shmem_wait_for_pins(struct address_space *mapping) spin_unlock_irq(&mapping->tree_lock); continue_resched: if (need_resched()) { + slot = radix_tree_iter_resume(slot, &iter); cond_resched_rcu(); - slot = radix_tree_iter_next(&iter); } } rcu_read_unlock(); |