diff options
author | Jan Kara <jack@suse.cz> | 2018-10-30 23:10:47 +0100 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2018-10-31 16:54:17 +0100 |
commit | f2c57d91b0d96aa13ccff4e3b178038f17b00658 (patch) | |
tree | b703f6e9b565550cb11eb1cb9f21a33d396a6257 /mm/memory.c | |
parent | memory-hotplug.rst: add some details about locking internals (diff) | |
download | linux-f2c57d91b0d96aa13ccff4e3b178038f17b00658.tar.xz linux-f2c57d91b0d96aa13ccff4e3b178038f17b00658.zip |
mm: Fix warning in insert_pfn()
In DAX mode a write pagefault can race with write(2) in the following
way:
CPU0 CPU1
write fault for mapped zero page (hole)
dax_iomap_rw()
iomap_apply()
xfs_file_iomap_begin()
- allocates blocks
dax_iomap_actor()
invalidate_inode_pages2_range()
- invalidates radix tree entries in given range
dax_iomap_pte_fault()
grab_mapping_entry()
- no entry found, creates empty
...
xfs_file_iomap_begin()
- finds already allocated block
...
vmf_insert_mixed_mkwrite()
- WARNs and does nothing because there
is still zero page mapped in PTE
unmap_mapping_pages()
This race results in WARN_ON from insert_pfn() and is occasionally
triggered by fstest generic/344. Note that the race is otherwise
harmless as before write(2) on CPU0 is finished, we will invalidate page
tables properly and thus user of mmap will see modified data from
write(2) from that point on. So just restrict the warning only to the
case when the PFN in PTE is not zero page.
Link: http://lkml.kernel.org/r/20180824154542.26872-1-jack@suse.cz
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Diffstat (limited to 'mm/memory.c')
-rw-r--r-- | mm/memory.c | 9 |
1 files changed, 7 insertions, 2 deletions
diff --git a/mm/memory.c b/mm/memory.c index 072139579d89..4ad2d293ddc2 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -1537,10 +1537,15 @@ static vm_fault_t insert_pfn(struct vm_area_struct *vma, unsigned long addr, * in may not match the PFN we have mapped if the * mapped PFN is a writeable COW page. In the mkwrite * case we are creating a writable PTE for a shared - * mapping and we expect the PFNs to match. + * mapping and we expect the PFNs to match. If they + * don't match, we are likely racing with block + * allocation and mapping invalidation so just skip the + * update. */ - if (WARN_ON_ONCE(pte_pfn(*pte) != pfn_t_to_pfn(pfn))) + if (pte_pfn(*pte) != pfn_t_to_pfn(pfn)) { + WARN_ON_ONCE(!is_zero_pfn(pte_pfn(*pte))); goto out_unlock; + } entry = *pte; goto out_mkwrite; } else |