author | Brian Foster <bfoster@redhat.com> | 2019-02-01 18:14:24 +0100 |
---|---|---|
committer | Darrick J. Wong <darrick.wong@oracle.com> | 2019-02-12 01:07:01 +0100 |
commit | c2b3164320b51a535d7c7a6acdcee255edbb22cf (patch) | |
tree | 29d264749af4414018655c5fbba5cf3d859eb15a /fs/xfs/libxfs | |
parent | xfs: create delalloc bmapi wrapper for full extent allocation (diff) | |
xfs: use the latest extent at writeback delalloc conversion time

The writeback delalloc conversion code is racy with respect to
changes in the currently cached file mapping outside of the current
page. This is because the ilock is cycled between the time the
caller originally looked up the mapping and across each real
allocation of the provided file range. This code has collected
various hacks over the years to help combat the symptoms of these
races (e.g., truncate race detection, allocation into hole
detection, etc.), but none address the fundamental problem that the
imap may not be valid at allocation time.

Rather than continue to use race detection hacks, update writeback
delalloc conversion to a model that explicitly converts the delalloc
extent backing the current file offset being processed. The current
file offset is the only block we can trust to remain once the ilock
is dropped because any operation that can remove the block
(truncate, hole punch, etc.) must flush and discard pagecache pages
first.

Modify xfs_iomap_write_allocate() to use the
xfs_bmapi_convert_delalloc() mechanism to request allocation of the
entire delalloc extent
backing the current offset instead of assuming the extent passed by
the caller is unchanged. Record the range specified by the caller
and apply it to the resulting allocated extent so previous checks by
the caller for COW fork overlap are not lost. Finally, overload the
bmapi delalloc flag with the range reval flag behavior since this is
the only use case for both.

This ensures that writeback always picks up the correct
and current extent associated with the page, regardless of races
with other extent modifying operations. If operating on a data fork
and the COW overlap state has changed since the ilock was cycled,
the caller revalidates against the COW fork sequence number before
using the imap for the next block.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
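
The caller-side steps described in the last two paragraphs of the commit message (apply the caller's recorded range to the allocated extent, then revalidate against a fork sequence number) can be sketched roughly as follows. This is a standalone illustration, not the kernel code: `struct toy_extent`, `toy_trim_extent()` and `toy_imap_valid()` are hypothetical names, and the sequence check is reduced to a plain counter comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified extent: covers [startoff, startoff + blockcount). */
struct toy_extent {
	uint64_t startoff;
	uint64_t blockcount;
};

/*
 * Clip ext to the range the caller recorded, [start, start + count).
 * Returns false and leaves ext untouched if the two do not overlap.
 */
static bool toy_trim_extent(struct toy_extent *ext, uint64_t start,
			    uint64_t count)
{
	uint64_t ext_end = ext->startoff + ext->blockcount;
	uint64_t end = start + count;
	uint64_t new_start = ext->startoff > start ? ext->startoff : start;
	uint64_t new_end = ext_end < end ? ext_end : end;

	if (new_start >= new_end)
		return false;
	ext->startoff = new_start;
	ext->blockcount = new_end - new_start;
	return true;
}

/*
 * A cached mapping is treated as usable only while the fork's sequence
 * counter still equals the value sampled when the mapping was produced.
 */
static bool toy_imap_valid(unsigned int cached_seq, unsigned int fork_seq)
{
	return cached_seq == fork_seq;
}
```

In this sketch, a converted extent covering blocks 10 through 29 that is trimmed to a caller range of 4 blocks starting at block 16 ends up covering blocks 16 through 19, so whatever the caller decided about the blocks outside that range is not overridden.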
Diffstat (limited to 'fs/xfs/libxfs')
-rw-r--r-- | fs/xfs/libxfs/xfs_bmap.c | 16 |
-rw-r--r-- | fs/xfs/libxfs/xfs_bmap.h | 2 |
2 files changed, 7 insertions, 11 deletions
diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index c629004d9a4c..f4a65330a2a9 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -4296,15 +4296,14 @@ xfs_bmapi_write(
 	bma.datatype = 0;
 
 	/*
-	 * The reval flag means the caller wants to allocate the entire delalloc
-	 * extent backing bno where bno may not necessarily match the startoff.
-	 * Now that we've looked up the extent, reset the range to map based on
-	 * the extent in the file. If we're in a hole, this may be an error so
-	 * don't adjust anything.
+	 * The delalloc flag means the caller wants to allocate the entire
+	 * delalloc extent backing bno where bno may not necessarily match the
+	 * startoff. Now that we've looked up the extent, reset the range to
+	 * map based on the extent in the file. If we're in a hole, this may be
+	 * an error so don't adjust anything.
 	 */
-	if ((flags & XFS_BMAPI_REVALRANGE) &&
+	if ((flags & XFS_BMAPI_DELALLOC) &&
 	    !eof && bno >= bma.got.br_startoff) {
-		ASSERT(flags & XFS_BMAPI_DELALLOC);
 		bno = bma.got.br_startoff;
 		len = bma.got.br_blockcount;
 #ifdef DEBUG
@@ -4495,10 +4494,9 @@ xfs_bmapi_convert_delalloc(
 		flags |= XFS_BMAPI_COWFORK | XFS_BMAPI_PREALLOC;
 
 	/*
-	 * The reval flag means to allocate the entire extent; pass a dummy
+	 * The delalloc flag means to allocate the entire extent; pass a dummy
 	 * length of 1.
 	 */
-	flags |= XFS_BMAPI_REVALRANGE;
 	error = xfs_bmapi_write(tp, ip, offset_fsb, 1, flags, total, imap,
 				&nimaps);
 	if (!error && !nimaps)
diff --git a/fs/xfs/libxfs/xfs_bmap.h b/fs/xfs/libxfs/xfs_bmap.h
index 75586d56f7a5..4dc7d1a02b35 100644
--- a/fs/xfs/libxfs/xfs_bmap.h
+++ b/fs/xfs/libxfs/xfs_bmap.h
@@ -107,8 +107,6 @@ struct xfs_extent_free_item
 /* Do not update the rmap btree. Used for reconstructing bmbt from rmapbt. */
 #define XFS_BMAPI_NORMAP	0x2000
 
-#define XFS_BMAPI_REVALRANGE	0x4000
-
 #define XFS_BMAPI_FLAGS \
 	{ XFS_BMAPI_ENTIRE,	"ENTIRE" }, \
 	{ XFS_BMAPI_METADATA,	"METADATA" }, \
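
The range reset in the first xfs_bmapi_write() hunk can be looked at in isolation with a small userspace model. The sketch below is an illustration under simplifying assumptions, not the kernel implementation: `TOY_BMAPI_DELALLOC`, `struct toy_extent` and `toy_reset_range()` are hypothetical stand-ins, and `got` is assumed to be the extent that a lookup returned for `bno` (either containing it or starting after it).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for XFS_BMAPI_DELALLOC; the value is arbitrary. */
#define TOY_BMAPI_DELALLOC	0x1u

/* Hypothetical, simplified extent: covers [startoff, startoff + blockcount). */
struct toy_extent {
	uint64_t startoff;
	uint64_t blockcount;
};

/*
 * Model of the patched condition: with the delalloc flag set, and a lookup
 * that found an extent starting at or before bno, widen the request to the
 * whole extent. At EOF, or in a hole (the extent starts after bno), the
 * request is left untouched so the caller can decide whether it is an error.
 */
static void toy_reset_range(unsigned int flags, bool eof,
			    const struct toy_extent *got,
			    uint64_t *bno, uint64_t *len)
{
	if ((flags & TOY_BMAPI_DELALLOC) && !eof && *bno >= got->startoff) {
		*bno = got->startoff;
		*len = got->blockcount;
	}
}
```

With `got` covering 16 blocks starting at block 100, a request for one block at offset 105 (the dummy length of 1 passed by xfs_bmapi_convert_delalloc()) is widened to start 100, length 16. The same request without the flag, or one that lands in a hole before the extent, is left as it was.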