From 816e3599ca9b9bbfdc456433cc707e75f2c31104 Mon Sep 17 00:00:00 2001
From: Dave Chinner
Date: Tue, 13 Aug 2024 09:39:38 +0200
Subject: xfs: don't free post-EOF blocks on read close

When we have a workload that does open/read/close in parallel with other
allocation, the file becomes rapidly fragmented. This is due to close()
calling xfs_file_release() and removing the speculative preallocation
beyond EOF.

Add a check for a writable context to xfs_file_release to skip the
post-EOF block freeing (and the similarly pointless flushing on truncate
down).

Before:

Test 1: sync write fragmentation counts

/mnt/scratch/file.0: 919
/mnt/scratch/file.1: 916
/mnt/scratch/file.2: 919
/mnt/scratch/file.3: 920
/mnt/scratch/file.4: 920
/mnt/scratch/file.5: 921
/mnt/scratch/file.6: 916
/mnt/scratch/file.7: 918

After:

Test 1: sync write fragmentation counts

/mnt/scratch/file.0: 24
/mnt/scratch/file.1: 24
/mnt/scratch/file.2: 11
/mnt/scratch/file.3: 24
/mnt/scratch/file.4: 3
/mnt/scratch/file.5: 24
/mnt/scratch/file.6: 24
/mnt/scratch/file.7: 23

Signed-off-by: Dave Chinner
[darrick: wordsmithing, fix commit message]
Signed-off-by: Darrick J. Wong
[hch: ported to the new ->release code structure]
Signed-off-by: Christoph Hellwig
Reviewed-by: Darrick J. Wong
Signed-off-by: Chandan Babu R
---
 fs/xfs/xfs_file.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index dae8dd122355..60424e642307 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -1217,12 +1217,18 @@ xfs_file_release(
 	 * There is no point in freeing blocks here for open but unlinked files
 	 * as they will be taken care of by the inactivation path soon.
 	 *
+	 * When releasing a read-only context, don't flush data or trim post-EOF
+	 * blocks. This avoids open/read/close workloads from removing EOF
+	 * blocks that other writers depend upon to reduce fragmentation.
+	 *
 	 * If we can't get the iolock just skip truncating the blocks past EOF
 	 * because we could deadlock with the mmap_lock otherwise. We'll get
 	 * another chance to drop them once the last reference to the inode is
 	 * dropped, so we'll never leak blocks permanently.
 	 */
-	if (inode->i_nlink && xfs_ilock_nowait(ip, XFS_IOLOCK_EXCL)) {
+	if (inode->i_nlink &&
+	    (file->f_mode & FMODE_WRITE) &&
+	    xfs_ilock_nowait(ip, XFS_IOLOCK_EXCL)) {
 		if (xfs_can_free_eofblocks(ip) &&
 		    !xfs_iflags_test(ip, XFS_IDIRTY_RELEASE)) {
 			/*
-- 
cgit v1.2.3
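
Illustrative note (not part of the patch): the condition changed in the hunk above can be modelled in user space as a pure predicate. This is only a sketch under stated assumptions. `struct release_ctx`, `MODEL_FMODE_WRITE`, `MODEL_FMODE_READ` and `model_should_try_eof_trim()` are hypothetical names invented for this note, the lock attempt is reduced to a boolean, and the follow-up checks inside the branch (`xfs_can_free_eofblocks()`, `XFS_IDIRTY_RELEASE`) are deliberately left out.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's f_mode bits; the values are
 * chosen for this model only and are not taken from the patch. */
#define MODEL_FMODE_READ  0x1u
#define MODEL_FMODE_WRITE 0x2u

/* Hypothetical snapshot of the state the patched condition looks at. */
struct release_ctx {
	unsigned int nlink;       /* models inode->i_nlink */
	unsigned int f_mode;      /* models file->f_mode */
	bool iolock_available;    /* models xfs_ilock_nowait() succeeding */
};

/*
 * Returns true when the release path would enter the branch that may
 * trim post-EOF blocks: the inode is still linked, the file was opened
 * for writing, and the iolock could be taken without blocking.
 */
static bool model_should_try_eof_trim(const struct release_ctx *c)
{
	return c->nlink != 0 &&
	       (c->f_mode & MODEL_FMODE_WRITE) != 0 &&
	       c->iolock_available;
}
```

Under this model a read-only close never reaches the trimming branch, whatever the link count or lock state, which is the behaviour the commit message describes. Before the patch, the middle term was absent, so a read-only close on a linked inode with an available iolock would have entered the branch.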