summaryrefslogtreecommitdiffstats
path: root/fs/btrfs/tree-log.h
diff options
context:
space:
mode:
authorFilipe Manana <fdmanana@suse.com>2014-09-05 16:14:39 +0200
committerChris Mason <clm@fb.com>2014-09-19 15:57:51 +0200
commit8407f553268a4611f2542ed90677f0edfaa2c9c4 (patch)
tree2d0c17df51443c5d5e2cde9c079362e840f5c637 /fs/btrfs/tree-log.h
parentBtrfs: fix fsync race leading to invalid data after log replay (diff)
downloadlinux-8407f553268a4611f2542ed90677f0edfaa2c9c4.tar.xz
linux-8407f553268a4611f2542ed90677f0edfaa2c9c4.zip
Btrfs: fix data corruption after fast fsync and writeback error
When we do a fast fsync, we start all ordered operations and then while they're running in parallel we visit the list of modified extent maps and construct their matching file extent items and write them to the log btree. After that, in btrfs_sync_log() we wait for all the ordered operations to finish (via btrfs_wait_logged_extents). The problem with this is that we were completely ignoring errors that can happen in the extent write path, such as -ENOSPC, a temporary -ENOMEM or -EIO errors for example. When such error happens, it means we have parts of the on disk extent that weren't written to, and so we end up logging file extent items that point to these extents that contain garbage/random data - so after a crash/reboot plus log replay, we get our inode's metadata pointing to those extents. This worked in contrast with the full (non-fast) fsync path, where we start all ordered operations, wait for them to finish and then write to the log btree. In this path, after each ordered operation completes we check if it's flagged with an error (BTRFS_ORDERED_IOERR) and return -EIO if so (via btrfs_wait_ordered_range). So if an error happens with any ordered operation, just return a -EIO error to userspace, so that it knows that not all of its previous writes were durably persisted and the application can take proper action (like redo the writes for e.g.) - and definitely not leave any file extent items in the log refer to non fully written extents. Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
Diffstat (limited to 'fs/btrfs/tree-log.h')
-rw-r--r--fs/btrfs/tree-log.h2
1 files changed, 2 insertions, 0 deletions
diff --git a/fs/btrfs/tree-log.h b/fs/btrfs/tree-log.h
index e2e798ae7cd7..154990c26dcb 100644
--- a/fs/btrfs/tree-log.h
+++ b/fs/btrfs/tree-log.h
@@ -28,6 +28,7 @@
struct btrfs_log_ctx {
int log_ret;
int log_transid;
+ int io_err;
struct list_head list;
};
@@ -35,6 +36,7 @@ static inline void btrfs_init_log_ctx(struct btrfs_log_ctx *ctx)
{
ctx->log_ret = 0;
ctx->log_transid = 0;
+ ctx->io_err = 0;
INIT_LIST_HEAD(&ctx->list);
}