diff options
author | Chris Mason <chris.mason@oracle.com> | 2010-12-13 20:56:23 +0100 |
---|---|---|
committer | Chris Mason <chris.mason@oracle.com> | 2010-12-14 02:06:52 +0100 |
commit | cd02dca56442e1504fd6bc5b96f7f1870162b266 (patch) | |
tree | 1a38d99fc581974ba6d8136c42ca81f3b1216ea3 /fs/btrfs/volumes.c | |
parent | Btrfs: EIO when we fail to read tree roots (diff) | |
download | linux-cd02dca56442e1504fd6bc5b96f7f1870162b266.tar.xz linux-cd02dca56442e1504fd6bc5b96f7f1870162b266.zip |
Btrfs: account for missing devices in RAID allocation profiles
When we mount in RAID degraded mode without adding a new device to
replace the failed one, we can end up using the wrong RAID flags for
allocations.
This results in strange combinations of block groups (raid1 in a raid10
filesystem) and corruptions when we try to allocate blocks from single
spindle chunks on drives that are actually missing.
The first device has two small 4MB chunks in it that mkfs creates and
these are usually unused in a raid1 or raid10 setup. But, in -o degraded,
the allocator will fall back to these because the mask of desired raid groups
isn't correct.
The fix here is to count the missing devices as we build up the list
of devices in the system. This count is used when picking the
raid level to make sure we continue using the same levels that were
in place before we lost a drive.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Diffstat (limited to 'fs/btrfs/volumes.c')
-rw-r--r-- | fs/btrfs/volumes.c | 20 |
1 files changed, 19 insertions, 1 deletions
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 91851b555e2e..177b73179590 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -413,12 +413,16 @@ static noinline int device_list_add(const char *path, device->fs_devices = fs_devices; fs_devices->num_devices++; - } else if (strcmp(device->name, path)) { + } else if (!device->name || strcmp(device->name, path)) { name = kstrdup(path, GFP_NOFS); if (!name) return -ENOMEM; kfree(device->name); device->name = name; + if (device->missing) { + fs_devices->missing_devices--; + device->missing = 0; + } } if (found_transid > fs_devices->latest_trans) { @@ -1238,6 +1242,9 @@ int btrfs_rm_device(struct btrfs_root *root, char *device_path) device->fs_devices->num_devices--; + if (device->missing) + root->fs_info->fs_devices->missing_devices--; + next_device = list_entry(root->fs_info->fs_devices->devices.next, struct btrfs_device, dev_list); if (device->bdev == root->fs_info->sb->s_bdev) @@ -3084,7 +3091,9 @@ static struct btrfs_device *add_missing_dev(struct btrfs_root *root, device->devid = devid; device->work.func = pending_bios_fn; device->fs_devices = fs_devices; + device->missing = 1; fs_devices->num_devices++; + fs_devices->missing_devices++; spin_lock_init(&device->io_lock); INIT_LIST_HEAD(&device->dev_alloc_list); memcpy(device->uuid, dev_uuid, BTRFS_UUID_SIZE); @@ -3282,6 +3291,15 @@ static int read_one_dev(struct btrfs_root *root, device = add_missing_dev(root, devid, dev_uuid); if (!device) return -ENOMEM; + } else if (!device->missing) { + /* + * this happens when a device that was properly setup + * in the device info lists suddenly goes bad. + * device->bdev is NULL, and so we have to set + * device->missing to one here + */ + root->fs_info->fs_devices->missing_devices++; + device->missing = 1; } } |