summaryrefslogtreecommitdiffstats
path: root/drivers/md/raid5.c
diff options
context:
space:
mode:
authorShaohua Li <shli@fb.com>2015-08-13 23:31:59 +0200
committerNeilBrown <neilb@suse.com>2015-10-24 08:16:19 +0200
commitf6bed0ef0a808164f51197de062e0450ce6c1f96 (patch)
tree9fc2ec276f40ef36eaa14e295f543595472b56a9 /drivers/md/raid5.c
parentraid5: add a new state for stripe log handling (diff)
downloadlinux-f6bed0ef0a808164f51197de062e0450ce6c1f96.tar.xz
linux-f6bed0ef0a808164f51197de062e0450ce6c1f96.zip
raid5: add basic stripe log
This introduces a simple log for raid5. Data/parity writing to raid array first writes to the log, then write to raid array disks. If crash happens, we can recovery data from the log. This can speed up raid resync and fix write hole issue. The log structure is pretty simple. Data/meta data is stored in block unit, which is 4k generally. It has only one type of meta data block. The meta data block can track 3 types of data, stripe data, stripe parity and flush block. MD superblock will point to the last valid meta data block. Each meta data block has checksum/seq number, so recovery can scan the log correctly. We store a checksum of stripe data/parity to the metadata block, so meta data and stripe data/parity can be written to log disk together. otherwise, meta data write must wait till stripe data/parity is finished. For stripe data, meta data block will record stripe data sector and size. Currently the size is always 4k. This meta data record can be made simpler if we just fix write hole (eg, we can record data of a stripe's different disks together), but this format can be extended to support caching in the future, which must record data address/size. For stripe parity, meta data block will record stripe sector. It's size should be 4k (for raid5) or 8k (for raid6). We always store p parity first. This format should work for caching too. flush block indicates a stripe is in raid array disks. Fixing write hole doesn't need this type of meta data, it's for caching extension. Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: NeilBrown <neilb@suse.com>
Diffstat (limited to 'drivers/md/raid5.c')
-rw-r--r--drivers/md/raid5.c4
1 files changed, 4 insertions, 0 deletions
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 4b789f1f4550..64a256538ff7 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -895,6 +895,8 @@ static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s)
might_sleep();
+ if (r5l_write_stripe(conf->log, sh) == 0)
+ return;
for (i = disks; i--; ) {
int rw;
int replace_only = 0;
@@ -3495,6 +3497,7 @@ returnbi:
WARN_ON(test_bit(R5_SkipCopy, &dev->flags));
WARN_ON(dev->page != dev->orig_page);
}
+
if (!discard_pending &&
test_bit(R5_Discard, &sh->dev[sh->pd_idx].flags)) {
clear_bit(R5_Discard, &sh->dev[sh->pd_idx].flags);
@@ -5745,6 +5748,7 @@ static int handle_active_stripes(struct r5conf *conf, int group,
for (i = 0; i < batch_size; i++)
handle_stripe(batch[i]);
+ r5l_write_stripe_run(conf->log);
cond_resched();