author | Lars Ellenberg <lars.ellenberg@linbit.com> | 2014-01-27 15:58:22 +0100 |
---|---|---|
committer | Philipp Reisner <philipp.reisner@linbit.com> | 2014-07-10 18:34:50 +0200 |
commit | 5ab7d2c005135849cf0bb1485d954c98f2cca57c (patch) | |
tree | 43340069d199c864871c04f9672639415a7bb8fa /drivers/block/drbd/drbd_state.c | |
parent | drbd: fix a race stopping the worker thread (diff) | |
drbd: fix resync finished detection
This fixes one recent regression
and one long-standing bug.
The bug:
drbd_try_clear_on_disk_bm() assumed that all "count" bits have to be
accounted in the resync extent corresponding to the start sector.
Since we allow application requests to cross our "extent" boundaries,
this assumption is no longer true, resulting in possible misaccounting,
scary messages
("BAD! sector=12345s enr=6 rs_left=-7 rs_failed=0 count=58 cstate=..."),
and potentially, if the last bit to be cleared during resync resided
in a previously misaccounted resync extent, the resync would never be
recognized as finished and would stay "stalled" forever, even though
all blocks are in sync again and all bits have been cleared.
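To make the misaccounting concrete, here is a minimal user-space sketch, not the kernel code. The names `BITS_PER_EXTENT`, `NR_EXTENTS`, `rs_left[]`, `clear_naive()` and `clear_split()` are hypothetical, and the extent size is a toy value chosen for readability. It contrasts charging all "count" bits to the extent of the start bit with splitting the range at extent boundaries.

```c
#include <assert.h>

/* Hypothetical toy values, not DRBD's real constants. */
#define BITS_PER_EXTENT 64UL
#define NR_EXTENTS 4UL

/* Out-of-sync bits still expected per extent (toy stand-in for rs_left). */
static long rs_left[NR_EXTENTS];

/* Old assumption: all "count" bits belong to the extent of the first bit. */
static void clear_naive(unsigned long first_bit, unsigned long count)
{
	unsigned long enr = first_bit / BITS_PER_EXTENT;

	assert(enr < NR_EXTENTS);
	rs_left[enr] -= (long)count;
}

/* Split the range at extent boundaries and charge each extent its share. */
static void clear_split(unsigned long first_bit, unsigned long count)
{
	while (count) {
		unsigned long enr = first_bit / BITS_PER_EXTENT;
		/* bits remaining in this extent, starting at first_bit */
		unsigned long room = BITS_PER_EXTENT - first_bit % BITS_PER_EXTENT;
		unsigned long n = count < room ? count : room;

		assert(enr < NR_EXTENTS);
		rs_left[enr] -= (long)n;
		first_bit += n;
		count -= n;
	}
}
```

With 6 bits outstanding in extent 0 and 52 in extent 1, clearing 58 bits starting at bit 58 drives extent 0 negative under `clear_naive()` and leaves extent 1 untouched, while `clear_split()` brings both to zero.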
The regression was introduced by the commit
"drbd: get rid of atomic update on disk bitmap works".
For an "empty" resync (rs_total == 0), we must not "finish" the
resync on the SyncSource before the SyncTarget knows all relevant
information (sync uuid). We need to wait for the full round-trip;
the SyncTarget will then explicitly notify us.
Also for normal, non-empty resyncs (rs_total > 0), the resync-finished
condition needs to be tested before the schedule() in wait_for_work, or
it is likely to be missed.
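The two cases can be summarized as a small predicate. This is only a sketch under stated assumptions: `toy_source_may_finish()` and its parameters are hypothetical names, and the rule "a non-empty resync is finished once no bits are left" is a simplification made for illustration, not something the commit message spells out.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: may the SyncSource treat the resync as finished?
 *   rs_total        - bits that were out of sync when the resync started
 *   rs_left         - bits still out of sync now
 *   target_notified - the SyncTarget has explicitly reported completion
 */
static bool toy_source_may_finish(unsigned long rs_total,
				  unsigned long rs_left,
				  bool target_notified)
{
	assert(rs_left <= rs_total);

	/* Empty resync: wait for the full round-trip to the SyncTarget. */
	if (rs_total == 0)
		return target_notified;

	/* Assumption: a non-empty resync is done once nothing is left. */
	return rs_left == 0;
}
```

In a worker loop, a check like this would have to run before the thread goes to sleep; evaluating it only after being woken risks missing a resync that completed just before the sleep.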
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Diffstat (limited to 'drivers/block/drbd/drbd_state.c')
-rw-r--r-- | drivers/block/drbd/drbd_state.c | 3 |
1 files changed, 3 insertions, 0 deletions
diff --git a/drivers/block/drbd/drbd_state.c b/drivers/block/drbd/drbd_state.c
index 19da7c7590cd..1bddd6cf8ac7 100644
--- a/drivers/block/drbd/drbd_state.c
+++ b/drivers/block/drbd/drbd_state.c
@@ -1011,6 +1011,9 @@ __drbd_set_state(struct drbd_device *device, union drbd_state ns,
 		atomic_inc(&device->local_cnt);
 
 	did_remote = drbd_should_do_remote(device->state);
+	if (!is_sync_state(os.conn) && is_sync_state(ns.conn))
+		clear_bit(RS_DONE, &device->flags);
+
 	device->state.i = ns.i;
 	should_do_remote = drbd_should_do_remote(device->state);
 	device->resource->susp = ns.susp;
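The hunk clears `RS_DONE` only on the edge from a non-sync to a sync connection state, so a stale flag from an earlier resync cannot end a new one prematurely. The following user-space sketch mimics that edge detection. Every `toy_`/`TOY_` name is hypothetical, the set of states treated as "sync states" is an assumption, and a plain bit mask stands in for the kernel's atomic `clear_bit()`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced set of connection states. */
enum toy_conn {
	TOY_CONNECTED,
	TOY_SYNC_SOURCE,
	TOY_SYNC_TARGET,
	TOY_PAUSED_SYNC_S,
	TOY_PAUSED_SYNC_T,
};

#define TOY_RS_DONE (1UL << 0)

struct toy_device {
	enum toy_conn conn;
	unsigned long flags;
};

/* Assumption: running and paused resync both count as sync states. */
static bool toy_is_sync_state(enum toy_conn c)
{
	return c == TOY_SYNC_SOURCE || c == TOY_SYNC_TARGET ||
	       c == TOY_PAUSED_SYNC_S || c == TOY_PAUSED_SYNC_T;
}

/* Apply a new state; drop a stale "done" flag only when a resync begins. */
static void toy_set_state(struct toy_device *d, enum toy_conn ns)
{
	enum toy_conn os;

	assert(d);
	os = d->conn;
	if (!toy_is_sync_state(os) && toy_is_sync_state(ns))
		d->flags &= ~TOY_RS_DONE;	/* non-atomic stand-in for clear_bit() */
	d->conn = ns;
}
```

Transitions between two sync states (for example from running to paused) and transitions out of a sync state leave the flag as it is; only entering a resync from outside resets it.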