author | Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> | 2024-07-30 14:12:21 +0200 |
---|---|---|
committer | Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> | 2024-11-04 10:29:52 +0100 |
commit | 07ad253044f5cf7b9cc5883f0d0a1cdb9ec42821 (patch) | |
tree | 9d61b6f6ddc660b397fa72e024252d9733386422 /mdmon.h | |
parent | monitor: Add DS_EXTERNAL_BB flag (diff) | |
mdmon: delegate removal to managemon
Starting from [1], the kernel requires the suspend lock on the member drive
remove path. This causes a deadlock with external management because the
monitor thread may be blocked on suspend and unable to switch the array to
active, for example if a badblock is reported at that time.
Removal is now a blocking action, so it must be delegated to the managemon
thread, but we must ensure that the monitor updates the metadata first,
right after detecting the faulty state.
This patch adds the appropriate support. The monitor thread detects "faulty"
and updates the metadata. After that, it asks the manager thread to
remove the device. The manager must be careful because closing descriptors
used by select() may lead to an abort with -D_FORTIFY_SOURCE=2. First, it
must ensure that the device descriptors are not used by the monitor.
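The ordering described above (monitor updates metadata, then asks the manager to remove) can be sketched as a small flag handshake. This is only an illustration under stated assumptions: `struct member`, `struct array_sketch`, `monitor_step()`, `manager_step()` and the `busy_retries` counter are hypothetical stand-ins and not the real mdmon code; only the `check_member_remove` flag name comes from this patch. The sketch is single-threaded and leaves out the descriptor handling around select().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins; not the real mdmon structures. */
struct member {
	bool faulty;           /* kernel reported the device as faulty */
	bool metadata_updated; /* monitor recorded the failure in metadata */
	bool removed;          /* device has been removed from the array */
	int busy_retries;      /* simulated number of failing remove attempts */
};

struct array_sketch {
	bool check_member_remove : 1; /* set by monitor, read by manager */
	struct member dev;
};

/* Monitor side: never blocks; updates metadata, then raises the flag. */
static void monitor_step(struct array_sketch *a)
{
	if (a->dev.faulty && !a->dev.metadata_updated) {
		a->dev.metadata_updated = true;
		a->check_member_remove = true;
	}
}

/*
 * Manager side: performs the potentially blocking removal.
 * Returns true when nothing is left to remove, false when a retry is needed.
 */
static bool manager_step(struct array_sketch *a)
{
	if (!a->check_member_remove)
		return true;

	/* The monitor must have acknowledged the faulty state first. */
	assert(a->dev.metadata_updated);

	if (a->dev.busy_retries > 0) {
		a->dev.busy_retries--;
		return false; /* keep the flag set and retry on the next wakeup */
	}

	a->dev.removed = true;
	a->check_member_remove = false;
	return true;
}
```

In this sketch the manager keeps the flag set while removal fails, so a caller that gates recovery on the flag being clear would wait until the device is gone.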
The number of remove retries is unlimited, and recovery is blocked
until all failed drives are removed. This is safe because a "faulty"
device is no longer used by MD.
The issue will also be mitigated by an optimization on the badblock recording
path in the kernel. It will check that the device is not failed before a
badblock is recorded, but relying on this is not correct in principle.
Userspace must keep compatibility with the kernel and, since removal is a
blocking action, we must treat it as a blocking action.
[1] kernel commit cfa078c8b80d ("md: use new apis to suspend array
for adding/removing rdev from state_store()")
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Diffstat (limited to 'mdmon.h')
-rw-r--r-- | mdmon.h | 10 |
1 files changed, 8 insertions, 2 deletions
@@ -48,8 +48,14 @@ struct active_array {
 	enum array_state prev_state, curr_state, next_state;
 	enum sync_action prev_action, curr_action, next_action;
 
-	int check_degraded; /* flag set by mon, read by manage */
-	int check_reshape; /* flag set by mon, read by manage */
+	bool check_degraded : 1; /* flag set by mon, read by manage */
+	bool check_reshape : 1; /* flag set by mon, read by manage */
+
+	/**
+	 * Signalize managemon there is a mdi to be removed.
+	 * Monitor must acknowledge faulty state first.
+	 */
+	bool check_member_remove : 1;
 };
 
 /*
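The hunk turns the two `int` flags into one-bit `bool` bit-fields and adds a third. A minimal sketch of what that means for assignments is given below. The commit message does not state the motivation for the change, so this only shows standard C behaviour. `struct flags_sketch` and `set_and_read()` are hypothetical names used for illustration.

```c
#include <stdbool.h>

/* Hypothetical struct mirroring only the three flags from the hunk. */
struct flags_sketch {
	bool check_degraded : 1;
	bool check_reshape : 1;
	bool check_member_remove : 1;
};

/* Store an arbitrary integer into a bool bit-field and read it back. */
static bool set_and_read(int raw)
{
	struct flags_sketch f = { 0 };

	/* Conversion to bool maps any nonzero value to true. */
	f.check_member_remove = raw;
	return f.check_member_remove;
}
```

With a plain `int : 1` bit-field the stored value would instead depend on truncation and on the implementation-defined signedness of the field, so the `bool` form is the more predictable choice for flags.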