diff options
author | David Teigland <teigland@redhat.com> | 2012-04-26 22:54:29 +0200 |
---|---|---|
committer | David Teigland <teigland@redhat.com> | 2012-05-02 21:15:27 +0200 |
commit | 4875647a08e35f77274838d97ca8fa44158d50e2 (patch) | |
tree | bf8a39eaf3219af5d661ed3e347545306fd84bda /fs/dlm/rcom.c | |
parent | dlm: improve error and debug messages (diff) | |
download | linux-4875647a08e35f77274838d97ca8fa44158d50e2.tar.xz linux-4875647a08e35f77274838d97ca8fa44158d50e2.zip |
dlm: fixes for nodir mode
The "nodir" mode (statically assign master nodes instead
of using the resource directory) has always been highly
experimental, and never seriously used. This commit
fixes a number of problems, making nodir much more usable.
- Major change to recovery: recover all locks and restart
all in-progress operations after recovery. In some
cases it's not possible to know which in-progess locks
to recover, so recover all. (Most require recovery
in nodir mode anyway since rehashing changes most
master nodes.)
- Change the way nodir mode is enabled, from a command
line mount arg passed through gfs2, into a sysfs
file managed by dlm_controld, consistent with the
other config settings.
- Allow recovering MSTCPY locks on an rsb that has not
yet been turned into a master copy.
- Ignore RCOM_LOCK and RCOM_LOCK_REPLY recovery messages
from a previous, aborted recovery cycle. Base this
on the local recovery status not being in the state
where any nodes should be sending LOCK messages for the
current recovery cycle.
- Hold rsb lock around dlm_purge_mstcpy_locks() because it
may run concurrently with dlm_recover_master_copy().
- Maintain highbast on process-copy lkb's (in addition to
the master as is usual), because the lkb can switch
back and forth between being a master and being a
process copy as the master node changes in recovery.
- When recovering MSTCPY locks, flag rsb's that have
non-empty convert or waiting queues for granting
at the end of recovery. (Rename flag from LOCKS_PURGED
to RECOVER_GRANT and similar for the recovery function,
because it's not only resources with purged locks
that need grant a grant attempt.)
- Replace a couple of unnecessary assertion panics with
error messages.
Signed-off-by: David Teigland <teigland@redhat.com>
Diffstat (limited to 'fs/dlm/rcom.c')
-rw-r--r-- | fs/dlm/rcom.c | 23 |
1 files changed, 17 insertions, 6 deletions
diff --git a/fs/dlm/rcom.c b/fs/dlm/rcom.c index 6565fd5e28ef..64d3e2b958c7 100644 --- a/fs/dlm/rcom.c +++ b/fs/dlm/rcom.c @@ -492,30 +492,41 @@ int dlm_send_ls_not_ready(int nodeid, struct dlm_rcom *rc_in) void dlm_receive_rcom(struct dlm_ls *ls, struct dlm_rcom *rc, int nodeid) { int lock_size = sizeof(struct dlm_rcom) + sizeof(struct rcom_lock); - int stop, reply = 0; + int stop, reply = 0, lock = 0; + uint32_t status; uint64_t seq; switch (rc->rc_type) { + case DLM_RCOM_LOCK: + lock = 1; + break; + case DLM_RCOM_LOCK_REPLY: + lock = 1; + reply = 1; + break; case DLM_RCOM_STATUS_REPLY: case DLM_RCOM_NAMES_REPLY: case DLM_RCOM_LOOKUP_REPLY: - case DLM_RCOM_LOCK_REPLY: reply = 1; }; spin_lock(&ls->ls_recover_lock); + status = ls->ls_recover_status; stop = test_bit(LSFL_RECOVERY_STOP, &ls->ls_flags); seq = ls->ls_recover_seq; spin_unlock(&ls->ls_recover_lock); if ((stop && (rc->rc_type != DLM_RCOM_STATUS)) || - (reply && (rc->rc_seq_reply != seq))) { + (reply && (rc->rc_seq_reply != seq)) || + (lock && !(status & DLM_RS_DIR))) { log_limit(ls, "dlm_receive_rcom ignore msg %d " - "from %d %llu %llu seq %llu", - rc->rc_type, nodeid, + "from %d %llu %llu recover seq %llu sts %x gen %u", + rc->rc_type, + nodeid, (unsigned long long)rc->rc_seq, (unsigned long long)rc->rc_seq_reply, - (unsigned long long)seq); + (unsigned long long)seq, + status, ls->ls_generation); goto out; } |