author: Tejun Heo <tj@kernel.org> 2010-07-21 00:18:07 +0200
committer: Al Viro <viro@zeniv.linux.org.uk> 2010-08-09 22:48:59 +0200
commit: 4f331f01b9c43bf001d3ffee578a97a1e0633eac
tree: 77cd690ab7af2624e3fd7932563f6dc0f5d6441a /fs/super.c
parent: sysv: do not mark superblock dirty on remount
vfs: don't hold s_umount over close_bdev_exclusive() call

Fix an obscure AB-BA deadlock in get_sb_bdev().

When a superblock is mounted more than once get_sb_bdev() calls close_bdev_exclusive() to drop the extra bdev reference while holding s_umount. However, sb->s_umount nests inside bd_mutex during __invalidate_device() and close_bdev_exclusive() acquires bd_mutex during blkdev_put(); thus creating an AB-BA deadlock.

This condition doesn't trigger frequently. For this condition to be visible to lockdep, the filesystem must occupy the whole device (as __invalidate_device() only grabs bd_mutex for the whole device), the FS must be mounted more than once and partition rescan should be issued while the FS is still mounted.

Fix it by dropping s_umount over close_bdev_exclusive().

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Ciprian Docan <docan@eden.rutgers.edu>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Diffstat (limited to 'fs/super.c')
-rw-r--r-- fs/super.c 9
1 files changed, 9 insertions, 0 deletions
diff --git a/fs/super.c b/fs/super.c
index 938119ab8dcb..3479ca6f005f 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -773,7 +773,16 @@ int get_sb_bdev(struct file_system_type *fs_type,
 			goto error_bdev;
 		}
+		/*
+		 * s_umount nests inside bd_mutex during
+		 * __invalidate_device(). close_bdev_exclusive()
+		 * acquires bd_mutex and can't be called under
+		 * s_umount. Drop s_umount temporarily. This is safe
+		 * as we're holding an active reference.
+		 */
+		up_write(&s->s_umount);
 		close_bdev_exclusive(bdev, mode);
+		down_write(&s->s_umount);
 	} else {
 		char b[BDEVNAME_SIZE];
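
The lock-ordering argument in the commit message can be illustrated with a small user-space model. The sketch below is not kernel code and takes no real locks. It is a single-threaded, lockdep-style order tracker that records which lock was held when another was acquired and flags an inversion when the opposite order is seen later. All names in it (`order_tracker`, `tracker_acquire`, `model_rescan`, `model_second_mount`, `has_inversion`) are hypothetical and exist only for this illustration. `model_rescan` stands in for the `__invalidate_device()` ordering (`bd_mutex`, then `s_umount`), and `model_second_mount` stands in for the `get_sb_bdev()` path around `close_bdev_exclusive()`.

```c
#include <assert.h>
#include <stdbool.h>

/* The two locks named in the commit message. */
enum lock_id { BD_MUTEX, S_UMOUNT, NR_LOCKS };

/* Hypothetical lockdep-like tracker: remembers observed acquisition orders. */
struct order_tracker {
	bool held[NR_LOCKS];
	bool before[NR_LOCKS][NR_LOCKS]; /* before[a][b]: a was held when b was taken */
	bool inversion;                  /* set once both a->b and b->a have been seen */
};

static void tracker_acquire(struct order_tracker *t, enum lock_id l)
{
	for (int h = 0; h < NR_LOCKS; h++) {
		if (!t->held[h] || h == (int)l)
			continue;
		if (t->before[l][h])     /* the opposite order was recorded earlier */
			t->inversion = true;
		t->before[h][l] = true;
	}
	t->held[l] = true;
}

static void tracker_release(struct order_tracker *t, enum lock_id l)
{
	assert(t->held[l]);          /* releasing a lock that is not held is a model bug */
	t->held[l] = false;
}

/* Models the partition-rescan ordering: bd_mutex (A), then s_umount (B). */
static void model_rescan(struct order_tracker *t)
{
	tracker_acquire(t, BD_MUTEX);
	tracker_acquire(t, S_UMOUNT);
	tracker_release(t, S_UMOUNT);
	tracker_release(t, BD_MUTEX);
}

/*
 * Models the second-mount path. With drop_umount == false it takes
 * bd_mutex while s_umount is held (B then A, the pre-patch behaviour).
 * With drop_umount == true it releases s_umount first and re-takes it
 * afterwards, as the patch does.
 */
static void model_second_mount(struct order_tracker *t, bool drop_umount)
{
	tracker_acquire(t, S_UMOUNT);
	if (drop_umount)
		tracker_release(t, S_UMOUNT);
	tracker_acquire(t, BD_MUTEX);    /* stands in for close_bdev_exclusive() */
	tracker_release(t, BD_MUTEX);
	if (drop_umount)
		tracker_acquire(t, S_UMOUNT);
	tracker_release(t, S_UMOUNT);
}

/* Returns true if running both paths records an AB-BA ordering. */
static bool has_inversion(bool drop_umount)
{
	struct order_tracker t = {0};

	model_rescan(&t);
	model_second_mount(&t, drop_umount);
	return t.inversion;
}
```

In this model, `has_inversion(false)` reports the AB-BA ordering and `has_inversion(true)` does not, which mirrors why dropping `s_umount` around the call removes the inverted dependency. The model says nothing about whether dropping the lock is safe. In the patch that rests on the active reference held on the superblock, as the added comment states.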