summaryrefslogtreecommitdiffstats
path: root/block/blk-mq-sched.c
diff options
context:
space:
mode:
authorJinlong Chen <nickyc975@zju.edu.cn>2022-10-20 08:48:19 +0200
committerJens Axboe <axboe@kernel.dk>2022-10-24 02:59:17 +0200
commit8ed40ee35d94df2fdb56bbbc07e17dffd2383625 (patch)
treea75d555f4c1dad669ad055e21136ae6e79ccfe97 /block/blk-mq-sched.c
parentblock: check for an unchanged elevator earlier in __elevator_change (diff)
downloadlinux-8ed40ee35d94df2fdb56bbbc07e17dffd2383625.tar.xz
linux-8ed40ee35d94df2fdb56bbbc07e17dffd2383625.zip
block: fix up elevator_type refcounting
The current reference management logic of io scheduler modules contains refcnt problems. For example, blk_mq_init_sched may fail before or after the calling of e->ops.init_sched. If it fails before the calling, it does nothing to the reference to the io scheduler module. But if it fails after the calling, it releases the reference by calling kobject_put(&eq->kobj). As the callers of blk_mq_init_sched can't know exactly where the failure happens, they can't handle the reference to the io scheduler module properly: releasing the reference on failure results in double-release if blk_mq_init_sched has released it, and not releasing the reference results in ghost reference if blk_mq_init_sched did not release it either. The same problem also exists in io schedulers' init_sched implementations. We can address the problem by adding releasing statements to the error handling procedures of blk_mq_init_sched and init_sched implementations. But that is counterintuitive and requires modifications to existing io schedulers. Instead, We make elevator_alloc get the io scheduler module references that will be released by elevator_release. And then, we match each elevator_get with an elevator_put. Therefore, each reference to an io scheduler module explicitly has its own getter and releaser, and we no longer need to worry about the refcnt problems. The bugs and the patch can be validated with tools here: https://github.com/nickyc975/linux_elv_refcnt_bug.git [hch: split out a few bits into separate patches, use a non-try module_get in elevator_alloc] Signed-off-by: Jinlong Chen <nickyc975@zju.edu.cn> Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20221020064819.1469928-5-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
Diffstat (limited to 'block/blk-mq-sched.c')
-rw-r--r--block/blk-mq-sched.c1
1 files changed, 1 insertions, 0 deletions
diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index a4f7c101b53b..68227240fdea 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -555,6 +555,7 @@ static int blk_mq_init_sched_shared_tags(struct request_queue *queue)
return 0;
}
+/* caller must have a reference to @e, will grab another one if successful */
int blk_mq_init_sched(struct request_queue *q, struct elevator_type *e)
{
unsigned int flags = q->tag_set->flags;