| Commit message (Collapse) | Author | Age | Files | Lines |
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Pull block updates from Jens Axboe:
"Nothing major in here - lots of good cleanups and tech debt handling,
which is also evident in the diffstats. In particular:
- Add disk sequence numbers (Matteo)
- Discard merge fix (Ming)
- Relax disk zoned reporting restrictions (Niklas)
- Bio error handling zoned leak fix (Pavel)
- Start of proper add_disk() error handling (Luis, Christoph)
- blk crypto fix (Eric)
- Non-standard GPT location support (Dmitry)
- IO priority improvements and cleanups (Damien)o
- blk-throtl improvements (Chunguang)
- diskstats_show() stack reduction (Abd-Alrhman)
- Loop scheduler selection (Bart)
- Switch block layer to use kmap_local_page() (Christoph)
- Remove obsolete disk_name helper (Christoph)
- block_device refcounting improvements (Christoph)
- Ensure gendisk always has a request queue reference (Christoph)
- Misc fixes/cleanups (Shaokun, Oliver, Guoqing)"
* tag 'for-5.15/block-2021-08-30' of git://git.kernel.dk/linux-block: (129 commits)
sg: pass the device name to blk_trace_setup
block, bfq: cleanup the repeated declaration
blk-crypto: fix check for too-large dun_bytes
blk-zoned: allow BLKREPORTZONE without CAP_SYS_ADMIN
blk-zoned: allow zone management send operations without CAP_SYS_ADMIN
block: mark blkdev_fsync static
block: refine the disk_live check in del_gendisk
mmc: sdhci-tegra: Enable MMC_CAP2_ALT_GPT_TEGRA
mmc: block: Support alternative_gpt_sector() operation
partitions/efi: Support non-standard GPT location
block: Add alternative_gpt_sector() operation
bio: fix page leak bio_add_hw_page failure
block: remove CONFIG_DEBUG_BLOCK_EXT_DEVT
block: remove a pointless call to MINOR() in device_add_disk
null_blk: add error handling support for add_disk()
virtio_blk: add error handling support for add_disk()
block: add error handling for device_add_disk / add_disk
block: return errors from disk_alloc_events
block: return errors from blk_integrity_add
block: call blk_register_queue earlier in device_add_disk
...
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Fix a regression that passed a NULL device name to blk_trace_setup
accidentally.
Fixes: aebbb5831fbd ("sg: do not allocate a gendisk")
Reported-by: syzbot+f74aa89114a236643919@syzkaller.appspotmail.com
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Link: https://lore.kernel.org/r/20210825075438.1883687-1-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Function 'bfq_entity_to_bfqq' is declared twice, so remove the
repeated declaration and blank line.
Cc: Paolo Valente <paolo.valente@linaro.org>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Link: https://lore.kernel.org/r/1629872391-46399-1-git-send-email-zhangshaokun@hisilicon.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
dun_bytes needs to be less than or equal to the IV size of the
encryption mode, not just less than or equal to BLK_CRYPTO_MAX_IV_SIZE.
Currently this doesn't matter since blk_crypto_init_key() is never
actually passed invalid values, but we might as well fix this.
Fixes: a892c8d52c02 ("block: Inline encryption support for blk-mq")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Link: https://lore.kernel.org/r/20210825055918.51975-1-ebiggers@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
A user space process should not need the CAP_SYS_ADMIN capability set
in order to perform a BLKREPORTZONE ioctl.
Getting the zone report is required in order to get the write pointer.
Neither read() nor write() requires CAP_SYS_ADMIN, so it is reasonable
that a user space process that can read/write from/to the device, also
can get the write pointer. (Since e.g. writes have to be at the write
pointer.)
Fixes: 3ed05a987e0f ("blk-zoned: implement ioctls")
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Aravind Ramesh <aravind.ramesh@wdc.com>
Reviewed-by: Adam Manzanares <a.manzanares@samsung.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Cc: stable@vger.kernel.org # v4.10+
Link: https://lore.kernel.org/r/20210811110505.29649-3-Niklas.Cassel@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Zone management send operations (BLKRESETZONE, BLKOPENZONE, BLKCLOSEZONE
and BLKFINISHZONE) should be allowed under the same permissions as write().
(write() does not require CAP_SYS_ADMIN).
Additionally, other ioctls like BLKSECDISCARD and BLKZEROOUT only check if
the fd was successfully opened with FMODE_WRITE.
(They do not require CAP_SYS_ADMIN).
Currently, zone management send operations require both CAP_SYS_ADMIN
and that the fd was successfully opened with FMODE_WRITE.
Remove the CAP_SYS_ADMIN requirement, so that zone management send
operations match the access control requirement of write(), BLKSECDISCARD
and BLKZEROOUT.
Fixes: 3ed05a987e0f ("blk-zoned: implement ioctls")
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Aravind Ramesh <aravind.ramesh@wdc.com>
Reviewed-by: Adam Manzanares <a.manzanares@samsung.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Cc: stable@vger.kernel.org # v4.10+
Link: https://lore.kernel.org/r/20210811110505.29649-2-Niklas.Cassel@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
blkdev_fsync is only used inside of block_dev.c since the
removal of the raw drŅver.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210824151823.1575100-1-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
hidden gendisks will never be marked live.
Fixes: 40b3a52ffc5b ("block: add a sanity check for a live disk in del_gendisk")
Reported-by: Bruno Goncalves <bgoncalv@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210824144310.1487816-1-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Tegra20/30/114/124 Android devices place GPT at a non-standard location.
Enable GPT entry scanning at that location.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Link: https://lore.kernel.org/r/20210820004536.15791-5-digetx@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Support generic alternative_gpt_sector() block device operation.
It calculates location of GPT entry for eMMC of NVIDIA Tegra Android
devices. Add new MMC_CAP2_ALT_GPT_TEGRA flag that enables scanning of
alternative GPT sector and add raw_boot_mult field to mmc_ext_csd
which allows to get size of the boot partitions that is needed for
the calculation.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Link: https://lore.kernel.org/r/20210820004536.15791-4-digetx@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Support looking up GPT at a non-standard location specified by a block
device driver.
Acked-by: Davidlohr Bueso <dbueso@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Link: https://lore.kernel.org/r/20210820004536.15791-3-digetx@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Add alternative_gpt_sector() block device operation which specifies
alternative location of a GPT entry. This allows us to support Android
devices that have GPT entry at a non-standard location and can't be
repartitioned easily.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Link: https://lore.kernel.org/r/20210820004536.15791-2-digetx@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
__bio_iov_append_get_pages() doesn't put not appended pages on
bio_add_hw_page() failure, so potentially leaking them, fix it. Also, do
the same for __bio_iov_iter_get_pages(), even though it looks like it
can't be triggered by userspace in this case.
Fixes: 0512a75b98f8 ("block: Introduce REQ_OP_ZONE_APPEND")
Cc: stable@vger.kernel.org # 5.8+
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/1edfa6a2ffd66d55e6345a477df5387d2c1415d0.1626653825.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This might have been a neat debug aid when the extended dev_t was
added, but that time is long gone.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210824075216.1179406-3-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
blk_alloc_ext_minor already returns just a minor number, so no need to
mask the high bits.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210824075216.1179406-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
We never checked for errors on add_disk() as this function
returned void. Now that this is fixed, use the shiny new
error handling. The actual cleanup in case of error is
already handled by the caller of null_gendisk_register().
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20210818144542.19305-12-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
We never checked for errors on add_disk() as this function
returned void. Now that this is fixed, use the shiny new
error handling.
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20210818144542.19305-11-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Properly unwind on errors in device_add_disk. This is the initial work
as drivers are not converted yet, which will follow in separate patches.
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
[hch: major rebase. All bugs are probably mine]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20210818144542.19305-10-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Prepare for proper error handling in add_disk.
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
[hch: split from a larger patch]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20210818144542.19305-9-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Prepare for proper error handling in add_disk.
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
[hch: split from a larger patch]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20210818144542.19305-8-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Ensure that all the sysfs bits are set up before bdev_add is called,
as that will make the upcomding error handling much easier. However
this means the call to disk_update_readahead has to be split as that
requires a bdi. Also remove various sanity checks that don't make
sense now that blk_register_queue only has a single caller.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210818144542.19305-7-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Doing all the sysfs file creation before adding the bdev and thus
allowing it to be opened will simplify the about to be added error
handling.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20210818144542.19305-6-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This will simplify error handling going forward.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20210818144542.19305-5-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Once bdev_add is called userspace can open the block device. Ensure
that the struct device, which is used for refcounting of the disk
besides various other things, is fully setup at that point.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20210818144542.19305-4-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
There is no real reason these should be separate. Also simplify the
groups assignment a bit.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210818144542.19305-3-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Add a sanity check to del_gendisk to do nothing when the disk wasn't
successfully added. This papers over the complete lack of add_disk
error handling, which is about to get fixed gradually.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20210818144542.19305-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Replace the magic lookup through the kobject tree with an explicit
backpointer, given that the device model links are set up and torn
down at times when I/O is still possible, leading to potential
NULL or invalid pointer dereferences.
Fixes: edb0872f44ec ("block: move the bdi from the request_queue to the gendisk")
Reported-by: syzbot <syzbot+aa0801b6b32dca9dda82@syzkaller.appspotmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Sven Schnelle <svens@linux.ibm.com>
Link: https://lore.kernel.org/r/20210816134624.GA24234@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Acquire the queue ref dropped in disk_release in __blk_alloc_disk so any
allocate gendisk always has a queue reference.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210816131910.615153-9-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Pass in a request_queue and assign disk->queue in __blk_alloc_disk to
ensure struct gendisk always has a valid ->queue pointer.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210816131910.615153-8-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This was a leftover from the legacy alloc_disk interface. Switch
the scsi ULPs and dasd to set ->minors directly like all other
drivers and remove the argument.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Stefan Haberland <sth@linux.ibm.com> [dasd]
Link: https://lore.kernel.org/r/20210816131910.615153-7-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Most drivers should use and have been converted to use blk_alloc_disk
and blk_mq_alloc_disk. Only the scsi ULPs and dasd still allocate
a disk separately from the request_queue, so don't bother with
convenience macros for something that should not see significant
new users and remove these wrappers.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210816131910.615153-6-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Pass the lockdep name to the low-level __blk_alloc_disk helper and
hardcode the name for it given that the number of minors or node_id
are not very useful information. While this passes a pointless
argument for non-lockdep builds that is not really an issue as
disk allocation is a probe time only slow path.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210816131910.615153-5-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
sg is a character driver and thus does not need to allocate a gendisk,
which is only used for file system-like block layer I/O on block
devices.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Link: https://lore.kernel.org/r/20210816131910.615153-4-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
st is a character driver and thus does not need to allocate a gendisk,
which is only used for file system-like block layer I/O on block
devices.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210816131910.615153-3-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Switch to use the blk_mq_alloc_disk helper for allocating the
request_queue and gendisk.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Link: https://lore.kernel.org/r/20210816131910.615153-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This essentially reverts "block: remove the extra kobject reference in
bd_link_disk_holder". That commit dropped the extra reference because
the condition in the comment can't be true. But it turns out that
comment did not actually describe the problematic situation, so add
back the extra reference and document it properly.
Fixes: fbd9a39542ec ("block: remove the extra kobject reference in bd_link_disk_holder")
Reported-by: Tushar Sugandhi <tusharsu@linux.microsoft.com>
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The default IO priority is the best effort (BE) class with the
normal priority level IOPRIO_NORM (4). However, get_task_ioprio()
returns IOPRIO_CLASS_NONE/IOPRIO_NORM as the default priority and
get_current_ioprio() returns IOPRIO_CLASS_NONE/0. Let's be consistent
with the defined default and have both of these functions return the
default priority IOPRIO_PRIO_VALUE(IOPRIO_CLASS_BE, IOPRIO_NORM) when
the user did not define another default IO priority for the task.
In include/uapi/linux/ioprio.h, introduce the IOPRIO_BE_NORM macro as
an alias to IOPRIO_NORM to clarify that this default level applies to
the BE priotity class. In include/linux/ioprio.h, define the macro
IOPRIO_DEFAULT as IOPRIO_PRIO_VALUE(IOPRIO_CLASS_BE, IOPRIO_BE_NORM)
and use this new macro when setting a priority to the default.
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Link: https://lore.kernel.org/r/20210811033702.368488-7-damien.lemoal@wdc.com
[axboe: drop unnecessary lightnvm change]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The BFQ scheduler and ioprio_check_cap() both assume that the RT
priority class (IOPRIO_CLASS_RT) can have up to 8 different priority
levels, similarly to the BE class (IOPRIO_CLASS_iBE). This is
controlled using the IOPRIO_BE_NR macro , which is badly named as the
number of levels also applies to the RT class.
Introduce the class independent IOPRIO_NR_LEVELS macro, defined to 8,
to make things clear. Keep the old IOPRIO_BE_NR macro definition as an
alias for IOPRIO_NR_LEVELS.
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Link: https://lore.kernel.org/r/20210811033702.368488-6-damien.lemoal@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The ki_ioprio field of struct kiocb is 16-bits (u16) but often handled
as an int in the block layer. E.g. ioprio_check_cap() takes an int as
argument.
With such implicit int casting function calls, the upper 16-bits of the
int argument may be left uninitialized by the compiler, resulting in
invalid values for the IOPRIO_PRIO_CLASS() macro (garbage upper bits)
and in an error return for functions such as ioprio_check_cap().
Fix this by masking the result of the shift by IOPRIO_CLASS_SHIFT bits
in the IOPRIO_PRIO_CLASS() macro. The new macro IOPRIO_CLASS_MASK
defines the 3-bits mask for the priority class.
Similarly, apply the IOPRIO_PRIO_MASK mask to the data argument of the
IOPRIO_PRIO_VALUE() macro to ignore the upper bits of the data value.
The IOPRIO_CLASS_MASK mask is also applied to the class argument of this
macro before shifting the result by IOPRIO_CLASS_SHIFT bits.
While at it, also change the argument name of the IOPRIO_PRIO_CLASS()
and IOPRIO_PRIO_DATA() macros from "mask" to "ioprio" to reflect the
fact that a priority value should be passed rather than a mask.
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Link: https://lore.kernel.org/r/20210811033702.368488-5-damien.lemoal@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Change the ioprio_valid() macro in include/usapi/linux/ioprio.h to an
inline function declared on the kernel side in include/linux/ioprio.h.
Also improve checks on the class value by checking the upper bound
value.
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Link: https://lore.kernel.org/r/20210811033702.368488-4-damien.lemoal@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
In include/usapi/linux/ioprio.h, change the ioprio class enum comment
to remove the outdated reference to CFQ and mention BFQ and mq-deadline
instead. Also document the high priority NCQ command use for RT class
IOs directed at ATA drives that support NCQ priority.
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Link: https://lore.kernel.org/r/20210811033702.368488-3-damien.lemoal@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
For a request that has a priority level equal to or larger than
IOPRIO_BE_NR, bfq_set_next_ioprio_data() prints a critical warning but
defaults to setting the request new_ioprio field to IOPRIO_BE_NR. This
is not consistent with the warning and the allowed values for priority
levels. Fix this by setting the request new_ioprio field to
IOPRIO_BE_NR - 1, the lowest priority level allowed.
Cc: <stable@vger.kernel.org>
Fixes: aee69d78dec0 ("block, bfq: introduce the BFQ-v0 I/O scheduler as an extra scheduler")
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20210811033702.368488-2-damien.lemoal@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
| |
| |
| |
| |
| |
| |
| |
| | |
Not actually used in any modular code.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210816123649.601591-1-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
seq_get_buf is a crutch that undoes all the memory safety of the
seq_file interface. Use the normal seq_printf interfaces instead.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20210810152623.1796144-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Factor out a helper to deal with a single blkcg_gq to make the code a
little bit easier to follow.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20210810152623.1796144-1-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Use bvec_virt instead of open coding it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Link: https://lore.kernel.org/r/20210804095634.460779-16-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
| |
| |
| |
| |
| |
| |
| |
| | |
Use bvec_virt instead of open coding it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210804095634.460779-15-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Use bvec_virt instead of open coding it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Stefan Haberland <sth@linux.ibm.com>
Link: https://lore.kernel.org/r/20210804095634.460779-14-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
| |
| |
| |
| |
| |
| |
| |
| | |
Use bvec_virt instead of open coding it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210804095634.460779-13-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Use bvec_virt instead of open coding it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Link: https://lore.kernel.org/r/20210804095634.460779-12-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|