path: root/samples
2023-06-01  blk-ioc: protect ioc_destroy_icq() by 'queue_lock'  (Yu Kuai, 1 file, -17/+13)
Currently, icq is tracked by both request_queue (icq->q_node) and task (icq->ioc_node), and ioc_clear_queue() from elevator exit is not safe because it can access the list without protection:

    ioc_clear_queue                         ioc_release_fn
     lock queue_lock
     list_splice
     /* move queue list to a local list */
     unlock queue_lock
     /*
      * lock is released, the local list
      * can be accessed through task exit.
      */
                                             lock ioc->lock
                                             while (!hlist_empty)
                                              icq = hlist_entry
                                              lock queue_lock
                                               ioc_destroy_icq
                                                delete icq->ioc_node
     while (!list_empty)
      icq = list_entry()
      list_del icq->q_node
      /*
       * This is not protected by any lock,
       * list_entry concurrent with list_del
       * is not safe.
       */
                                               unlock queue_lock
                                             unlock ioc->lock

Fix this problem by protecting list 'icq->q_node' by queue_lock from ioc_clear_queue(). Reported-and-tested-by: Pradeep Pragallapati <quic_pragalla@quicinc.com> Link: https://lore.kernel.org/lkml/20230517084434.18932-1-quic_pragalla@quicinc.com/ Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20230531073435.2923422-1-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-01  block: mark bio_add_folio as __must_check  (Johannes Thumshirn, 1 file, -1/+2)
Now that all callers of bio_add_folio() check the return value, mark it as __must_check. Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/381360a45ac3684120cfbe1e07685e9c36256e47.1685532726.git.johannes.thumshirn@wdc.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
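With the annotation in place, callers have to consume the result; the shape is roughly the following (a minimal sketch with hypothetical names, not code from the patch; bio_add_folio() returns true on success):

    #include <linux/bio.h>
    #include <linux/mm.h>

    /* Hypothetical helper: try to add one folio to a possibly-full bio. */
    static int example_add_folio_checked(struct bio *bio, struct folio *folio)
    {
            if (!bio_add_folio(bio, folio, folio_size(folio), 0))
                    return -EIO;    /* caller submits this bio and retries with a new one */
            return 0;
    }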
2023-06-01  fs: iomap: use bio_add_folio_nofail where possible  (Johannes Thumshirn, 1 file, -3/+3)
When the iomap buffered-io code can't add a folio to a bio, it allocates a new bio and adds the folio to that one. This is done using bio_add_folio(), but doesn't check for errors. As adding a folio to a newly created bio can't fail, use the newly introduced bio_add_folio_nofail() function. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/58fa893c24c67340a63323f09a179fefdca07f2a.1685532726.git.johannes.thumshirn@wdc.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
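The resulting pattern looks roughly like this (a sketch with hypothetical names and simplified setup; the real iomap code carries more state than shown here):

    #include <linux/bio.h>
    #include <linux/mm.h>

    /*
     * If the folio does not fit into the current bio, submit that bio and
     * continue with a fresh one; adding to a new, empty bio cannot fail,
     * which is what bio_add_folio_nofail() expresses.
     */
    static struct bio *example_append_folio(struct bio *bio, struct block_device *bdev,
                                            struct folio *folio, blk_opf_t opf)
    {
            if (!bio_add_folio(bio, folio, folio_size(folio), 0)) {
                    submit_bio(bio);
                    bio = bio_alloc(bdev, 1, opf, GFP_NOFS);
                    bio_add_folio_nofail(bio, folio, folio_size(folio), 0);
            }
            return bio;
    }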
2023-06-01  block: add bio_add_folio_nofail  (Johannes Thumshirn, 2 files, -0/+10)
Just like for bio_add_page(), add a no-fail variant of bio_add_folio(). Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/924dff4077812804398ef84128fb920507fa4be1.1685532726.git.johannes.thumshirn@wdc.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
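One way to picture the semantics of the new helper (a conceptual sketch only, not necessarily how the kernel implements it):

    #include <linux/bio.h>

    /* No-fail folio add: the caller guarantees the bio has room, so a
     * failure would be a programming error worth warning about. */
    static void example_bio_add_folio_nofail(struct bio *bio, struct folio *folio,
                                             size_t len, size_t off)
    {
            WARN_ON_ONCE(!bio_add_folio(bio, folio, len, off));
    }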
2023-06-01  block: mark bio_add_page as __must_check  (Johannes Thumshirn, 1 file, -1/+2)
Now that all users of bio_add_page check for the return value, mark bio_add_page as __must_check. Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/7ae4a902e08fe2e90c012ee07aeb35d4aae28373.1685532726.git.johannes.thumshirn@wdc.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
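bio_add_page() returns the number of bytes added, so the checked pattern that __must_check now enforces looks roughly like this (hypothetical caller):

    #include <linux/bio.h>

    static int example_add_page_checked(struct bio *bio, struct page *page,
                                        unsigned int len, unsigned int off)
    {
            if (bio_add_page(bio, page, len, off) != len)
                    return -EIO;    /* bio full or request-queue limits exceeded */
            return 0;
    }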
2023-06-01  dm-crypt: use __bio_add_page to add single page to clone bio  (Johannes Thumshirn, 1 file, -2/+1)
crypt_alloc_buffer() already allocates enough entries in the clone bio's vector, so adding a page to the bio can't fail. Use __bio_add_page() to reflect this. Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/f9a4dee5e81389fd70ffc442da01006538e55aca.1685532726.git.johannes.thumshirn@wdc.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
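The general shape of this class of conversion, sketched with hypothetical names (the bio is sized for exactly one page, so the unchecked add documents an invariant rather than ignoring an error):

    #include <linux/bio.h>

    static struct bio *example_single_page_bio(struct block_device *bdev,
                                               struct page *page)
    {
            struct bio *bio = bio_alloc(bdev, 1, REQ_OP_WRITE, GFP_NOIO);

            /* One vector was reserved above, so this cannot fail. */
            __bio_add_page(bio, page, PAGE_SIZE, 0);
            return bio;
    }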
2023-06-01  md: raid1: check if adding pages to resync bio fails  (Johannes Thumshirn, 2 files, -15/+16)
Check whether adding pages to the resync bio fails and, if so, bail out. As the comment above suggests this cannot happen, WARN if it actually does. Technically __bio_add_page() would be sufficient here, but asserting that the pages actually get added to the bio is preferred. This way we can mark bio_add_page() as __must_check. Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Acked-by: Song Liu <song@kernel.org> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/33aea4c271220dc9bcab58c4b7bec478c1511142.1685532726.git.johannes.thumshirn@wdc.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-01  md: raid1: use __bio_add_page for adding single page to bio  (Johannes Thumshirn, 1 file, -1/+1)
The sync request code uses bio_add_page() to add a page to a newly created bio. bio_add_page() can fail, but the return value is never checked. Use __bio_add_page() as adding a single page to a newly created bio is guaranteed to succeed. This brings us a step closer to marking bio_add_page() as __must_check. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Acked-by: Song Liu <song@kernel.org> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/6cf7f66c6e646231200d025dfd5f2d3ae75c8fe5.1685532726.git.johannes.thumshirn@wdc.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-01  md: check for failure when adding pages in alloc_behind_master_bio  (Johannes Thumshirn, 1 file, -1/+4)
alloc_behind_master_bio() can add multiple pages to a bio, but it does not check the return value of bio_add_page() to see whether adding really succeeded. Check whether adding the page succeeded and, if not, bail out. Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Song Liu <song@kernel.org> Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/827aa12d44ebf3f50b41b47f5cedc0f80179f2c1.1685532726.git.johannes.thumshirn@wdc.com [axboe: fold in s/free_page/put_page fix] Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-05-31  floppy: use __bio_add_page for adding single page to bio  (Johannes Thumshirn, 1 file, -1/+1)
The floppy code uses bio_add_page() to add a page to a newly created bio. bio_add_page() can fail, but the return value is never checked. Use __bio_add_page() as adding a single page to a newly created bio is guaranteed to succeed. This brings us a step closer to marking bio_add_page() as __must_check. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/33c445a3b431270c72d9be03d5da1b08ae983920.1685532726.git.johannes.thumshirn@wdc.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-05-31  zram: use __bio_add_page for adding single page to bio  (Johannes Thumshirn, 1 file, -1/+1)
The zram writeback code uses bio_add_page() to add a page to a newly created bio. bio_add_page() can fail, but the return value is never checked. Use __bio_add_page() as adding a single page to a newly created bio is guaranteed to succeed. This brings us a step closer to marking bio_add_page() as __must_check. Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/cfd141dd7773315879a126f2aa81b7f698bc0e10.1685532726.git.johannes.thumshirn@wdc.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-05-31  zonefs: use __bio_add_page for adding single page to bio  (Johannes Thumshirn, 1 file, -1/+1)
The zonefs superblock reading code uses bio_add_page() to add a page to a newly created bio. bio_add_page() can fail, but the return value is never checked. Use __bio_add_page() as adding a single page to a newly created bio is guaranteed to succeed. This brings us a step closer to marking bio_add_page() as __must_check. Acked-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/04c9978ccaa0fc9871cd4248356638d98daccf0c.1685532726.git.johannes.thumshirn@wdc.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-05-31  gfs2: use __bio_add_page for adding single page to bio  (Johannes Thumshirn, 1 file, -1/+1)
The GFS2 superblock reading code uses bio_add_page() to add a page to a newly created bio. bio_add_page() can fail, but the return value is never checked. Use __bio_add_page() as adding a single page to a newly created bio is guaranteed to succeed. This brings us a step closer to marking bio_add_page() as __must_check. Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/087c67d4e4973f949d3519c1e4822784ce583c5a.1685532726.git.johannes.thumshirn@wdc.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-05-31  jfs: logmgr: use __bio_add_page to add single page to bio  (Johannes Thumshirn, 1 file, -2/+2)
The JFS IO code uses bio_add_page() to add a page to a newly created bio. bio_add_page() can fail, but the return value is never checked. Use __bio_add_page() as adding a single page to a newly created bio is guaranteed to succeed. This brings us a step closer to marking bio_add_page() as __must_check. Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Acked-by: Dave Kleikamp <dave.kleikamp@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/9fb5ed86d19f6e0b6f64dfc4109a48ff8ff24497.1685532726.git.johannes.thumshirn@wdc.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-05-31  md: raid5: use __bio_add_page to add single page to new bio  (Johannes Thumshirn, 1 file, -2/+2)
The raid5-ppl submission code uses bio_add_page() to add a page to a newly created bio. bio_add_page() can fail, but the return value is never checked. For adding consecutive pages, the return is actually checked and a new bio is allocated if adding the page fails. Use __bio_add_page() as adding a single page to a newly created bio is guaranteed to succeed. This brings us a step closer to marking bio_add_page() as __must_check. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Acked-by: Song Liu <song@kernel.org> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/27e6bcd762354bff74602e89159cdd12ae3d1fa9.1685532726.git.johannes.thumshirn@wdc.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-05-31  md: raid5-log: use __bio_add_page to add single page  (Johannes Thumshirn, 1 file, -1/+1)
The raid5 log metadata submission code uses bio_add_page() to add a page to a newly created bio. bio_add_page() can fail, but the return value is never checked. Use __bio_add_page() as adding a single page to a newly created bio is guaranteed to succeed. This brings us a step closer to marking bio_add_page() as __must_check. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Acked-by: Song Liu <song@kernel.org> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/832a810d6c9e71f88b0a39cb076a8c70e8bcb821.1685532726.git.johannes.thumshirn@wdc.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-05-31  md: use __bio_add_page to add single page  (Johannes Thumshirn, 1 file, -2/+2)
The md-raid superblock writing code uses bio_add_page() to add a page to a newly created bio. bio_add_page() can fail, but the return value is never checked. Use __bio_add_page() as adding a single page to a newly created bio is guaranteed to succeed. This brings us a step closer to marking bio_add_page() as __must_check. Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/ca196f5e650e318106dbb4496eb6cbac4bc800bd.1685532726.git.johannes.thumshirn@wdc.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-05-31  fs: buffer: use __bio_add_page to add single page to bio  (Johannes Thumshirn, 1 file, -2/+1)
The buffer_head submission code uses bio_add_page() to add a page to a newly created bio. bio_add_page() can fail, but the return value is never checked. Use __bio_add_page() as adding a single page to a newly created bio is guaranteed to succeed. This brings us a step closer to marking bio_add_page() as __must_check. Reviewed-by: Gou Hao <gouhao@uniontech.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/84ff2dcbe81b258a73ad900adb5266e208b61a4d.1685532726.git.johannes.thumshirn@wdc.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-05-31  dm: dm-zoned: use __bio_add_page for adding single metadata page  (Johannes Thumshirn, 1 file, -3/+3)
dm-zoned uses bio_add_page() for adding a single page to a freshly created metadata bio. Use __bio_add_page() instead as adding a single page to a new bio is always guaranteed to succeed. This brings us a step closer to marking bio_add_page() as __must_check. Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/55a0c8dad7550379647873b579dc7cfbe0191f96.1685532726.git.johannes.thumshirn@wdc.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-05-31  drbd: use __bio_add_page to add page to bio  (Johannes Thumshirn, 1 file, -3/+1)
The drbd code only adds a single page to a newly created bio. So use __bio_add_page() to add the page which is guaranteed to succeed in this case. This brings us closer to marking bio_add_page() as __must_check. Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/435007afac14f3766455559059d21843771fae53.1685532726.git.johannes.thumshirn@wdc.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-05-31  swap: use __bio_add_page to add page to bio  (Johannes Thumshirn, 1 file, -4/+4)
The swap code only adds a single page to a newly created bio. So use __bio_add_page() to add the page which is guaranteed to succeed in this case. This brings us closer to marking bio_add_page() as __must_check. Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/5bdafd9de806b2dab92302b30eb7a3a5f10c37d9.1685532726.git.johannes.thumshirn@wdc.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-05-31  block: Use iov_iter_extract_pages() and page pinning in direct-io.c  (David Howells, 1 file, -29/+43)
Change the old block-based direct-I/O code to use iov_iter_extract_pages() to pin user pages or leave kernel pages unpinned rather than taking refs when submitting bios. This makes use of the preceding patches to not take pins on the zero page (thereby allowing insertion of zero pages in with pinned pages) and to get additional pins on pages, allowing an extracted page to be used in multiple bios without having to re-extract it. Signed-off-by: David Howells <dhowells@redhat.com> cc: Christoph Hellwig <hch@infradead.org> cc: David Hildenbrand <david@redhat.com> cc: Lorenzo Stoakes <lstoakes@gmail.com> cc: Andrew Morton <akpm@linux-foundation.org> cc: Jens Axboe <axboe@kernel.dk> cc: Al Viro <viro@zeniv.linux.org.uk> cc: Matthew Wilcox <willy@infradead.org> cc: Jan Kara <jack@suse.cz> cc: Jeff Layton <jlayton@kernel.org> cc: Jason Gunthorpe <jgg@nvidia.com> cc: Logan Gunthorpe <logang@deltatee.com> cc: Hillf Danton <hdanton@sina.com> cc: Christian Brauner <brauner@kernel.org> cc: Linus Torvalds <torvalds@linux-foundation.org> cc: linux-fsdevel@vger.kernel.org cc: linux-block@vger.kernel.org cc: linux-kernel@vger.kernel.org cc: linux-mm@kvack.org Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20230526214142.958751-4-dhowells@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
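A rough sketch of the extract/release pairing this moves to (hypothetical helper names; signatures as in the iov_iter extraction API, with the DIO bookkeeping omitted):

    #include <linux/uio.h>
    #include <linux/mm.h>
    #include <linux/limits.h>

    /* Pages from a user-backed iterator come back pinned (FOLL_PIN);
     * kernel-backed iterators are left untouched. */
    static ssize_t example_extract(struct iov_iter *iter, struct page **pages,
                                   unsigned int maxpages, size_t *poff)
    {
            /* 'pages' is a preallocated array of at least 'maxpages' entries. */
            return iov_iter_extract_pages(iter, &pages, SIZE_MAX, maxpages, 0, poff);
    }

    static void example_release(struct iov_iter *iter, struct page **pages,
                                unsigned long npages)
    {
            if (iov_iter_extract_will_pin(iter))
                    unpin_user_pages(pages, npages);
    }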
2023-05-31  mm: Provide a function to get an additional pin on a page  (David Howells, 2 files, -0/+28)
Provide a function to get an additional pin on a page that we already have a pin on. This will be used in fs/direct-io.c when dispatching multiple bios to a page we've extracted from a user-backed iter rather than redoing the extraction. Signed-off-by: David Howells <dhowells@redhat.com> cc: Christoph Hellwig <hch@infradead.org> cc: David Hildenbrand <david@redhat.com> cc: Lorenzo Stoakes <lstoakes@gmail.com> cc: Andrew Morton <akpm@linux-foundation.org> cc: Jens Axboe <axboe@kernel.dk> cc: Al Viro <viro@zeniv.linux.org.uk> cc: Matthew Wilcox <willy@infradead.org> cc: Jan Kara <jack@suse.cz> cc: Jeff Layton <jlayton@kernel.org> cc: Jason Gunthorpe <jgg@nvidia.com> cc: Logan Gunthorpe <logang@deltatee.com> cc: Hillf Danton <hdanton@sina.com> cc: Christian Brauner <brauner@kernel.org> cc: Linus Torvalds <torvalds@linux-foundation.org> cc: linux-fsdevel@vger.kernel.org cc: linux-block@vger.kernel.org cc: linux-kernel@vger.kernel.org cc: linux-mm@kvack.org Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20230526214142.958751-3-dhowells@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-05-31  mm: Don't pin ZERO_PAGE in pin_user_pages()  (David Howells, 3 files, -3/+60)
Make pin_user_pages*() leave a ZERO_PAGE unpinned if it extracts a pointer to it from the page tables and make unpin_user_page*() correspondingly ignore a ZERO_PAGE when unpinning. We don't want to risk overrunning a zero page's refcount as we're only allowed ~2 million pins on it - something that userspace can conceivably trigger. Add a pair of functions to test whether a page or a folio is a ZERO_PAGE. Signed-off-by: David Howells <dhowells@redhat.com> cc: Christoph Hellwig <hch@infradead.org> cc: David Hildenbrand <david@redhat.com> cc: Lorenzo Stoakes <lstoakes@gmail.com> cc: Andrew Morton <akpm@linux-foundation.org> cc: Jens Axboe <axboe@kernel.dk> cc: Al Viro <viro@zeniv.linux.org.uk> cc: Matthew Wilcox <willy@infradead.org> cc: Jan Kara <jack@suse.cz> cc: Jeff Layton <jlayton@kernel.org> cc: Jason Gunthorpe <jgg@nvidia.com> cc: Logan Gunthorpe <logang@deltatee.com> cc: Hillf Danton <hdanton@sina.com> cc: Christian Brauner <brauner@kernel.org> cc: Linus Torvalds <torvalds@linux-foundation.org> cc: linux-fsdevel@vger.kernel.org cc: linux-block@vger.kernel.org cc: linux-kernel@vger.kernel.org cc: linux-mm@kvack.org Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20230526214142.958751-2-dhowells@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
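Conceptually the pin-accounting paths gain a guard of this shape (a sketch; the helper names is_zero_page()/is_zero_folio() are assumed here, as the commit text does not spell them out):

    #include <linux/mm.h>

    /* Skip pin accounting entirely for the shared zero page/folio. */
    static bool example_should_account_pin(struct folio *folio)
    {
            return !is_zero_folio(folio);   /* assumed helper name */
    }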
2023-05-30  block: constify the whole_disk device_attribute  (Thomas Weißschuh, 1 file, -1/+1)
The struct is never modified so it can be const. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lore.kernel.org/r/20230419-const-partition-v3-4-4e14e48be367@weissschuh.net Signed-off-by: Jens Axboe <axboe@kernel.dk>
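The pattern is plain constification of an attribute that is only ever read (an illustrative, hypothetical attribute rather than the actual block/ code):

    #include <linux/device.h>
    #include <linux/sysfs.h>

    static ssize_t example_show(struct device *dev, struct device_attribute *attr,
                                char *buf)
    {
            return sysfs_emit(buf, "1\n");
    }

    /* const lets the definition live in rodata; the sysfs core only reads it. */
    static const struct device_attribute example_attr = __ATTR_RO(example);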
2023-05-30  block: constify struct part_attr_group  (Thomas Weißschuh, 1 file, -1/+1)
The struct is never modified so it can be const. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lore.kernel.org/r/20230419-const-partition-v3-3-4e14e48be367@weissschuh.net Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-05-30  block: constify struct part_type part_type  (Thomas Weißschuh, 2 files, -2/+2)
The struct is never modified so it can be const. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lore.kernel.org/r/20230419-const-partition-v3-2-4e14e48be367@weissschuh.net Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-05-30  block: constify partition prober array  (Thomas Weißschuh, 1 file, -1/+1)
The array is never modified so it can be const. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202304191640.SkNk7kVN-lkp@intel.com/ Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lore.kernel.org/r/20230419-const-partition-v3-1-4e14e48be367@weissschuh.net Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-05-24  block: convert bio_map_user_iov to use iov_iter_extract_pages  (David Howells, 1 file, -12/+11)
This will pin pages or leave them unaltered rather than getting a ref on them as appropriate to the iterator. The pages need to be pinned for DIO rather than having refs taken on them to prevent VM copy-on-write from malfunctioning during a concurrent fork() (the result of the I/O could otherwise end up being visible to/affected by the child process). Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: John Hubbard <jhubbard@nvidia.com> cc: Al Viro <viro@zeniv.linux.org.uk> cc: Jens Axboe <axboe@kernel.dk> cc: Jan Kara <jack@suse.cz> cc: Matthew Wilcox <willy@infradead.org> cc: Logan Gunthorpe <logang@deltatee.com> cc: linux-block@vger.kernel.org Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230522205744.2825689-7-dhowells@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-05-24  block: Convert bio_iov_iter_get_pages to use iov_iter_extract_pages  (David Howells, 1 file, -11/+12)
This will pin pages or leave them unaltered rather than getting a ref on them as appropriate to the iterator. The pages need to be pinned for DIO rather than having refs taken on them to prevent VM copy-on-write from malfunctioning during a concurrent fork() (the result of the I/O could otherwise end up being affected by/visible to the child process). Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: John Hubbard <jhubbard@nvidia.com> cc: Al Viro <viro@zeniv.linux.org.uk> cc: Jens Axboe <axboe@kernel.dk> cc: Jan Kara <jack@suse.cz> cc: Matthew Wilcox <willy@infradead.org> cc: Logan Gunthorpe <logang@deltatee.com> cc: linux-block@vger.kernel.org Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230522205744.2825689-6-dhowells@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-05-24  block: Add BIO_PAGE_PINNED and associated infrastructure  (David Howells, 4 files, -4/+18)
Add BIO_PAGE_PINNED to indicate that the pages in a bio are pinned (FOLL_PIN) and that the pin will need removing. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: John Hubbard <jhubbard@nvidia.com> cc: Al Viro <viro@zeniv.linux.org.uk> cc: Jens Axboe <axboe@kernel.dk> cc: Jan Kara <jack@suse.cz> cc: Matthew Wilcox <willy@infradead.org> cc: Logan Gunthorpe <logang@deltatee.com> cc: linux-block@vger.kernel.org Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230522205744.2825689-5-dhowells@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-05-24  block: Replace BIO_NO_PAGE_REF with BIO_PAGE_REFFED with inverted logic  (Christoph Hellwig, 6 files, -4/+6)
Replace BIO_NO_PAGE_REF with a BIO_PAGE_REFFED flag that has the inverted meaning and is only set when a page reference has been acquired that needs to be released by bio_release_pages(). Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> cc: Al Viro <viro@zeniv.linux.org.uk> cc: Jens Axboe <axboe@kernel.dk> cc: Jan Kara <jack@suse.cz> cc: Matthew Wilcox <willy@infradead.org> cc: Logan Gunthorpe <logang@deltatee.com> cc: linux-block@vger.kernel.org Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230522205744.2825689-4-dhowells@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
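Together with BIO_PAGE_PINNED (added above), the release-side logic becomes, conceptually (a sketch, not the actual bio_release_pages() implementation):

    #include <linux/bio.h>
    #include <linux/mm.h>

    /* Drop whichever kind of page reference the bio took, if any. */
    static void example_release_bio_page(struct bio *bio, struct page *page)
    {
            if (bio_flagged(bio, BIO_PAGE_PINNED))
                    unpin_user_page(page);
            else if (bio_flagged(bio, BIO_PAGE_REFFED))
                    put_page(page);
    }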
2023-05-24  block: Fix bio_flagged() so that gcc can better optimise it  (David Howells, 1 file, -1/+1)
Fix bio_flagged() so that multiple instances of it, such as:

    if (bio_flagged(bio, BIO_PAGE_REFFED) ||
        bio_flagged(bio, BIO_PAGE_PINNED))

can be combined by the gcc optimiser into a single test in assembly (arguably, this is a compiler optimisation issue[1]). The missed optimisation stems from bio_flagged() comparing the result of the bitwise-AND to zero. This results in an out-of-line bio_release_page() being compiled to something like:

    <+0>:  mov    0x14(%rdi),%eax
    <+3>:  test   $0x1,%al
    <+5>:  jne    0xffffffff816dac53 <bio_release_pages+11>
    <+7>:  test   $0x2,%al
    <+9>:  je     0xffffffff816dac5c <bio_release_pages+20>
    <+11>: movzbl %sil,%esi
    <+15>: jmp    0xffffffff816daba1 <__bio_release_pages>
    <+20>: jmp    0xffffffff81d0b800 <__x86_return_thunk>

However, the test is superfluous as the return type is bool. Removing it results in:

    <+0>:  testb  $0x3,0x14(%rdi)
    <+4>:  je     0xffffffff816e4af4 <bio_release_pages+15>
    <+6>:  movzbl %sil,%esi
    <+10>: jmp    0xffffffff816dab7c <__bio_release_pages>
    <+15>: jmp    0xffffffff81d0b7c0 <__x86_return_thunk>

instead. Also, the MOVZBL instruction looks unnecessary[2] - I think it's just 're-booling' the mark_dirty parameter. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: John Hubbard <jhubbard@nvidia.com> cc: Jens Axboe <axboe@kernel.dk> cc: linux-block@vger.kernel.org Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108370 [1] Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108371 [2] Link: https://lore.kernel.org/r/167391056756.2311931.356007731815807265.stgit@warthog.procyon.org.uk/ # v6 Reviewed-by: Christian Brauner <brauner@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230522205744.2825689-3-dhowells@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
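The change itself is tiny; roughly (a before/after sketch relying on the bool return type):

    #include <linux/bio.h>

    /* Before: the explicit compare to zero is what defeats the merge. */
    static inline bool bio_flagged_old(struct bio *bio, unsigned int bit)
    {
            return (bio->bi_flags & (1U << bit)) != 0;
    }

    /* After: conversion to bool already normalises the value, and gcc can
     * combine adjacent tests on different bits into one TEST instruction. */
    static inline bool bio_flagged_new(struct bio *bio, unsigned int bit)
    {
            return bio->bi_flags & (1U << bit);
    }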
2023-05-24  iomap: Don't get a reference on ZERO_PAGE for direct I/O block zeroing  (David Howells, 1 file, -1/+1)
ZERO_PAGE can't go away, no need to hold an extra reference. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> cc: Al Viro <viro@zeniv.linux.org.uk> cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christian Brauner <brauner@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230522205744.2825689-2-dhowells@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-05-24  splice: kdoc for filemap_splice_read() and copy_splice_read()  (David Howells, 2 files, -5/+37)
Provide kerneldoc comments for filemap_splice_read() and copy_splice_read(). Signed-off-by: David Howells <dhowells@redhat.com> cc: Christian Brauner <brauner@kernel.org> cc: Christoph Hellwig <hch@lst.de> cc: Jens Axboe <axboe@kernel.dk> cc: Steve French <smfrench@gmail.com> cc: Al Viro <viro@zeniv.linux.org.uk> cc: linux-mm@kvack.org cc: linux-block@vger.kernel.org cc: linux-cifs@vger.kernel.org cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/20230522135018.2742245-32-dhowells@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-05-24  iov_iter: Kill ITER_PIPE  (David Howells, 3 files, -444/+4)
The ITER_PIPE-type iterator was only used by generic_file_splice_read() and that has been replaced and removed. This leaves ITER_PIPE unused - so remove it too. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christian Brauner <brauner@kernel.org> cc: Jens Axboe <axboe@kernel.dk> cc: Al Viro <viro@zeniv.linux.org.uk> cc: David Hildenbrand <david@redhat.com> cc: John Hubbard <jhubbard@nvidia.com> cc: linux-mm@kvack.org cc: linux-block@vger.kernel.org cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/20230522135018.2742245-31-dhowells@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-05-24  splice: Remove generic_file_splice_read()  (David Howells, 2 files, -45/+0)
Remove generic_file_splice_read() as it has been replaced with calls to filemap_splice_read() and copy_splice_read(). With this, ITER_PIPE is no longer used. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christian Brauner <brauner@kernel.org> cc: Jens Axboe <axboe@kernel.dk> cc: Steve French <smfrench@gmail.com> cc: Al Viro <viro@zeniv.linux.org.uk> cc: David Hildenbrand <david@redhat.com> cc: John Hubbard <jhubbard@nvidia.com> cc: linux-mm@kvack.org cc: linux-block@vger.kernel.org cc: linux-cifs@vger.kernel.org cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/20230522135018.2742245-30-dhowells@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-05-24  splice: Use filemap_splice_read() instead of generic_file_splice_read()  (David Howells, 37 files, -39/+39)
Replace pointers to generic_file_splice_read() with calls to filemap_splice_read(). Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christian Brauner <brauner@kernel.org> cc: Jens Axboe <axboe@kernel.dk> cc: Al Viro <viro@zeniv.linux.org.uk> cc: David Hildenbrand <david@redhat.com> cc: John Hubbard <jhubbard@nvidia.com> cc: linux-mm@kvack.org cc: linux-block@vger.kernel.org cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/20230522135018.2742245-29-dhowells@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
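For a filesystem with no extra locking or statistics needs, the conversion is a one-line file_operations change (hypothetical example):

    #include <linux/fs.h>

    static const struct file_operations example_fops = {
            .read_iter   = generic_file_read_iter,
            .splice_read = filemap_splice_read,  /* was generic_file_splice_read */
    };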
2023-05-24  cifs: Use filemap_splice_read()  (David Howells, 3 files, -23/+4)
Make cifs use filemap_splice_read() rather than doing its own version of generic_file_splice_read(). Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com> cc: Jens Axboe <axboe@kernel.dk> cc: Steve French <smfrench@gmail.com> cc: Al Viro <viro@zeniv.linux.org.uk> cc: David Hildenbrand <david@redhat.com> cc: John Hubbard <jhubbard@nvidia.com> cc: linux-mm@kvack.org cc: linux-block@vger.kernel.org cc: linux-cifs@vger.kernel.org cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/20230522135018.2742245-28-dhowells@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-05-24  trace: Convert trace/seq to use copy_splice_read()  (David Howells, 1 file, -1/+1)
For the splice from the trace seq buffer, just use copy_splice_read(). In the future, something better can probably be done by gifting pages from seq->buf into the pipe, but that would require changing seq->buf into a vmap over an array of pages. Signed-off-by: David Howells <dhowells@redhat.com> cc: Christoph Hellwig <hch@lst.de> cc: Al Viro <viro@zeniv.linux.org.uk> cc: Jens Axboe <axboe@kernel.dk> cc: Steven Rostedt <rostedt@goodmis.org> cc: Masami Hiramatsu <mhiramat@kernel.org> cc: linux-kernel@vger.kernel.org cc: linux-trace-kernel@vger.kernel.org cc: linux-fsdevel@vger.kernel.org cc: linux-block@vger.kernel.org cc: linux-mm@kvack.org Link: https://lore.kernel.org/r/20230522135018.2742245-27-dhowells@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-05-24  zonefs: Provide a splice-read wrapper  (David Howells, 1 file, -1/+39)
Provide a splice_read wrapper for zonefs. This does some checks before proceeding and locks the inode across the call to filemap_splice_read() and a size check in case of truncation. Splicing from direct I/O is handled by the caller. Signed-off-by: David Howells <dhowells@redhat.com> cc: Christoph Hellwig <hch@lst.de> cc: Al Viro <viro@zeniv.linux.org.uk> cc: Jens Axboe <axboe@kernel.dk> cc: Darrick J. Wong <djwong@kernel.org> cc: linux-xfs@vger.kernel.org cc: linux-fsdevel@vger.kernel.org cc: linux-block@vger.kernel.org cc: linux-mm@kvack.org Acked-by: Damien Le Moal <dlemoal@kernel.org> Link: https://lore.kernel.org/r/20230522135018.2742245-26-dhowells@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
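The wrappers here and in the filesystems below share roughly this shape (hypothetical filesystem and simplified locking; the per-filesystem versions add their own checks, caps or statistics):

    #include <linux/fs.h>
    #include <linux/splice.h>

    static ssize_t example_splice_read(struct file *in, loff_t *ppos,
                                       struct pipe_inode_info *pipe,
                                       size_t len, unsigned int flags)
    {
            struct inode *inode = file_inode(in);
            ssize_t ret;

            inode_lock_shared(inode);
            if (*ppos >= i_size_read(inode))        /* e.g. truncated meanwhile */
                    ret = 0;
            else
                    ret = filemap_splice_read(in, ppos, pipe, len, flags);
            inode_unlock_shared(inode);
            return ret;
    }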
2023-05-24  xfs: Provide a splice-read wrapper  (David Howells, 2 files, -2/+30)
Provide a splice_read wrapper for XFS. This does a stat count and a shutdown check before proceeding, then emits a new trace line and locks the inode across the call to filemap_splice_read() and adds to the stats afterwards. Splicing from direct I/O or DAX is handled by the caller. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> cc: Al Viro <viro@zeniv.linux.org.uk> cc: Jens Axboe <axboe@kernel.dk> cc: Darrick J. Wong <djwong@kernel.org> cc: linux-xfs@vger.kernel.org cc: linux-fsdevel@vger.kernel.org cc: linux-block@vger.kernel.org cc: linux-mm@kvack.org Link: https://lore.kernel.org/r/20230522135018.2742245-25-dhowells@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-05-24  orangefs: Provide a splice-read wrapper  (David Howells, 1 file, -1/+21)
Provide a splice_read wrapper for orangefs. This increments the read stats and then locks the inode across the call to filemap_splice_read() and a revalidation of the mapping. Splicing from direct I/O is done by the caller. Signed-off-by: David Howells <dhowells@redhat.com> cc: Christoph Hellwig <hch@lst.de> cc: Al Viro <viro@zeniv.linux.org.uk> cc: Jens Axboe <axboe@kernel.dk> cc: Mike Marshall <hubcap@omnibond.com> cc: Martin Brandenburg <martin@omnibond.com> cc: devel@lists.orangefs.org cc: linux-fsdevel@vger.kernel.org cc: linux-block@vger.kernel.org cc: linux-mm@kvack.org Link: https://lore.kernel.org/r/20230522135018.2742245-24-dhowells@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-05-24  ocfs2: Provide a splice-read wrapper  (David Howells, 2 files, -2/+42)
Provide a splice_read wrapper for ocfs2. This emits trace lines and does an atime lock/update before calling filemap_splice_read(). Splicing from direct I/O is handled by the caller. A couple of new tracepoints are added for this purpose. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> cc: Christoph Hellwig <hch@lst.de> cc: Al Viro <viro@zeniv.linux.org.uk> cc: Jens Axboe <axboe@kernel.dk> cc: Mark Fasheh <mark@fasheh.com> cc: Joel Becker <jlbec@evilplan.org> cc: ocfs2-devel@oss.oracle.com cc: linux-fsdevel@vger.kernel.org cc: linux-block@vger.kernel.org cc: linux-mm@kvack.org Link: https://lore.kernel.org/r/20230522135018.2742245-23-dhowells@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-05-24  ntfs3: Provide a splice-read wrapper  (David Howells, 1 file, -1/+30)
Provide a splice_read wrapper for NTFS3 to perform various checks before allowing the operation to proceed. Signed-off-by: David Howells <dhowells@redhat.com> cc: Christoph Hellwig <hch@lst.de> cc: Al Viro <viro@zeniv.linux.org.uk> cc: Jens Axboe <axboe@kernel.dk> cc: Konstantin Komarov <almaz.alexandrovich@paragon-software.com> cc: ntfs3@lists.linux.dev cc: linux-fsdevel@vger.kernel.org cc: linux-block@vger.kernel.org cc: linux-mm@kvack.org Link: https://lore.kernel.org/r/20230522135018.2742245-22-dhowells@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-05-24  nfs: Provide a splice-read wrapper  (David Howells, 3 files, -2/+25)
Provide a splice_read wrapper for NFS. This locks the inode around filemap_splice_read() and revalidates the mapping. Splicing from direct I/O is handled by the caller. Signed-off-by: David Howells <dhowells@redhat.com> cc: Christoph Hellwig <hch@lst.de> cc: Al Viro <viro@zeniv.linux.org.uk> cc: Jens Axboe <axboe@kernel.dk> cc: Trond Myklebust <trond.myklebust@hammerspace.com> cc: Anna Schumaker <anna@kernel.org> cc: linux-nfs@vger.kernel.org cc: linux-fsdevel@vger.kernel.org cc: linux-block@vger.kernel.org cc: linux-mm@kvack.org Link: https://lore.kernel.org/r/20230522135018.2742245-21-dhowells@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-05-24  f2fs: Provide a splice-read wrapper  (David Howells, 1 file, -8/+35)
Provide a splice_read wrapper for f2fs. This does some checks and tracing before calling filemap_splice_read() and will update the iostats afterwards. Direct I/O is handled by the caller. Signed-off-by: David Howells <dhowells@redhat.com> cc: Christoph Hellwig <hch@lst.de> cc: Al Viro <viro@zeniv.linux.org.uk> cc: Jens Axboe <axboe@kernel.dk> cc: Jaegeuk Kim <jaegeuk@kernel.org> cc: Chao Yu <chao@kernel.org> cc: linux-f2fs-devel@lists.sourceforge.net cc: linux-fsdevel@vger.kernel.org cc: linux-block@vger.kernel.org cc: linux-mm@kvack.org Link: https://lore.kernel.org/r/20230522135018.2742245-20-dhowells@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-05-24  ext4: Provide a splice-read wrapper  (David Howells, 1 file, -1/+12)
Provide a splice_read wrapper for Ext4. This does the inode shutdown check before proceeding. Splicing from DAX files and O_DIRECT fds is handled by the caller. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Theodore Ts'o <tytso@mit.edu> cc: Al Viro <viro@zeniv.linux.org.uk> cc: Jens Axboe <axboe@kernel.dk> cc: Andreas Dilger <adilger.kernel@dilger.ca> cc: linux-ext4@vger.kernel.org cc: linux-fsdevel@vger.kernel.org cc: linux-block@vger.kernel.org cc: linux-mm@kvack.org Link: https://lore.kernel.org/r/20230522135018.2742245-19-dhowells@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-05-24  ecryptfs: Provide a splice-read wrapper  (David Howells, 1 file, -1/+26)
Provide a splice_read wrapper for ecryptfs to update the access time on the lower file after the operation. Splicing from a direct I/O fd will update the access time when ->read_iter() is called. Signed-off-by: David Howells <dhowells@redhat.com> cc: Christoph Hellwig <hch@lst.de> cc: Al Viro <viro@zeniv.linux.org.uk> cc: Jens Axboe <axboe@kernel.dk> cc: Tyler Hicks <code@tyhicks.com> cc: ecryptfs@vger.kernel.org cc: linux-fsdevel@vger.kernel.org cc: linux-block@vger.kernel.org cc: linux-mm@kvack.org Link: https://lore.kernel.org/r/20230522135018.2742245-18-dhowells@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-05-24  ceph: Provide a splice-read wrapper  (David Howells, 1 file, -1/+64)
Provide a splice_read wrapper for Ceph. This does the inode shutdown check before proceeding and jumps to copy_splice_read() if the file has inline data or is a synchronous file. We try and get FILE_RD and either FILE_CACHE and/or FILE_LAZYIO caps and hold them across filemap_splice_read(). If we fail to get FILE_CACHE or FILE_LAZYIO capabilities, we use copy_splice_read() instead. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Xiubo Li <xiubli@redhat.com> cc: Christoph Hellwig <hch@lst.de> cc: Al Viro <viro@zeniv.linux.org.uk> cc: Jens Axboe <axboe@kernel.dk> cc: Ilya Dryomov <idryomov@gmail.com> cc: Jeff Layton <jlayton@kernel.org> cc: ceph-devel@vger.kernel.org cc: linux-fsdevel@vger.kernel.org cc: linux-block@vger.kernel.org cc: linux-mm@kvack.org Link: https://lore.kernel.org/r/20230522135018.2742245-17-dhowells@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>