path: root/fs/squashfs
Commit message [Author, Age, Files, Lines]
* squashfs: cache partial compressed blocks [Vincent Whitchurch, 2023-06-10, 3 files, -6/+129]

    Before commit 93e72b3c612adcaca1 ("squashfs: migrate from ll_rw_block usage to BIO"), compressed blocks read by squashfs were cached in the page cache, but that is not the case after that commit. That has led to squashfs having to re-read a lot of sectors from disk/flash.

    For example, the first sectors of every metadata block need to be read twice from the disk: once partially to read the length, and a second time to read the block itself. Also, in linear reads of large files, the last sectors of one data block are re-read from disk when reading the next data block, since the compressed blocks are of variable sizes and not aligned to device blocks. This extra I/O results in a degradation in read performance of, for example, ~16% in one scenario on my ARM platform using squashfs with dm-verity and NAND.

    Since the decompressed data is cached in the page cache or squashfs' internal metadata and fragment caches, caching _all_ compressed pages would lead to a lot of double caching and is undesirable. But make the code cache any disk blocks which were only partially requested, since these are the ones likely to include data which is needed by other file system blocks. This restores read performance in my test scenario.

    The compressed block caching is only applied when the disk block size is equal to the page size, to avoid having to deal with caching sub-page reads.

    [akpm@linux-foundation.org: fs/squashfs/block.c needs linux/pagemap.h]
    [vincent.whitchurch@axis.com: fix page update race]
    Link: https://lkml.kernel.org/r/20230526-squashfs-cache-fixup-v1-1-d54a7fa23e7b@axis.com
    [vincent.whitchurch@axis.com: fix page indices]
    Link: https://lkml.kernel.org/r/20230526-squashfs-cache-fixup-v1-2-d54a7fa23e7b@axis.com
    [akpm@linux-foundation.org: fix layout, per hch]
    Link: https://lkml.kernel.org/r/20230510-squashfs-cache-v4-1-3bd394e1ee71@axis.com
    Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Phillip Lougher <phillip@squashfs.org.uk>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
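The caching rule described in the commit above (keep only the disk blocks that a read touched partially) can be sketched roughly as follows. This is an illustration only, not the code from fs/squashfs/block.c: the function `partial_blocks` and the type `struct partial` are hypothetical names, and the device block size is simply passed in as a parameter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type: which device blocks of a read are only partly used. */
struct partial {
    bool head;          /* first device block is only partially requested */
    bool tail;          /* last device block is only partially requested */
    uint64_t first_blk; /* index of the first device block touched */
    uint64_t last_blk;  /* index of the last device block touched */
};

/*
 * Given a compressed block occupying bytes [offset, offset + length) on the
 * device, report which device blocks the read covers only partially.
 * Assumes length > 0 and devblksize > 0.
 */
static struct partial partial_blocks(uint64_t offset, uint64_t length,
                                     uint64_t devblksize)
{
    struct partial p;
    uint64_t end = offset + length; /* one past the last requested byte */

    p.first_blk = offset / devblksize;
    p.last_blk = (end - 1) / devblksize;
    p.head = (offset % devblksize) != 0; /* read starts mid-block */
    p.tail = (end % devblksize) != 0;    /* read ends mid-block */
    return p;
}
```

A block flagged as `head` or `tail` is the kind that is likely to be shared with a neighbouring compressed block, which is why the commit treats it as worth caching; fully covered blocks in between are not.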
* squashfs: don't include buffer_head.h [Christoph Hellwig, 2023-06-10, 3 files, -3/+0]

    Squashfs stopped using buffer heads in 93e72b3c612adcaca1 ("squashfs: migrate from ll_rw_block usage to BIO").

    Link: https://lkml.kernel.org/r/20230517071622.245151-1-hch@lst.de
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Pankaj Raghav <p.raghav@samsung.com>
    Reviewed-by: Phillip Lougher <phillip@squashfs.org.uk>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* revert "squashfs: harden sanity check in squashfs_read_xattr_id_table" [Andrew Morton, 2023-02-04, 1 file, -1/+1]

    This fix was nacked by Phillip, for reasons identified in the email linked below.

    Link: https://lkml.kernel.org/r/68f15d67-8945-2728-1f17-5b53a80ec52d@squashfs.org.uk
    Fixes: 72e544b1b28325 ("squashfs: harden sanity check in squashfs_read_xattr_id_table")
    Cc: Alexey Khoroshilov <khoroshilov@ispras.ru>
    Cc: Fedor Pchelkin <pchelkin@ispras.ru>
    Cc: Phillip Lougher <phillip@squashfs.org.uk>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* Squashfs: fix handling and sanity checking of xattr_ids count [Phillip Lougher, 2023-02-01, 4 files, -5/+5]

    A Syzbot [1] corrupted filesystem exposes two flaws in the handling and sanity checking of the xattr_ids count in the filesystem. Both of these flaws cause computation overflow due to incorrect typing.

    In the corrupted filesystem the xattr_ids value is 4294967071, which stored in a signed variable becomes the negative number -225.

    Flaw 1 (64-bit systems only):

    The signed integer xattr_ids variable causes sign extension.

    This causes variable overflow in the SQUASHFS_XATTR_*(A) macros. The variable is first multiplied by sizeof(struct squashfs_xattr_id) where the type of the sizeof operator is "unsigned long".

    On a 64-bit system this is 64-bits in size, and causes the negative number to be sign extended and widened to 64-bits and then become unsigned. This produces the very large number 18446744073709548016 or 2^64 - 3600. This number when rounded up by SQUASHFS_METADATA_SIZE - 1 (8191 bytes) and divided by SQUASHFS_METADATA_SIZE overflows and produces a length of 0 (stored in len).

    Flaw 2 (32-bit systems only):

    On a 32-bit system the integer variable is not widened by the unsigned long type of the sizeof operator (32-bits), and the signedness of the variable has no effect due to it always being treated as unsigned.

    The above corrupted xattr_ids value of 4294967071, when multiplied overflows and produces the number 4294963696 or 2^32 - 3600. This number when rounded up by SQUASHFS_METADATA_SIZE - 1 (8191 bytes) and divided by SQUASHFS_METADATA_SIZE overflows again and produces a length of 0.

    The effect of the 0 length computation:

    In conjunction with the corrupted xattr_ids field, the filesystem also has a corrupted xattr_table_start value, where it matches the end of filesystem value of 850.

    This causes the following sanity check code to fail because the incorrectly computed len of 0 matches the incorrect size of the table reported by the superblock (0 bytes).

        len = SQUASHFS_XATTR_BLOCK_BYTES(*xattr_ids);
        indexes = SQUASHFS_XATTR_BLOCKS(*xattr_ids);

        /*
         * The computed size of the index table (len bytes) should exactly
         * match the table start and end points
         */
        start = table_start + sizeof(*id_table);
        end = msblk->bytes_used;

        if (len != (end - start))
                return ERR_PTR(-EINVAL);

    Changing the xattr_ids variable to be "unsigned int" fixes the flaw on a 64-bit system. This relies on the fact the computation is widened by the unsigned long type of the sizeof operator.

    Casting the variable to u64 in the above macro fixes this flaw on a 32-bit system.

    It also means 64-bit systems do not implicitly rely on the type of the sizeof operator to widen the computation.

    [1] https://lore.kernel.org/lkml/000000000000cd44f005f1a0f17f@google.com/

    Link: https://lkml.kernel.org/r/20230127061842.10965-1-phillip@squashfs.org.uk
    Fixes: 506220d2ba21 ("squashfs: add more sanity checks in xattr id lookup")
    Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
    Reported-by: <syzbot+082fa4af80a5bb1a9843@syzkaller.appspotmail.com>
    Cc: Alexey Khoroshilov <khoroshilov@ispras.ru>
    Cc: Fedor Pchelkin <pchelkin@ispras.ru>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
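The arithmetic in the commit above can be reproduced in user space with fixed-width types. The sketch below is not the kernel macro code; it models the two flaws and the fix under stated assumptions: `XATTR_ID_SIZE` of 16 stands in for sizeof(struct squashfs_xattr_id) (consistent with -225 * 16 = -3600 in the message), `INDEX_ENTRY_SIZE` of 8 assumes one 64-bit index entry per metadata block, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define METADATA_SIZE 8192u /* SQUASHFS_METADATA_SIZE */
#define XATTR_ID_SIZE 16u   /* assumed sizeof(struct squashfs_xattr_id) */
#define INDEX_ENTRY_SIZE 8u /* assumed size of one index table entry */

/* Flaw 1: signed 32-bit count multiplied by a 64-bit unsigned size. */
static uint64_t len_signed_64bit(uint32_t raw)
{
    /* Reinterpreting as signed is implementation-defined in C, but gives
     * two's complement on common compilers: 4294967071 becomes -225. */
    int32_t ids = (int32_t)raw;
    /* Sign extension to 64 bits, then unsigned: 2^64 - 3600. */
    uint64_t bytes = (uint64_t)(int64_t)ids * XATTR_ID_SIZE;
    /* Rounding up wraps past 2^64, so the block count collapses to 0. */
    uint64_t blocks = (bytes + METADATA_SIZE - 1) / METADATA_SIZE;

    return blocks * INDEX_ENTRY_SIZE;
}

/* Flaw 2: everything stays 32 bits wide, so both steps wrap modulo 2^32. */
static uint32_t len_32bit(uint32_t raw)
{
    uint32_t bytes = raw * XATTR_ID_SIZE; /* wraps to 2^32 - 3600 */
    uint32_t blocks = (bytes + METADATA_SIZE - 1) / METADATA_SIZE;

    return blocks * INDEX_ENTRY_SIZE;
}

/* Fix: unsigned count, explicitly widened to 64 bits before multiplying. */
static uint64_t len_fixed(uint32_t raw)
{
    uint64_t bytes = (uint64_t)raw * XATTR_ID_SIZE;
    uint64_t blocks = (bytes + METADATA_SIZE - 1) / METADATA_SIZE;

    return blocks * INDEX_ENTRY_SIZE;
}
```

With the corrupted value 4294967071, both flawed variants yield a length of 0, which is what lets the `len != (end - start)` check pass; the widened version yields a large non-zero length that the check rejects.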
* squashfs: harden sanity check in squashfs_read_xattr_id_table [Fedor Pchelkin, 2023-02-01, 1 file, -1/+1]

    While mounting a corrupted filesystem, a signed integer '*xattr_ids' can become less than zero. This leads to the incorrect computation of 'len' and 'indexes' values which can cause null-ptr-deref in copy_bio_to_actor() or out-of-bounds accesses in the next sanity checks inside squashfs_read_xattr_id_table().

    Found by Linux Verification Center (linuxtesting.org) with Syzkaller.

    Link: https://lkml.kernel.org/r/20230117105226.329303-2-pchelkin@ispras.ru
    Fixes: 506220d2ba21 ("squashfs: add more sanity checks in xattr id lookup")
    Reported-by: <syzbot+082fa4af80a5bb1a9843@syzkaller.appspotmail.com>
    Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
    Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
    Cc: Phillip Lougher <phillip@squashfs.org.uk>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* Merge tag 'fs.idmapped.squashfs.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping [Linus Torvalds, 2022-12-13, 1 file, -1/+1]
|\
    Pull squashfs update from Seth Forshee:
     "This is a simple patch to enable idmapped mounts for squashfs.

      All functionality squashfs needs to support idmapped mounts is already implemented in generic VFS code, so all that is needed is to set FS_ALLOW_IDMAP in fs_flags"

    * tag 'fs.idmapped.squashfs.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping:
      squashfs: enable idmapped mounts
| * squashfs: enable idmapped mounts [Michael Weiß, 2022-11-07, 1 file, -1/+1]

    For squashfs all needed functionality for idmapped mounts is already implemented by the generic handlers in the VFS. Thus, it is sufficient to just enable the corresponding FS_ALLOW_IDMAP flag to support idmapped mounts.

    We use this for unprivileged (user namespaced) containers based on squashfs images as rootfs in GyroidOS.

    A simple test using the mount-idmapped tool executed as user with uid=1000 looks as follows:

        $ mkdir test
        $ echo "test" > test/test_file
        $ mksquashfs test/ fs.img
        $ sudo mkdir /mnt/test
        $ sudo mkdir /mnt/mapped
        $ sudo mount fs.img -o loop /mnt/test/
        $ sudo ./mount-idmapped --map-mount b:1000:2000:1 /mnt/test/ /mnt/mapped/
        $ mount | tail -n2
        fs.img on /mnt/test type squashfs (ro,relatime,errors=continue)
        fs.img on /mnt/mapped type squashfs (ro,relatime,idmapped,errors=continue)
        $ ls -lan /mnt/test/
        total 5
        drwxr-xr-x 2 1000 1000 32 Okt 24 13:36 .
        drwxr-xr-x 6 0 0 4096 Okt 24 13:38 ..
        -rw-r--r-- 1 1000 1000 5 Okt 24 13:36 test_file
        $ ls -lan /mnt/mapped/
        total 5
        drwxr-xr-x 2 2000 2000 32 Okt 24 13:36 .
        drwxr-xr-x 6 0 0 4096 Okt 24 13:38 ..
        -rw-r--r-- 1 2000 2000 5 Okt 24 13:36 test_file

    Signed-off-by: Michael Weiß <michael.weiss@aisec.fraunhofer.de>
    Reviewed-by: Christian Brauner <brauner@kernel.org>
    Reviewed-by: Phillip Lougher <phillip@squashfs.org.uk>
    Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
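The effect of the `b:1000:2000:1` mapping in the test above (owner 1000 shown as 2000 through the idmapped mount) can be modelled with a small range-mapping function. This is only a user-space sketch of the idea, not the VFS implementation: `map_id` is a hypothetical name, and returning 65534 for ids outside the range is an assumption based on the usual default overflow uid/gid.

```c
#include <assert.h>
#include <stdint.h>

#define OVERFLOW_ID 65534u /* assumed default overflow uid/gid */

/*
 * Map an on-disk id through a single "from:to:range" mapping.
 * Ids inside [from, from + range) are shifted to start at 'to';
 * anything else is reported as the overflow id.
 */
static uint32_t map_id(uint32_t id, uint32_t from, uint32_t to, uint32_t range)
{
    if (id >= from && id - from < range)
        return to + (id - from);
    return OVERFLOW_ID;
}
```

For the mapping used in the commit message, `map_id(1000, 1000, 2000, 1)` gives 2000, matching the `ls -lan /mnt/mapped/` output.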
* | squashfs: fix null-ptr-deref in squashfs_fill_super [Baokun Li, 2022-11-18, 1 file, -1/+2]

    When squashfs_read_table() returns an error or `sb->s_magic != SQUASHFS_MAGIC`, execution enters the error branch and calls msblk->thread_ops->destroy(msblk) to destroy msblk. However, msblk->thread_ops has not been initialized. Therefore, the following problem is triggered:

    ==================================================================
    BUG: KASAN: null-ptr-deref in squashfs_fill_super+0xe7a/0x13b0
    Read of size 8 at addr 0000000000000008 by task swapper/0/1

    CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.1.0-rc3-next-20221031 #367
    Call Trace:
     <TASK>
     dump_stack_lvl+0x73/0x9f
     print_report+0x743/0x759
     kasan_report+0xc0/0x120
     __asan_load8+0xd3/0x140
     squashfs_fill_super+0xe7a/0x13b0
     get_tree_bdev+0x27b/0x450
     squashfs_get_tree+0x19/0x30
     vfs_get_tree+0x49/0x150
     path_mount+0xaae/0x1350
     init_mount+0xad/0x100
     do_mount_root+0xbc/0x1d0
     mount_block_root+0x173/0x316
     mount_root+0x223/0x236
     prepare_namespace+0x1eb/0x237
     kernel_init_freeable+0x528/0x576
     kernel_init+0x29/0x250
     ret_from_fork+0x1f/0x30
     </TASK>
    ==================================================================

    To solve this issue, msblk->thread_ops is initialized immediately after msblk is assigned a value.

    Link: https://lkml.kernel.org/r/20221101073343.3961562-1-libaokun1@huawei.com
    Fixes: b0645770d3c7 ("squashfs: add the mount parameter theads=<single|multi|percpu>")
    Signed-off-by: Baokun Li <libaokun1@huawei.com>
    Reviewed-by: Xiaoming Ni <nixiaoming@huawei.com>
    Reviewed-by: Phillip Lougher <phillip@squashfs.org.uk>
    Cc: Yu Kuai <yukuai3@huawei.com>
    Cc: Zhang Yi <yi.zhang@huawei.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* | squashfs: allows users to configure the number of decompression threads [Xiaoming Ni, 2022-11-18, 4 files, -10/+66]

    The maximum number of threads in the decompressor_multi.c file is fixed and cannot be adjusted according to user needs. Therefore, the mount parameter needs to be added to allow users to configure the number of threads as required. The upper limit is num_online_cpus() * 2.

    Link: https://lkml.kernel.org/r/20221019030930.130456-3-nixiaoming@huawei.com
    Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
    Reviewed-by: Phillip Lougher <phillip@squashfs.org.uk>
    Cc: Jianguo Chen <chenjianguo3@huawei.com>
    Cc: Jubin Zhong <zhongjubin@huawei.com>
    Cc: Zhang Yi <yi.zhang@huawei.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* | squashfs: add the mount parameter theads=<single|multi|percpu> [Xiaoming Ni, 2022-11-18, 9 files, -32/+147]

    Patch series 'squashfs: Add the mount parameter "threads="'.

    Currently, Squashfs supports multiple decompressor parallel modes. However, this mode can be configured only during kernel building and does not support flexible selection during runtime.

    In the current patch set, the mount parameter "threads=" is added to allow users to select the parallel decompressor mode and configure the number of decompressors when mounting a file system.

    "threads=<single|multi|percpu|1|2|3|...>"
    The upper limit is num_online_cpus() * 2.

    This patch (of 2):

    Squashfs supports three decompression concurrency modes:
      Single-thread mode: concurrent reads are blocked and the memory overhead is small.
      Multi-thread mode/percpu mode: reduces concurrent read blocking but increases memory overhead.

    The corresponding schema must be fixed at compile time. During mounting, the concurrent decompression mode cannot be adjusted based on file read blocking.

    The mount parameter threads=<single|multi|percpu> is added to select the concurrent decompression mode of a single SquashFS file system image.

    Link: https://lkml.kernel.org/r/20221019030930.130456-1-nixiaoming@huawei.com
    Link: https://lkml.kernel.org/r/20221019030930.130456-2-nixiaoming@huawei.com
    Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
    Reviewed-by: Phillip Lougher <phillip@squashfs.org.uk>
    Cc: Jianguo Chen <chenjianguo3@huawei.com>
    Cc: Jubin Zhong <zhongjubin@huawei.com>
    Cc: Zhang Yi <yi.zhang@huawei.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
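The option grammar described in this series, "threads=<single|multi|percpu|1|2|3|...>" with an upper limit of num_online_cpus() * 2, could be parsed along these lines. This is a user-space sketch rather than the kernel's mount-option code: `parse_threads`, `struct thread_opt` and the enum are hypothetical, the CPU count is passed in as a plain integer, and treating a numeric value of 1 as single-thread mode and larger values as multi-thread mode is an assumption.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum thread_mode { MODE_SINGLE, MODE_MULTI, MODE_PERCPU, MODE_INVALID };

/* Hypothetical parsed form of the "threads=" value. */
struct thread_opt {
    enum thread_mode mode;
    int max_threads; /* 0 when not applicable (percpu or invalid) */
};

static struct thread_opt parse_threads(const char *val, int online_cpus)
{
    struct thread_opt opt = { MODE_INVALID, 0 };
    long limit = (long)online_cpus * 2; /* upper limit from the commit message */
    char *end;
    long n;

    if (strcmp(val, "single") == 0) {
        opt.mode = MODE_SINGLE;
        opt.max_threads = 1;
        return opt;
    }
    if (strcmp(val, "multi") == 0) {
        opt.mode = MODE_MULTI;
        opt.max_threads = (int)limit;
        return opt;
    }
    if (strcmp(val, "percpu") == 0) {
        opt.mode = MODE_PERCPU;
        return opt;
    }

    /* Otherwise expect a decimal thread count in [1, limit]. */
    n = strtol(val, &end, 10);
    if (end == val || *end != '\0' || n < 1 || n > limit)
        return opt; /* still MODE_INVALID */

    opt.mode = (n == 1) ? MODE_SINGLE : MODE_MULTI; /* assumed mapping */
    opt.max_threads = (int)n;
    return opt;
}
```

On a machine with 4 online CPUs, this sketch accepts values up to `threads=8` and rejects `threads=9`.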
* | squashfs: fix buffer release race condition in readahead code [Phillip Lougher, 2022-10-28, 1 file, -2/+3]

    Fix a buffer release race condition, where the error value was used after release.

    Link: https://lkml.kernel.org/r/20221020223616.7571-4-phillip@squashfs.org.uk
    Fixes: b09a7a036d20 ("squashfs: support reading fragments in readahead call")
    Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
    Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>
    Reported-by: Marc Miltenberger <marcmiltenberger@gmail.com>
    Cc: Dimitri John Ledkov <dimitri.ledkov@canonical.com>
    Cc: Hsin-Yi Wang <hsinyi@chromium.org>
    Cc: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
    Cc: Slade Watkins <srw@sladewatkins.net>
    Cc: Thorsten Leemhuis <regressions@leemhuis.info>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* | squashfs: fix extending readahead beyond end of file [Phillip Lougher, 2022-10-28, 1 file, -4/+7]

    The readahead code will try to extend readahead to the entire size of the Squashfs data block.

    But, it didn't take into account that the last block at the end of the file may not be a whole block. In this case, the code would extend readahead to beyond the end of the file, leaving trailing pages.

    Fix this by only requesting the expected number of pages.

    Link: https://lkml.kernel.org/r/20221020223616.7571-3-phillip@squashfs.org.uk
    Fixes: 8fc78b6fe24c ("squashfs: implement readahead")
    Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
    Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>
    Reported-by: Marc Miltenberger <marcmiltenberger@gmail.com>
    Cc: Dimitri John Ledkov <dimitri.ledkov@canonical.com>
    Cc: Hsin-Yi Wang <hsinyi@chromium.org>
    Cc: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
    Cc: Slade Watkins <srw@sladewatkins.net>
    Cc: Thorsten Leemhuis <regressions@leemhuis.info>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
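"Only requesting the expected number of pages" amounts to clamping the readahead window to whatever part of the data block actually lies inside the file. A minimal sketch of that calculation follows; it is not the kernel code, `readahead_pages` is a hypothetical name, and the 4096-byte page size is an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SZ 4096u /* assumed page size */

/*
 * Number of pages to request for the data block starting at byte
 * block_start, without extending past the end of the file.
 */
static uint64_t readahead_pages(uint64_t block_start, uint64_t block_size,
                                uint64_t file_size)
{
    uint64_t remaining, bytes;

    if (block_start >= file_size)
        return 0; /* nothing left to read */

    remaining = file_size - block_start;
    /* The last block of a file may be shorter than a full data block. */
    bytes = remaining < block_size ? remaining : block_size;

    return (bytes + PAGE_SZ - 1) / PAGE_SZ; /* round up to whole pages */
}
```

For a 1000000-byte file with 128 KiB data blocks, a full block needs 32 pages, while the final block starting at offset 917504 needs only 21.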
* | squashfs: fix read regression introduced in readahead code [Phillip Lougher, 2022-10-28, 3 files, -4/+12]
|/
    Patch series "squashfs: fix some regressions introduced in the readahead code".

    This patchset fixes 3 regressions introduced by the recent readahead code changes. The first regression is causing "snaps" to randomly fail after a couple of hours or days, which is how the regression came to light.

    This patch (of 3):

    If a file isn't a whole multiple of the page size, the last page will have trailing bytes unfilled.

    There was a mistake in the readahead code which did this. In particular it incorrectly assumed that the last page in the readahead page array (page[nr_pages - 1]) will always contain the last page in the block, which if we're at file end, will be the page that needs to be zero filled.

    But the readahead code may not return the last page in the block, which means it is unmapped and will be skipped by the decompressors (a temporary buffer is used).

    In this case the zero filling code will zero out the wrong page, leading to data corruption.

    Fix this by extending the "page actor" to return the last page if present, or NULL if a temporary buffer was used.

    Link: https://lkml.kernel.org/r/20221020223616.7571-1-phillip@squashfs.org.uk
    Link: https://lkml.kernel.org/r/20221020223616.7571-2-phillip@squashfs.org.uk
    Fixes: 8fc78b6fe24c ("squashfs: implement readahead")
    Link: https://lore.kernel.org/lkml/b0c258c3-6dcf-aade-efc4-d62a8b3a1ce2@alu.unizg.hr/
    Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
    Reported-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
    Tested-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
    Tested-by: Slade Watkins <srw@sladewatkins.net>
    Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>
    Reported-by: Marc Miltenberger <marcmiltenberger@gmail.com>
    Cc: Dimitri John Ledkov <dimitri.ledkov@canonical.com>
    Cc: Hsin-Yi Wang <hsinyi@chromium.org>
    Cc: Thorsten Leemhuis <regressions@leemhuis.info>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* squashfs: don't call kmalloc in decompressors [Phillip Lougher, 2022-08-28, 4 files, -21/+22]

    The decompressors may be called while in an atomic section. So move the kmalloc() out of this path, and into the "page actor" init function.

    This fixes a regression introduced by commit f268eedddf35 ("squashfs: extend "page actor" to handle missing pages")

    Link: https://lkml.kernel.org/r/20220822215430.15933-1-phillip@squashfs.org.uk
    Fixes: f268eedddf35 ("squashfs: extend "page actor" to handle missing pages")
    Reported-by: Chris Murphy <lists@colorremedies.com>
    Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* Merge tag 'mm-nonmm-stable-2022-08-06-2' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm [Linus Torvalds, 2022-08-07, 13 files, -168/+264]
|\
    Pull misc updates from Andrew Morton:
     "Updates to various subsystems which I help look after. lib, ocfs2, fatfs, autofs, squashfs, procfs, etc. A relatively small amount of material this time"

    * tag 'mm-nonmm-stable-2022-08-06-2' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (72 commits)
      scripts/gdb: ensure the absolute path is generated on initial source
      MAINTAINERS: kunit: add David Gow as a maintainer of KUnit
      mailmap: add linux.dev alias for Brendan Higgins
      mailmap: update Kirill's email
      profile: setup_profiling_timer() is moslty not implemented
      ocfs2: fix a typo in a comment
      ocfs2: use the bitmap API to simplify code
      ocfs2: remove some useless functions
      lib/mpi: fix typo 'the the' in comment
      proc: add some (hopefully) insightful comments
      bdi: remove enum wb_congested_state
      kernel/hung_task: fix address space of proc_dohung_task_timeout_secs
      lib/lzo/lzo1x_compress.c: replace ternary operator with min() and min_t()
      squashfs: support reading fragments in readahead call
      squashfs: implement readahead
      squashfs: always build "file direct" version of page actor
      Revert "squashfs: provide backing_dev_info in order to disable read-ahead"
      fs/ocfs2: Fix spelling typo in comment
      ia64: old_rr4 added under CONFIG_HUGETLB_PAGE
      proc: fix test for "vsyscall=xonly" boot option
      ...
| * squashfs: support reading fragments in readahead call [Phillip Lougher, 2022-07-30, 1 file, -3/+44]

    Add a function which can be used to read fragments in the readahead call.

    This function is necessary because filesystems built with the -tailends (or -always-use-fragments) option may have fragments present which cannot be currently handled.

    Link: https://lkml.kernel.org/r/20220617083810.337573-5-hsinyi@chromium.org
    Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
    Signed-off-by: Hsin-Yi Wang <hsinyi@chromium.org>
    Cc: Hou Tao <houtao1@huawei.com>
    Cc: kernel test robot <lkp@intel.com>
    Cc: Marek Szyprowski <m.szyprowski@samsung.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Miao Xie <miaoxie@huawei.com>
    Cc: Xiongwei Song <Xiongwei.Song@windriver.com>
    Cc: Zhang Yi <yi.zhang@huawei.com>
    Cc: Zheng Liang <zhengliang6@huawei.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| * squashfs: implement readahead [Hsin-Yi Wang, 2022-07-30, 1 file, -1/+91]

    Implement readahead callback for squashfs. It will read datablocks which cover pages in readahead request. For a few cases it will not mark page as uptodate, including:
    - file end is 0.
    - zero filled blocks.
    - current batch of pages isn't in the same datablock.
    - decompressor error.
    Otherwise pages will be marked as uptodate. The unhandled pages will be updated by readpage later.

    Link: https://lkml.kernel.org/r/20220617083810.337573-4-hsinyi@chromium.org
    Signed-off-by: Hsin-Yi Wang <hsinyi@chromium.org>
    Suggested-by: Matthew Wilcox <willy@infradead.org>
    Reported-by: Matthew Wilcox <willy@infradead.org>
    Reported-by: Phillip Lougher <phillip@squashfs.org.uk>
    Reported-by: Xiongwei Song <Xiongwei.Song@windriver.com>
    Reported-by: Andrew Morton <akpm@linux-foundation.org>
    Cc: Hou Tao <houtao1@huawei.com>
    Cc: kernel test robot <lkp@intel.com>
    Cc: Marek Szyprowski <m.szyprowski@samsung.com>
    Cc: Miao Xie <miaoxie@huawei.com>
    Cc: Zhang Yi <yi.zhang@huawei.com>
    Cc: Zheng Liang <zhengliang6@huawei.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| * squashfs: always build "file direct" version of page actor [Phillip Lougher, 2022-07-30, 2 files, -48/+2]

    Squashfs_readahead uses the "file direct" version of the page actor, and so build it unconditionally.

    Link: https://lkml.kernel.org/r/20220617083810.337573-3-hsinyi@chromium.org
    Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
    Signed-off-by: Hsin-Yi Wang <hsinyi@chromium.org>
    Reported-by: kernel test robot <lkp@intel.com>
    Cc: Hou Tao <houtao1@huawei.com>
    Cc: Marek Szyprowski <m.szyprowski@samsung.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Miao Xie <miaoxie@huawei.com>
    Cc: Xiongwei Song <Xiongwei.Song@windriver.com>
    Cc: Zhang Yi <yi.zhang@huawei.com>
    Cc: Zheng Liang <zhengliang6@huawei.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| * Revert "squashfs: provide backing_dev_info in order to disable read-ahead" [Hsin-Yi Wang, 2022-07-30, 1 file, -33/+0]

    Patch series "Implement readahead for squashfs", v7.

    Commit 9eec1d897139 ("squashfs: provide backing_dev_info in order to disable read-ahead") mitigates the performance drop issue for squashfs by closing readahead for it.

    This series implements readahead callback for squashfs.

    This patch (of 4):

    This reverts 9eec1d897139e5 ("squashfs: provide backing_dev_info in order to disable read-ahead").

    Revert closing the readahead to squashfs since the readahead callback for squashfs is implemented.

    Link: https://lkml.kernel.org/r/20220617083810.337573-1-hsinyi@chromium.org
    Link: https://lkml.kernel.org/r/20220617083810.337573-2-hsinyi@chromium.org
    Signed-off-by: Hsin-Yi Wang <hsinyi@chromium.org>
    Suggested-by: Xiongwei Song <Xiongwei.Song@windriver.com>
    Cc: Phillip Lougher <phillip@squashfs.org.uk>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Marek Szyprowski <m.szyprowski@samsung.com>
    Cc: Zheng Liang <zhengliang6@huawei.com>
    Cc: Zhang Yi <yi.zhang@huawei.com>
    Cc: Hou Tao <houtao1@huawei.com>
    Cc: Miao Xie <miaoxie@huawei.com>
    Cc: kernel test robot <lkp@intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| * squashfs: don't use intermediate buffer if pages missing [Phillip Lougher, 2022-06-17, 1 file, -63/+12]

    Now that the "page actor" can handle missing pages, we don't have to fall back to using an intermediate buffer in Squashfs_readpage_block() if all the pages necessary can't be obtained.

    Link: https://lkml.kernel.org/r/20220611032133.5743-3-phillip@squashfs.org.uk
    Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
    Cc: Hsin-Yi Wang <hsinyi@chromium.org>
    Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
    Cc: Xiongwei Song <Xiongwei.Song@windriver.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| * squashfs: extend "page actor" to handle missing pages [Phillip Lougher, 2022-06-17, 10 files, -31/+126]

    Patch series "Squashfs: handle missing pages decompressing into page cache".

    This patchset enables Squashfs to handle missing pages when directly decompressing datablocks into the page cache.

    Previously if the full set of pages needed was not available, Squashfs would have to fall back to using an intermediate buffer (the older method), which is slower, involving a memcopy, and it introduces contention on a shared buffer.

    The first patch extends the "page actor" code to handle missing pages.

    The second patch updates Squashfs_readpage_block() to use the new functionality, and removes the code that falls back to using an intermediate buffer.

    This patchset is independent of the readahead work, and it is standalone. It can be merged on its own. But the readahead patch for efficiency also needs this patch-set.

    This patch (of 2):

    This patch extends the "page actor" code to handle missing pages.

    Previously if the full set of pages needed to decompress a Squashfs datablock was unavailable, this would cause decompression to fail on the missing pages.

    In this case direct decompression into the page cache could not be achieved and the code would fall back to using the older intermediate buffer method.

    With this patch, direct decompression into the page cache can be achieved with missing pages.

    For "multi-shot" decompressors (zlib, xz, zstd), the page actor will allocate a temporary buffer which is passed to the decompressor, and then freed by the page actor.

    For "single shot" decompressors (lz4, lzo) which decompress into a contiguous "bounce buffer", and which is then copied into the page cache, it would be pointless to allocate a temporary buffer, memcpy into it, and then free it. For these decompressors -ENOMEM is returned, which signifies that the memcpy for that page should be skipped.

    This also happens if the data block is uncompressed.

    Link: https://lkml.kernel.org/r/20220611032133.5743-1-phillip@squashfs.org.uk
    Link: https://lkml.kernel.org/r/20220611032133.5743-2-phillip@squashfs.org.uk
    Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
    Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
    Cc: Hsin-Yi Wang <hsinyi@chromium.org>
    Cc: Xiongwei Song <Xiongwei.Song@windriver.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* | squashfs: Return the actual error from squashfs_read_folio() [Matthew Wilcox (Oracle), 2022-08-02, 1 file, -7/+8]
|/
    Since we actually know what error happened, we can report it instead of having the generic code return -EIO for pages that were unlocked without being marked uptodate. Also remove a test of PageError since we have the return value at this point.

    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
* Merge tag 'folio-5.19' of git://git.infradead.org/users/willy/pagecache [Linus Torvalds, 2022-05-25, 3 files, -5/+7]
|\
    Pull page cache updates from Matthew Wilcox:

     - Appoint myself page cache maintainer
     - Fix how scsicam uses the page cache
     - Use the memalloc_nofs_save() API to replace AOP_FLAG_NOFS
     - Remove the AOP flags entirely
     - Remove pagecache_write_begin() and pagecache_write_end()
     - Documentation updates
     - Convert several address_space operations to use folios:
         - is_dirty_writeback
         - readpage becomes read_folio
         - releasepage becomes release_folio
         - freepage becomes free_folio
     - Change filler_t to require a struct file pointer be the first argument like ->read_folio

    * tag 'folio-5.19' of git://git.infradead.org/users/willy/pagecache: (107 commits)
      nilfs2: Fix some kernel-doc comments
      Appoint myself page cache maintainer
      fs: Remove aops->freepage
      secretmem: Convert to free_folio
      nfs: Convert to free_folio
      orangefs: Convert to free_folio
      fs: Add free_folio address space operation
      fs: Convert drop_buffers() to use a folio
      fs: Change try_to_free_buffers() to take a folio
      jbd2: Convert release_buffer_page() to use a folio
      jbd2: Convert jbd2_journal_try_to_free_buffers to take a folio
      reiserfs: Convert release_buffer_page() to use a folio
      fs: Remove last vestiges of releasepage
      ubifs: Convert to release_folio
      reiserfs: Convert to release_folio
      orangefs: Convert to release_folio
      ocfs2: Convert to release_folio
      nilfs2: Remove comment about releasepage
      nfs: Convert to release_folio
      jfs: Convert to release_folio
      ...
| * squashfs: Convert squashfs to read_folio [Matthew Wilcox (Oracle), 2022-05-09, 3 files, -5/+7]

    This is a "weak" conversion which converts straight back to using pages. A full conversion should be performed at some point, hopefully by someone familiar with the filesystem.

    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
* | block: turn bio_kmalloc into a simple kmalloc wrapper [Christoph Hellwig, 2022-04-18, 1 file, -7/+8]

    Remove the magic autofree semantics and require the callers to explicitly call bio_init to initialize the bio.

    This allows bio_free to catch accidental bio_put calls on bio_init()ed bios as well.

    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Acked-by: Coly Li <colyli@suse.de>
    Acked-by: Mike Snitzer <snitzer@kernel.org>
    Link: https://lore.kernel.org/r/20220406061228.410163-5-hch@lst.de
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
* | squashfs: always use bio_kmalloc in squashfs_bio_read [Christoph Hellwig, 2022-04-18, 1 file, -8/+3]
|/
    If a plain kmalloc that is not backed by a mempool is safe here for a large read (and the actual page allocations), it must also be for a small one, so simplify the code a bit.

    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Acked-by: Phillip Lougher <phillip@squashfs.org.uk>
    Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
    Link: https://lore.kernel.org/r/20220406061228.410163-3-hch@lst.de
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
* Merge branch 'akpm' (patches from Andrew)Linus Torvalds2022-03-231-1/+1
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge updates from Andrew Morton: - A few misc subsystems: kthread, scripts, ntfs, ocfs2, block, and vfs - Most the MM patches which precede the patches in Willy's tree: kasan, pagecache, gup, swap, shmem, memcg, selftests, pagemap, mremap, sparsemem, vmalloc, pagealloc, memory-failure, mlock, hugetlb, userfaultfd, vmscan, compaction, mempolicy, oom-kill, migration, thp, cma, autonuma, psi, ksm, page-poison, madvise, memory-hotplug, rmap, zswap, uaccess, ioremap, highmem, cleanups, kfence, hmm, and damon. * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (227 commits) mm/damon/sysfs: remove repeat container_of() in damon_sysfs_kdamond_release() Docs/ABI/testing: add DAMON sysfs interface ABI document Docs/admin-guide/mm/damon/usage: document DAMON sysfs interface selftests/damon: add a test for DAMON sysfs interface mm/damon/sysfs: support DAMOS stats mm/damon/sysfs: support DAMOS watermarks mm/damon/sysfs: support schemes prioritization mm/damon/sysfs: support DAMOS quotas mm/damon/sysfs: support DAMON-based Operation Schemes mm/damon/sysfs: support the physical address space monitoring mm/damon/sysfs: link DAMON for virtual address spaces monitoring mm/damon: implement a minimal stub for sysfs-based DAMON interface mm/damon/core: add number of each enum type values mm/damon/core: allow non-exclusive DAMON start/stop Docs/damon: update outdated term 'regions update interval' Docs/vm/damon/design: update DAMON-Idle Page Tracking interference handling Docs/vm/damon: call low level monitoring primitives the operations mm/damon: remove unnecessary CONFIG_DAMON option mm/damon/paddr,vaddr: remove damon_{p,v}a_{target_valid,set_operations}() mm/damon/dbgfs-test: fix is_target_id() change ...
| * fs: allocate inode by using alloc_inode_sb()Muchun Song2022-03-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The inode allocation is supposed to use alloc_inode_sb(), so convert kmem_cache_alloc() of all filesystems to alloc_inode_sb(). Link: https://lkml.kernel.org/r/20220228122126.37293-5-songmuchun@bytedance.com Signed-off-by: Muchun Song <songmuchun@bytedance.com> Acked-by: Theodore Ts'o <tytso@mit.edu> [ext4] Acked-by: Roman Gushchin <roman.gushchin@linux.dev> Cc: Alex Shi <alexs@kernel.org> Cc: Anna Schumaker <Anna.Schumaker@Netapp.com> Cc: Chao Yu <chao@kernel.org> Cc: Dave Chinner <david@fromorbit.com> Cc: Fam Zheng <fam.zheng@bytedance.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kari Argillander <kari.argillander@gmail.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Xiongchun Duan <duanxiongchun@bytedance.com> Cc: Yang Shi <shy828301@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | block: pass a block_device and opf to bio_allocChristoph Hellwig2022-02-021-5/+6
|/ | | | | | | | | | | | | | | Pass the block_device and operation that we plan to use this bio for to bio_alloc to optimize the assignment. NULL/0 can be passed, both for the passthrough case on a raw request_queue and to temporarily avoid refactoring some nasty code. Also move the gfp_mask argument after the nr_vecs argument for a much more logical calling convention matching what most of the kernel does. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20220124091107.642561-18-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
* squashfs: provide backing_dev_info in order to disable read-aheadZheng Liang2022-01-151-0/+33
| Commit c1f6925e1091 ("mm: put readahead pages in cache earlier") causes
| the read performance of squashfs to deteriorate. Testing shows that
| performance recovers when squashfs readahead is disabled. So follow the
| approach used by ubifs: provide a backing_dev_info and disable read-ahead.
|
| We collected the following data with fio.
| squashfs image blocksize=128K
| test command:
|
|   fio --name basic --bs=? --filename="/mnt/test_file" --rw=? --iodepth=1
|       --ioengine=psync --runtime=200 --time_based
|
| squashfs readahead turned on, 5.10 kernel:
|
|   bs(k)   read/randread   MB/s
|   4       randread        271
|   128     randread        231
|   1024    randread        246
|   4       read            310
|   128     read            245
|   1024    read            247
|
| squashfs readahead turned off, 5.10 kernel:
|
|   bs(k)   read/randread   MB/s
|   4       randread        293
|   128     randread        330
|   1024    randread        363
|   4       read            338
|   128     read            360
|   1024    read            365
|
| squashfs readahead turned on and commit c1f6925e1091 ("mm: put readahead
| pages in cache earlier") reverted, 5.10 kernel:
|
|   bs(k)   read/randread   MB/s
|   4       randread        289
|   128     randread        306
|   1024    randread        335
|   4       read            337
|   128     read            336
|   1024    read            338
|
| Link: https://lkml.kernel.org/r/20211116113141.1391026-1-zhengliang6@huawei.com
| Signed-off-by: Zheng Liang <zhengliang6@huawei.com>
| Reviewed-by: Phillip Lougher <phillip@squashfs.org.uk>
| Cc: Zhang Yi <yi.zhang@huawei.com>
| Cc: Hou Tao <houtao1@huawei.com>
| Cc: Miao Xie <miaoxie@huawei.com>
| Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* lib: zstd: Add kernel-specific APINick Terrell2021-11-091-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | This patch: - Moves `include/linux/zstd.h` -> `include/linux/zstd_lib.h` - Updates modified zstd headers to yearless copyright - Adds a new API in `include/linux/zstd.h` that is functionally equivalent to the in-use subset of the current API. Functions are renamed to avoid symbol collisions with zstd, to make it clear it is not the upstream zstd API, and to follow the kernel style guide. - Updates all callers to use the new API. There are no functional changes in this patch. Since there are no functional change, I felt it was okay to update all the callers in a single patch. Once the API is approved, the callers are mechanically changed. This patch is preparing for the 3rd patch in this series, which updates zstd to version 1.4.10. Since the upstream zstd API is no longer exposed to callers, the update can happen transparently. Signed-off-by: Nick Terrell <terrelln@fb.com> Tested By: Paul Jones <paul@pauljones.id.au> Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM/Clang v13.0.0 on x86-64 Tested-by: Jean-Denis Girard <jd.girard@sysnux.pf>
* squashfs: use bdev_nr_bytes instead of open coding itChristoph Hellwig2021-10-181-2/+3
| | | | | | | | | | Use the proper helper to read the block device size. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Phillip Lougher <phillip@squashfs.org.uk> Link: https://lore.kernel.org/r/20211018101130.1838532-24-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
* squashfs: use bvec_virtChristoph Hellwig2021-08-166-9/+8
| | | | | | | | Use bvec_virt instead of open coding it. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210804095634.460779-7-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
* squashfs: add option to panic on errorsVincent Whitchurch2021-06-293-1/+91
| | | | | | | | | | | | | | | Add an errors=panic mount option to make squashfs trigger a panic when errors are encountered, similar to several other filesystems. This allows a kernel dump to be saved using which the corruption can be analysed and debugged. Inspired by a pre-fs_context patch by Anton Eliasson. Link: https://lkml.kernel.org/r/20210527125019.14511-1-vincent.whitchurch@axis.com Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com> Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* squashfs: fix divide error in calculate_skip()Phillip Lougher2021-05-151-3/+3
| Syzbot has reported a "divide error" which has been identified as being
| caused by a corrupted file_size value within the file inode. This value
| has been corrupted to a much larger value than expected.
|
| calculate_skip() is passed i_size_read(inode) >> msblk->block_log. Due to
| the file_size value corruption this overflows the int argument/variable in
| that function, leading to the divide error.
|
| This patch changes the function to use u64. This will accommodate any
| unexpectedly large values due to corruption.
|
| The value returned from calculate_skip() is clamped to be never more than
| SQUASHFS_CACHED_BLKS - 1, or 7. So file_size corruption does not lead to
| an unexpectedly large return result here.
|
| Link: https://lkml.kernel.org/r/20210507152618.9447-1-phillip@squashfs.org.uk
| Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
| Reported-by: <syzbot+e8f781243ce16ac2f962@syzkaller.appspotmail.com>
| Reported-by: <syzbot+7b98870d4fec9447b951@syzkaller.appspotmail.com>
| Cc: <stable@vger.kernel.org>
| Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
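The clamping described in the message above can be sketched in user space as follows. This is an illustration under stated assumptions, not the kernel source: the names `META_ENTRIES` and `META_INDEXES` and their values are assumed for the example, and only the upper bound of `CACHED_BLKS - 1 = 7` comes from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration only; the kernel has its own constants. */
#define META_ENTRIES 127
#define META_INDEXES 2048
#define CACHED_BLKS  8      /* commit message: result never exceeds 8 - 1 = 7 */

/*
 * 64-bit sketch of the skip calculation: with a uint64_t argument, a
 * corrupted and very large block count cannot wrap into a negative or
 * zero value the way it could when squeezed into an int.
 */
int calculate_skip_u64(uint64_t blocks)
{
	uint64_t divisor = (uint64_t)(META_ENTRIES + 1) * META_INDEXES;
	uint64_t skip = blocks / divisor + 1;   /* always at least 1 */
	uint64_t limit = CACHED_BLKS - 1;

	assert(divisor != 0);
	return (int)(skip < limit ? skip : limit);
}
```

Because the result is always in the range 1..7, a later division by it stays well defined even when the block count passed in is as large as `UINT64_MAX`.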
* squashfs: fix xattr id and id lookup sanity checksPhillip Lougher2021-03-252-4/+8
| The checks for the maximum metadata block size are missing
| SQUASHFS_BLOCK_OFFSET (the two byte length count).
|
| Link: https://lkml.kernel.org/r/2069685113.2081245.1614583677427@webmail.123-reg.co.uk
| Fixes: f37aa4c7366e23f ("squashfs: add more sanity checks in id lookup")
| Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
| Cc: Sean Nyekjaer <sean@geanix.com>
| Cc: <stable@vger.kernel.org>
| Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* squashfs: fix inode lookup sanity checksSean Nyekjaer2021-03-252-2/+7
| Mounting a squashfs image created without inode compression fails
| with: "unable to read inode lookup table"
|
| It turns out that the BLOCK_OFFSET is missing when checking the
| SQUASHFS_METADATA_SIZE against the actual size.
|
| Link: https://lkml.kernel.org/r/20210226092903.1473545-1-sean@geanix.com
| Fixes: eabac19e40c0 ("squashfs: add more sanity checks in inode lookup")
| Signed-off-by: Sean Nyekjaer <sean@geanix.com>
| Acked-by: Phillip Lougher <phillip@squashfs.org.uk>
| Cc: <stable@vger.kernel.org>
| Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
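A rough sketch of the corrected bound, assuming an 8192-byte metadata block and the two-byte length count named in the message. The function name `index_gap_ok` and its arguments are hypothetical and exist only for this illustration; the kernel performs its checks inside the lookup-table readers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define METADATA_SIZE 8192  /* assumed maximum metadata payload in bytes */
#define BLOCK_OFFSET  2     /* the two byte length count */

/*
 * Hypothetical helper: returns true when the on-disk distance between the
 * start of one metadata block and the start of the next is plausible.
 * An uncompressed block occupies its full payload plus the length count,
 * so the bound must be METADATA_SIZE + BLOCK_OFFSET, not METADATA_SIZE.
 */
bool index_gap_ok(uint64_t start, uint64_t next)
{
	if (start >= next)
		return false;
	return next - start <= METADATA_SIZE + BLOCK_OFFSET;
}
```

With the bound at `METADATA_SIZE` alone, a gap of 8194 bytes (a full uncompressed block) would be rejected, which matches the mount failure on images created without inode compression.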
* block: rename BIO_MAX_PAGES to BIO_MAX_VECSChristoph Hellwig2021-03-111-1/+1
| | | | | | | | | | | | Ever since the addition of multipage bio_vecs BIO_MAX_PAGES has been horribly confusingly misnamed. Rename it to BIO_MAX_VECS to stop confusing users of the bio API. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20210311110137.1132391-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
* squashfs: add more sanity checks in xattr id lookupPhillip Lougher2021-02-101-9/+57
| Syzbot has reported a warning where a kmalloc() attempt exceeds the
| maximum limit. This has been identified as corruption of the xattr_ids
| count when reading the xattr id lookup table.
|
| This patch adds a number of additional sanity checks to detect this
| corruption and others.
|
| 1. It checks for a corrupted xattr index read from the inode. This could
|    be because the metadata block is uncompressed, or because the
|    "compression" bit has been corrupted (turning a compressed block
|    into an uncompressed block). This would cause an out of bounds read.
|
| 2. It checks against corruption of the xattr_ids count. This can either
|    lead to the above kmalloc failure, or a smaller than expected
|    table to be read.
|
| 3. It checks the contents of the index table for corruption.
|
| [phillip@squashfs.org.uk: fix checkpatch issue]
|   Link: https://lkml.kernel.org/r/270245655.754655.1612770082682@webmail.123-reg.co.uk
|
| Link: https://lkml.kernel.org/r/20210204130249.4495-5-phillip@squashfs.org.uk
| Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
| Reported-by: syzbot+2ccea6339d368360800d@syzkaller.appspotmail.com
| Cc: <stable@vger.kernel.org>
| Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* squashfs: add more sanity checks in inode lookupPhillip Lougher2021-02-101-8/+33
| Syzbot has reported a "slab-out-of-bounds read" error which has been
| identified as being caused by a corrupted "ino_num" value read from the
| inode. This could be because the metadata block is uncompressed, or
| because the "compression" bit has been corrupted (turning a compressed
| block into an uncompressed block).
|
| This patch adds additional sanity checks to detect this, and the
| following corruption.
|
| 1. It checks against corruption of the inodes count. This can either
|    lead to a larger table to be read, or a smaller than expected
|    table to be read.
|
|    In the case of a too large inodes count, this would often have been
|    trapped by the existing sanity checks, but this patch introduces
|    a more exact check, which can identify too small values.
|
| 2. It checks the contents of the index table for corruption.
|
| [phillip@squashfs.org.uk: fix checkpatch issue]
|   Link: https://lkml.kernel.org/r/527909353.754618.1612769948607@webmail.123-reg.co.uk
|
| Link: https://lkml.kernel.org/r/20210204130249.4495-4-phillip@squashfs.org.uk
| Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
| Reported-by: syzbot+04419e3ff19d2970ea28@syzkaller.appspotmail.com
| Cc: <stable@vger.kernel.org>
| Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* squashfs: add more sanity checks in id lookupPhillip Lougher2021-02-104-12/+45
| Syzbot has reported a number of "slab-out-of-bounds reads" and
| "use-after-free read" errors which have been identified as being caused
| by a corrupted index value read from the inode. This could be because
| the metadata block is uncompressed, or because the "compression" bit has
| been corrupted (turning a compressed block into an uncompressed block).
|
| This patch adds additional sanity checks to detect this, and the
| following corruption.
|
| 1. It checks against corruption of the ids count. This can either
|    lead to a larger table to be read, or a smaller than expected
|    table to be read.
|
|    In the case of a too large ids count, this would often have been
|    trapped by the existing sanity checks, but this patch introduces
|    a more exact check, which can identify too small values.
|
| 2. It checks the contents of the index table for corruption.
|
| Link: https://lkml.kernel.org/r/20210204130249.4495-3-phillip@squashfs.org.uk
| Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
| Reported-by: syzbot+b06d57ba83f604522af2@syzkaller.appspotmail.com
| Reported-by: syzbot+c021ba012da41ee9807c@syzkaller.appspotmail.com
| Reported-by: syzbot+5024636e8b5fd19f0f19@syzkaller.appspotmail.com
| Reported-by: syzbot+bcbc661df46657d0fa4f@syzkaller.appspotmail.com
| Cc: <stable@vger.kernel.org>
| Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* squashfs: avoid out of bounds writes in decompressorsPhillip Lougher2021-02-101-1/+7
| Patch series "Squashfs: fix BIO migration regression and add sanity checks".
|
| Patch [1/4] fixes a regression introduced by the "migrate from
| ll_rw_block usage to BIO" patch, which has produced a number of
| Syzbot/Syzkaller reports.
|
| Patches [2/4], [3/4], and [4/4] fix a number of filesystem corruption
| issues which have produced Syzbot reports in the id, inode and xattr
| lookup code.
|
| Each patch has been tested against the Syzbot reproducers using the
| given kernel configuration. They have the appropriate "Reported-by:"
| lines added.
|
| Additionally, all of the reproducer filesystems are indirectly fixed by
| patch [4/4] due to the fact they all have xattr corruption which is now
| detected there.
|
| Additional testing with other configurations and architectures (32bit,
| big endian), and normal filesystems has also been done to trap any
| inadvertent regressions caused by the additional sanity checks.
|
| This patch (of 4):
|
| This is a regression introduced by the patch "migrate from ll_rw_block
| usage to BIO".
|
| Syzbot/Syzkaller has reported a number of "out of bounds writes" and
| "unable to handle kernel paging request in squashfs_decompress" errors
| which have been identified as a regression introduced by the above
| patch.
|
| Specifically, the patch removed the following sanity check
|
|     if (length < 0 || length > output->length ||
|                     (index + length) > msblk->bytes_used)
|
| This check did two things:
|
| 1. It ensured any reads were not beyond the end of the filesystem
|
| 2. It ensured that the "length" field read from the filesystem
|    was within the expected maximum length. Without this any
|    corrupted values can over-run allocated buffers.
|
| Link: https://lkml.kernel.org/r/20210204130249.4495-1-phillip@squashfs.org.uk
| Link: https://lkml.kernel.org/r/20210204130249.4495-2-phillip@squashfs.org.uk
| Fixes: 93e72b3c612adc ("squashfs: migrate from ll_rw_block usage to BIO")
| Reported-by: syzbot+6fba78f99b9afd4b5634@syzkaller.appspotmail.com
| Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
| Cc: Philippe Liard <pliard@google.com>
| Cc: <stable@vger.kernel.org>
| Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
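The two properties of the sanity check quoted above can be shown in a small user-space sketch. The function `read_bounds_ok` and its parameter names are hypothetical stand-ins for `output->length` and `msblk->bytes_used`; this is not the kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper mirroring the two checks described in the message:
 *  - length must fit the destination buffer (out_capacity), so a corrupted
 *    length field cannot over-run allocated buffers;
 *  - the range [index, index + length) must end inside the filesystem
 *    (bytes_used), so reads never go beyond the end of the image.
 */
bool read_bounds_ok(int length, int out_capacity,
		    uint64_t index, uint64_t bytes_used)
{
	if (length < 0 || length > out_capacity)
		return false;
	if (index > bytes_used)
		return false;
	/* Written as a subtraction so that index + length cannot wrap. */
	return (uint64_t)length <= bytes_used - index;
}
```

The second comparison is rearranged relative to the quoted `(index + length) > msblk->bytes_used` only to keep the sketch safe for arbitrary 64-bit inputs.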
* Merge branch 'work.misc' of ↵Linus Torvalds2020-10-241-2/+1
|\
| | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
| |
| | Pull misc vfs updates from Al Viro:
| |  "Assorted stuff all over the place (the largest group here is
| |   Christoph's stat cleanups)"
| |
| | * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
| |   fs: remove KSTAT_QUERY_FLAGS
| |   fs: remove vfs_stat_set_lookup_flags
| |   fs: move vfs_fstatat out of line
| |   fs: implement vfs_stat and vfs_lstat in terms of vfs_fstatat
| |   fs: remove vfs_statx_fd
| |   fs: omfs: use kmemdup() rather than kmalloc+memcpy
| |   [PATCH] reduce boilerplate in fsid handling
| |   fs: Remove duplicated flag O_NDELAY occurring twice in VALID_OPEN_FLAGS
| |   selftests: mount: add nosymfollow tests
| |   Add a "nosymfollow" mount option.
| * [PATCH] reduce boilerplate in fsid handlingAl Viro2020-09-181-2/+1
| | | | | | | | | | | | | | Get rid of boilerplate in most of ->statfs() instances... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | squashfs: avoid bio_alloc() failure with 1Mbyte blocksPhillip Lougher2020-08-211-1/+5
|/
| This is a regression introduced by the patch "migrate from ll_rw_block
| usage to BIO".
|
| bio_alloc() is limited to 256 pages (1 Mbyte). This can cause a failure
| when reading 1 Mbyte block filesystems. The problem is that a datablock
| can be fully (or almost fully) uncompressed, requiring 256 pages, but,
| because blocks are not aligned to page boundaries, it may require 257
| pages to read.
|
| bio_kmalloc() can handle 1024 pages, so use it for the edge condition.
|
| Fixes: 93e72b3c612a ("squashfs: migrate from ll_rw_block usage to BIO")
| Reported-by: Nicolas Prochazka <nicolas.prochazka@gmail.com>
| Reported-by: Tomoatsu Shimada <shimada@walbrix.com>
| Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
| Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| Reviewed-by: Guenter Roeck <groeck@chromium.org>
| Cc: Philippe Liard <pliard@google.com>
| Cc: Christoph Hellwig <hch@lst.de>
| Cc: Adrien Schildknecht <adrien+dev@schischi.me>
| Cc: Daniel Rosenberg <drosen@google.com>
| Cc: <stable@vger.kernel.org>
| Link: http://lkml.kernel.org/r/20200815035637.15319-1-phillip@squashfs.org.uk
| Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
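The 256 versus 257 page arithmetic in the message above can be checked with a short sketch. It assumes 4096-byte pages (implied by "256 pages (1 Mbyte)"); the helper name `pages_spanned` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of pages touched by a read of `length`
 * bytes starting at byte `offset`, for a given page size. An unaligned
 * start makes the read spill into one extra page.
 */
uint64_t pages_spanned(uint64_t offset, uint64_t length, uint64_t page_size)
{
	assert(page_size > 0 && length > 0);

	uint64_t first = offset / page_size;
	uint64_t last = (offset + length - 1) / page_size;

	return last - first + 1;
}
```

A 1 Mbyte read that starts on a page boundary spans `pages_spanned(0, 1048576, 4096)` = 256 pages, while the same read starting one byte later spans 257, which exceeds the 256-page limit the message attributes to bio_alloc().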
* squashfs: fix length field overlap check in metadata readingPhillip Lougher2020-07-241-1/+1
| | | | | | | | | | | | | | | | | | | | This is a regression introduced by the "migrate from ll_rw_block usage to BIO" patch. Squashfs packs structures on byte boundaries, and due to that the length field (of the metadata block) may not be fully in the current block. The new code rewrote and introduced a faulty check for that edge case. Fixes: 93e72b3c612adcaca1 ("squashfs: migrate from ll_rw_block usage to BIO") Reported-by: Bernd Amend <bernd.amend@gmail.com> Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Adrien Schildknecht <adrien+dev@schischi.me> Cc: Guenter Roeck <groeck@chromium.org> Cc: Daniel Rosenberg <drosen@google.com> Link: http://lkml.kernel.org/r/20200717195536.16069-1-phillip@squashfs.org.uk Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
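The edge case in the message above, a two-byte length field that is not fully inside the current device block, can be sketched as follows. Both helpers are hypothetical illustrations and not the kernel's check; the little-endian byte order of the length field is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: true when a 2-byte field starting at byte `offset`
 * begins on the last byte of a device block and therefore continues in
 * the next block.
 */
bool length_field_straddles(uint64_t offset, uint32_t devblksize)
{
	assert(devblksize >= 2);
	return offset % devblksize == devblksize - 1;
}

/*
 * Hypothetical helper: read a little-endian 16-bit value whose first byte
 * is at cur[0]. `avail` is the number of bytes left in the current block;
 * when only one byte is left, the high byte comes from next[0].
 */
uint16_t read_le16_split(const uint8_t *cur, size_t avail, const uint8_t *next)
{
	assert(avail >= 1);

	uint8_t lo = cur[0];
	uint8_t hi = (avail >= 2) ? cur[1] : next[0];

	return (uint16_t)(lo | ((uint16_t)hi << 8));
}
```

For a 4096-byte device block, a length field at offset 4095 straddles two blocks while one at offset 4094 does not, so a check for this case has to test for exactly one remaining byte.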
* Squashfs: Replace zero-length array with flexible-arrayGustavo A. R. Silva2020-06-161-8/+8
| | | | | | | | | | | | There is a regular need in the kernel to provide a way to declare having a dynamically sized set of trailing elements in a structure. Kernel code should always use “flexible array members”[1] for these cases. The older style of one-element or zero-length arrays should no longer be used[2]. [1] https://en.wikipedia.org/wiki/Flexible_array_member [2] https://github.com/KSPP/linux/issues/21 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
* Merge branch 'akpm' (patches from Andrew)Linus Torvalds2020-06-0211-239/+281
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge updates from Andrew Morton: "A few little subsystems and a start of a lot of MM patches. Subsystems affected by this patch series: squashfs, ocfs2, parisc, vfs. With mm subsystems: slab-generic, slub, debug, pagecache, gup, swap, memcg, pagemap, memory-failure, vmalloc, kasan" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (128 commits) kasan: move kasan_report() into report.c mm/mm_init.c: report kasan-tag information stored in page->flags ubsan: entirely disable alignment checks under UBSAN_TRAP kasan: fix clang compilation warning due to stack protector x86/mm: remove vmalloc faulting mm: remove vmalloc_sync_(un)mappings() x86/mm/32: implement arch_sync_kernel_mappings() x86/mm/64: implement arch_sync_kernel_mappings() mm/ioremap: track which page-table levels were modified mm/vmalloc: track which page-table levels were modified mm: add functions to track page directory modifications s390: use __vmalloc_node in stack_alloc powerpc: use __vmalloc_node in alloc_vm_stack arm64: use __vmalloc_node in arch_alloc_vmap_stack mm: remove vmalloc_user_node_flags mm: switch the test_vmalloc module to use __vmalloc_node mm: remove __vmalloc_node_flags_caller mm: remove both instances of __vmalloc_node_flags mm: remove the prot argument to __vmalloc_node mm: remove the pgprot argument to __vmalloc ...
| * squashfs: migrate from ll_rw_block usage to BIOPhilippe Liard2020-06-0211-242/+287
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ll_rw_block() function has been deprecated in favor of BIO which appears to come with large performance improvements. This patch decreases boot time by close to 40% when using squashfs for the root file-system. This is observed at least in the context of starting an Android VM on Chrome OS using crosvm. The patch was tested on 4.19 as well as master. This patch is largely based on Adrien Schildknecht's patch that was originally sent as https://lkml.org/lkml/2017/9/22/814 though with some significant changes and simplifications while also taking Phillip Lougher's feedback into account, around preserving support for FILE_CACHE in particular. [akpm@linux-foundation.org: fix build error reported by Randy] Link: http://lkml.kernel.org/r/319997c2-5fc8-f889-2ea3-d913308a7c1f@infradead.org Signed-off-by: Philippe Liard <pliard@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Adrien Schildknecht <adrien+dev@schischi.me> Cc: Phillip Lougher <phillip@squashfs.org.uk> Cc: Guenter Roeck <groeck@chromium.org> Cc: Daniel Rosenberg <drosen@google.com> Link: https://chromium.googlesource.com/chromiumos/platform/crosvm Link: http://lkml.kernel.org/r/20191106074238.186023-1-pliard@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | squashfs: Make use of local lock in multi_cpu decompressorJulia Cartwright2020-05-281-7/+14
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | The squashfs multi CPU decompressor makes use of get_cpu_ptr() to acquire a pointer to per-CPU data. get_cpu_ptr() implicitly disables preemption which serializes the access to the per-CPU data. But decompression can take quite some time depending on the size. The observed preempt disabled times in real world scenarios went up to 8ms, causing massive wakeup latencies. This happens on all CPUs as the decompression is fully parallelized. Replace the implicit preemption control with an explicit local lock. This allows RT kernels to substitute it with a real per CPU lock, which serializes the access but keeps the code section preemptible. On non RT kernels this maps to preempt_disable() as before, i.e. no functional change. [ bigeasy: Use local_lock(), patch description] Reported-by: Alexander Stein <alexander.stein@systec-electronic.com> Signed-off-by: Julia Cartwright <julia@ni.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Alexander Stein <alexander.stein@systec-electronic.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20200527201119.1692513-5-bigeasy@linutronix.de