| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
| |
The pattern prefixed with '/' matches files in the same directory,
but not ones in sub-directories.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Miguel Ojeda <ojeda@kernel.org>
Acked-by: Rob Herring <robh@kernel.org>
Acked-by: Andra Paraschiv <andraprs@amazon.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Gabriel Krisman Bertazi <krisman@collabora.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
LANG gives a weak default to each LC_* in case it is not explicitly
defined. LC_ALL, if set, overrides all other LC_* variables.
LANG < LC_CTYPE, LC_COLLATE, LC_MONETARY, LC_NUMERIC, ... < LC_ALL
This is why documentation such as [1] suggests to set LC_ALL in build
scripts to get the deterministic result.
LANG=C is not strong enough to override LC_* that may be set by end
users.
[1]: https://reproducible-builds.org/docs/locales/
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Reviewed-by: Matthias Maennich <maennich@google.com>
Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net> (mptcp)
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, -Wunused-but-set-variable is only supported by GCC so it is
disabled unconditionally in a GCC only block (it is enabled with W=1).
clang currently has its implementation for this warning in review so
preemptively move this statement out of the GCC only block and wrap it
with cc-disable-warning so that both compilers function the same.
Cc: stable@vger.kernel.org
Link: https://reviews.llvm.org/D100581
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We maintain .gitignore and Makefiles so build artifacts are properly
ignored by Git, and cleaned up by 'make clean'. However, the code is
always changing; generated files are often moved to another directory,
or removed when they become unnecessary. Such garbage files tend to be
left over in the source tree because people usually git-pull without
cleaning the tree.
This is not only the noise for 'git status', but also a build issue
in some cases.
One solution is to remove a stale file like commit 223c24a7dba9 ("kbuild:
Automatically remove stale <linux/version.h> file") did. Such workaround
should be removed after a while, but we forget about that if we scatter
the workaround code in random places.
So, this commit adds a new script to collect cleanings of stale files.
As a start point, move the code in arch/arm/boot/compressed/Makefile
into this script.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the timestamp of the .config file is updated, config_data.gz is
regenerated, then vmlinux is re-linked. This occurs even if the content
of the .config has not changed at all.
This issue was mitigated by commit 67424f61f813 ("kconfig: do not write
.config if the content is the same"); Kconfig does not update the
.config when it ends up with the identical configuration.
The issue is remaining when the .config is created by *_defconfig with
some config fragment(s) applied on top.
This is typical for powerpc and mips, where several *_defconfig targets
are constructed by using merge_config.sh.
One workaround is to have the copy of the .config. The filechk rule
updates the copy, kernel/config_data, by checking the content instead
of the timestamp.
With this commit, the second run with the same configuration avoids
the needless rebuilds.
$ make ARCH=mips defconfig all
[ snip ]
$ make ARCH=mips defconfig all
*** Default configuration is based on target '32r2el_defconfig'
Using ./arch/mips/configs/generic_defconfig as base
Merging arch/mips/configs/generic/32r2.config
Merging arch/mips/configs/generic/el.config
Merging ./arch/mips/configs/generic/board-boston.config
Merging ./arch/mips/configs/generic/board-ni169445.config
Merging ./arch/mips/configs/generic/board-ocelot.config
Merging ./arch/mips/configs/generic/board-ranchu.config
Merging ./arch/mips/configs/generic/board-sead-3.config
Merging ./arch/mips/configs/generic/board-xilfpga.config
#
# configuration written to .config
#
SYNC include/config/auto.conf
CALL scripts/checksyscalls.sh
CALL scripts/atomic/check-atomics.sh
CHK include/generated/compile.h
CHK include/generated/autoksyms.h
Reported-by: Elliot Berman <eberman@codeaurora.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
modules.builtin used to be created in every directory.
Since commit 8b41fc4454e3 ("kbuild: create modules.builtin without
Makefile.modbuiltin or tristate.conf"), modules.builtin is created only
in the top directory.
Add the '/' prefix so that it matches to only the modules.builtin located
in the top directory.
It has been more than one year since that change. I hope this will not
flood 'Untracked files' of 'git status'.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
|
|
|
|
|
|
|
|
| |
For consistency, move tags and TAGS close to the cscope and GNU Global
patterns.
I removed the '/' prefix in case somebody wants to manually create tag
files in sub-directories.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
timeconst.h and hz.bc used to exist in kernel/.
Commit 5cee96459726 ("time/timers: Move all time(r) related files into
kernel/time") moved them to kernel/time/.
Commit 0a227985d4a9 ("time: Move timeconst.h into include/generated")
moved timeconst.h to include/generated/ and removed hz.bc .
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
| |
The current .gitignore intends to ignore everything under usr/include/
except .gitignore and Makefile.
A cleaner solution is to use a pattern suffixed with '/', which matches
only directories. It works well here because all the exported headers
are located in sub-directories, like <linux/*.h>, <asm/*.h>.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
|
|
|
|
|
|
|
|
| |
(shipped source) is a stale comment.
Since commit 833e62245943 ("genksyms: generate lexer and parser during
build instead of shipping"), there is no source file to be shipped in
this directory.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
cmd_shipped uses 'cat' instead of 'cp' for copying a file. The reason
is explained in the commit [1], but it was in the pre-git era.
$ touch a
$ chmod -w a
$ cp a b
$ cp a b
cp: cannot create regular file 'b': Permission denied
Add comments so that you can see the reason without looking into
the history.
[1]: https://git.kernel.org/pub/scm/linux/kernel/git/history/history.git/commit/?id=a70dba8086160449cc94c5bdaff78419b6b8e3c8
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
|
|
|
|
|
|
| |
Many architectures duplicate similar shell scripts.
This commit converts sparc to use scripts/syscallhdr.sh.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
|
|
|
|
|
|
|
| |
Many architectures duplicate similar shell scripts.
This commit converts sparc to use scripts/syscalltbl.sh. This also
unifies syscall_table_64.h and syscall_table_c32.h.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
|
|
|
|
|
|
| |
Many architectures duplicate similar shell scripts.
This commit converts sh to use scripts/syscallhdr.sh.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
|
|
|
|
|
|
| |
Many architectures duplicate similar shell scripts.
This commit converts sh to use scripts/syscalltbl.sh.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 updates from Ted Ts'o:
"New features for ext4 this cycle include support for encrypted
casefold, ensure that deleted file names are cleared in directory
blocks by zeroing directory entries when they are unlinked or moved as
part of a hash tree node split. We also improve the block allocator's
performance on a freshly mounted file system by prefetching block
bitmaps.
There are also the usual cleanups and bug fixes, including fixing a
page cache invalidation race when there is mixed buffered and direct
I/O and the block size is less than page size, and allow the dax flag
to be set and cleared on inline directories"
* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (32 commits)
ext4: wipe ext4_dir_entry2 upon file deletion
ext4: Fix occasional generic/418 failure
fs: fix reporting supported extra file attributes for statx()
ext4: allow the dax flag to be set and cleared on inline directories
ext4: fix debug format string warning
ext4: fix trailing whitespace
ext4: fix various seppling typos
ext4: fix error return code in ext4_fc_perform_commit()
ext4: annotate data race in jbd2_journal_dirty_metadata()
ext4: annotate data race in start_this_handle()
ext4: fix ext4_error_err save negative errno into superblock
ext4: fix error code in ext4_commit_super
ext4: always panic when errors=panic is specified
ext4: delete redundant uptodate check for buffer
ext4: do not set SB_ACTIVE in ext4_orphan_cleanup()
ext4: make prefetch_block_bitmaps default
ext4: add proc files to monitor new structures
ext4: improve cr 0 / cr 1 group scanning
ext4: add MB_NUM_ORDERS macro
ext4: add mballoc stats proc file
...
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Upon file deletion, zero out all fields in ext4_dir_entry2 besides rec_len.
In case sensitive data is stored in filenames, this ensures no potentially
sensitive data is left in the directory entry upon deletion. Also, wipe
these fields upon moving a directory entry during the conversion to an
htree and when splitting htree nodes.
The data wiped may still exist in the journal, but there are future
commits planned to address this.
Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>
Link: https://lore.kernel.org/r/20210422180834.2242353-1-leah.rumancik@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Eric has noticed that after pagecache read rework, generic/418 is
occasionally failing for ext4 when blocksize < pagesize. In fact, the
pagecache rework just made hard to hit race in ext4 more likely. The
problem is that since ext4 conversion of direct IO writes to iomap
framework (commit 378f32bab371), we update inode size after direct IO
write only after invalidating page cache. Thus if buffered read sneaks
at unfortunate moment like:
CPU1 - write at offset 1k CPU2 - read from offset 0
iomap_dio_rw(..., IOMAP_DIO_FORCE_WAIT);
ext4_readpage();
ext4_handle_inode_extension()
the read will zero out tail of the page as it still sees smaller inode
size and thus page cache becomes inconsistent with on-disk contents with
all the consequences.
Fix the problem by moving inode size update into end_io handler which
gets called before the page cache is invalidated.
Reported-and-tested-by: Eric Whitney <enwlinux@gmail.com>
Fixes: 378f32bab371 ("ext4: introduce direct I/O write using iomap infrastructure")
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Acked-by: Dave Chinner <dchinner@redhat.com>
Link: https://lore.kernel.org/r/20210415155417.4734-1-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
statx(2) notes that any attribute that is not indicated as supported
by stx_attributes_mask has no usable value. Commits 801e523796004
("fs: move generic stat response attr handling to vfs_getattr_nosec")
and 712b2698e4c02 ("fs/stat: Define DAX statx attribute") sets
STATX_ATTR_AUTOMOUNT and STATX_ATTR_DAX, respectively, without setting
stx_attributes_mask, which can cause xfstests generic/532 to fail.
Fix this in the same way as commit 1b9598c8fb99 ("xfs: fix reporting
supported extra file attributes for statx()")
Fixes: 801e523796004 ("fs: move generic stat response attr handling to vfs_getattr_nosec")
Fixes: 712b2698e4c02 ("fs/stat: Define DAX statx attribute")
Cc: stable@kernel.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This is needed to allow generic/607 to pass for file systems with the
inline data_feature enabled, and it allows the use of file systems
where the directories use inline_data, while the files are accessed
via DAX.
Cc: stable@kernel.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Using no_printk() for jbd_debug() revealed two warnings:
fs/jbd2/recovery.c: In function 'fc_do_one_pass':
fs/jbd2/recovery.c:256:30: error: format '%d' expects a matching 'int' argument [-Werror=format=]
256 | jbd_debug(3, "Processing fast commit blk with seq %d");
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
fs/ext4/fast_commit.c: In function 'ext4_fc_replay_add_range':
fs/ext4/fast_commit.c:1732:30: error: format '%d' expects argument of type 'int', but argument 2 has type 'long unsigned int' [-Werror=format=]
1732 | jbd_debug(1, "Converting from %d to %d %lld",
The first one was added incorrectly, and was also missing a few newlines
in debug output, and the second one happened when the type of an
argument changed.
Reported-by: kernel test robot <lkp@intel.com>
Fixes: d556435156b7 ("jbd2: avoid -Wempty-body warnings")
Fixes: 6db074618969 ("ext4: use BIT() macro for BH_** state bits")
Fixes: 5b849b5f96b4 ("jbd2: fast commit recovery path")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20210409201211.1866633-1-arnd@kernel.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Made suggested modifications from checkpatch in reference to ERROR:
trailing whitespace
Signed-off-by: Jack Qiu <jack.qiu@huawei.com>
Reviewed-by: Ritesh Harjani <riteshh@linux.ibm.com>
Link: https://lore.kernel.org/r/20210409042035.15516-1-jack.qiu@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
| |
| |
| |
| |
| |
| | |
Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Link: https://lore.kernel.org/r/cover.1616840203.git.unixbhaskar@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
In case of if not ext4_fc_add_tlv branch, an error return code is missing.
Cc: stable@kernel.org
Fixes: aa75f4d3daae ("ext4: main fast-commit commit path")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Xu Yihang <xuyihang@huawei.com>
Reviewed-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Link: https://lore.kernel.org/r/20210408070033.123047-1-xuyihang@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Assertion checks in jbd2_journal_dirty_metadata() are known to be racy
but we don't want to be grabbing locks just for them. We thus recheck
them under b_state_lock only if it looks like they would fail. Annotate
the checks with data_race().
Cc: stable@kernel.org
Reported-by: Hao Sun <sunhao.th@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20210406161804.20150-2-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Access to journal->j_running_transaction is not protected by appropriate
lock and thus is racy. We are well aware of that and the code handles
the race properly. Just add a comment and data_race() annotation.
Cc: stable@kernel.org
Reported-by: syzbot+30774a6acf6a2cf6d535@syzkaller.appspotmail.com
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20210406161804.20150-1-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Fix As write_mmp_block() so that it returns -EIO instead of 1, so that
the correct error gets saved into the superblock.
Cc: stable@kernel.org
Fixes: 54d3adbc29f0 ("ext4: save all error info in save_error_info() and drop ext4_set_errno()")
Reported-by: Liu Zhi Qiang <liuzhiqiang26@huawei.com>
Signed-off-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Link: https://lore.kernel.org/r/20210406025331.148343-1-yebin10@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
We should set the error code when ext4_commit_super check argument failed.
Found in code review.
Fixes: c4be0c1dc4cdc ("filesystem freeze: add error handling of write_super_lockfs/unlockfs").
Cc: stable@kernel.org
Signed-off-by: Fengnan Chang <changfengnan@vivo.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Link: https://lore.kernel.org/r/20210402101631.561-1-changfengnan@vivo.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Before commit 014c9caa29d3 ("ext4: make ext4_abort() use
__ext4_error()"), the following series of commands would trigger a
panic:
1. mount /dev/sda -o ro,errors=panic test
2. mount /dev/sda -o remount,abort test
After commit 014c9caa29d3, remounting a file system using the test
mount option "abort" will no longer trigger a panic. This commit will
restore the behaviour immediately before commit 014c9caa29d3.
(However, note that the Linux kernel's behavior has not been
consistent; some previous kernel versions, including 5.4 and 4.19
similarly did not panic after using the mount option "abort".)
This also makes a change to long-standing behaviour; namely, the
following series commands will now cause a panic, when previously it
did not:
1. mount /dev/sda -o ro,errors=panic test
2. echo test > /sys/fs/ext4/sda/trigger_fs_error
However, this makes ext4's behaviour much more consistent, so this is
a good thing.
Cc: stable@kernel.org
Fixes: 014c9caa29d3 ("ext4: make ext4_abort() use __ext4_error()")
Signed-off-by: Ye Bin <yebin10@huawei.com>
Link: https://lore.kernel.org/r/20210401081903.3421208-1-yebin10@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The buffer uptodate state has been checked in function set_buffer_uptodate,
there is no need use buffer_uptodate before calling set_buffer_uptodate and
delete it.
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Signed-off-by: Yang Guo <guoyang2@huawei.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Reviewed-by: Ritesh Harjani <ritesh.list@gmail.com>
Link: https://lore.kernel.org/r/1617260610-29770-1-git-send-email-zhangshaokun@hisilicon.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
When CONFIG_QUOTA is enabled, if we failed to mount the filesystem due
to some error happens behind ext4_orphan_cleanup(), it will end up
triggering a after free issue of super_block. The problem is that
ext4_orphan_cleanup() will set SB_ACTIVE flag if CONFIG_QUOTA is
enabled, after we cleanup the truncated inodes, the last iput() will put
them into the lru list, and these inodes' pages may probably dirty and
will be write back by the writeback thread, so it could be raced by
freeing super_block in the error path of mount_bdev().
After check the setting of SB_ACTIVE flag in ext4_orphan_cleanup(), it
was used to ensure updating the quota file properly, but evict inode and
trash data immediately in the last iput does not affect the quotafile,
so setting the SB_ACTIVE flag seems not required[1]. Fix this issue by
just remove the SB_ACTIVE setting.
[1] https://lore.kernel.org/linux-ext4/99cce8ca-e4a0-7301-840f-2ace67c551f3@huawei.com/T/#m04990cfbc4f44592421736b504afcc346b2a7c00
Cc: stable@kernel.org
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Tested-by: Jan Kara <jack@suse.cz>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20210331033138.918975-1-yi.zhang@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Block bitmap prefetching is needed for these allocator optimization
data structures to get populated and provide better group scanning
order. So, turn it on bu default. prefetch_block_bitmaps mount option
is now marked as removed and a new option no_prefetch_block_bitmaps is
added to disable block bitmap prefetching.
Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Link: https://lore.kernel.org/r/20210401172129.189766-8-harshadshirwadkar@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This patch adds a new file "mb_structs_summary" which allows us to see
the summary of the new allocator structures added in this
series. Here's the sample output of file:
optimize_scan: 1
max_free_order_lists:
list_order_0_groups: 0
list_order_1_groups: 0
list_order_2_groups: 0
list_order_3_groups: 0
list_order_4_groups: 0
list_order_5_groups: 0
list_order_6_groups: 0
list_order_7_groups: 0
list_order_8_groups: 0
list_order_9_groups: 0
list_order_10_groups: 0
list_order_11_groups: 0
list_order_12_groups: 0
list_order_13_groups: 40
fragment_size_tree:
tree_min: 16384
tree_max: 32768
tree_nodes: 40
Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Reviewed-by: Ritesh Harjani <ritesh.list@gmail.com>
Link: https://lore.kernel.org/r/20210401172129.189766-7-harshadshirwadkar@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Instead of traversing through groups linearly, scan groups in specific
orders at cr 0 and cr 1. At cr 0, we want to find groups that have the
largest free order >= the order of the request. So, with this patch,
we maintain lists for each possible order and insert each group into a
list based on the largest free order in its buddy bitmap. During cr 0
allocation, we traverse these lists in the increasing order of largest
free orders. This allows us to find a group with the best available cr
0 match in constant time. If nothing can be found, we fallback to cr 1
immediately.
At CR1, the story is slightly different. We want to traverse in the
order of increasing average fragment size. For CR1, we maintain a rb
tree of groupinfos which is sorted by average fragment size. Instead
of traversing linearly, at CR1, we traverse in the order of increasing
average fragment size, starting at the most optimal group. This brings
down cr 1 search complexity to log(num groups).
For cr >= 2, we just perform the linear search as before. Also, in
case of lock contention, we intermittently fallback to linear search
even in CR 0 and CR 1 cases. This allows us to proceed during the
allocation path even in case of high contention.
There is an opportunity to do optimization at CR2 too. That's because
at CR2 we only consider groups where bb_free counter (number of free
blocks) is greater than the request extent size. That's left as future
work.
All the changes introduced in this patch are protected under a new
mount option "mb_optimize_scan".
With this patchset, following experiment was performed:
Created a highly fragmented disk of size 65TB. The disk had no
contiguous 2M regions. Following command was run consecutively for 3
times:
time dd if=/dev/urandom of=file bs=2M count=10
Here are the results with and without cr 0/1 optimizations introduced
in this patch:
|---------+------------------------------+---------------------------|
| | Without CR 0/1 Optimizations | With CR 0/1 Optimizations |
|---------+------------------------------+---------------------------|
| 1st run | 5m1.871s | 2m47.642s |
| 2nd run | 2m28.390s | 0m0.611s |
| 3rd run | 2m26.530s | 0m1.255s |
|---------+------------------------------+---------------------------|
Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Link: https://lore.kernel.org/r/20210401172129.189766-6-harshadshirwadkar@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
A few arrays in mballoc.c use the total number of valid orders as
their size. Currently, this value is set as "sb->s_blocksize_bits +
2". This makes code harder to read. So, instead add a new macro
MB_NUM_ORDERS(sb) to make the code more readable.
Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Reviewed-by: Ritesh Harjani <ritesh.list@gmail.com>
Link: https://lore.kernel.org/r/20210401172129.189766-5-harshadshirwadkar@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Add new stats for measuring the performance of mballoc. This patch is
forked from Artem Blagodarenko's work that can be found here:
https://github.com/lustre/lustre-release/blob/master/ldiskfs/kernel_patches/patches/rhel8/ext4-simple-blockalloc.patch
This patch reorganizes the stats by cr level. This is how the output
looks like:
mballoc:
reqs: 0
success: 0
groups_scanned: 0
cr0_stats:
hits: 0
groups_considered: 0
useless_loops: 0
bad_suggestions: 0
cr1_stats:
hits: 0
groups_considered: 0
useless_loops: 0
bad_suggestions: 0
cr2_stats:
hits: 0
groups_considered: 0
useless_loops: 0
cr3_stats:
hits: 0
groups_considered: 0
useless_loops: 0
extents_scanned: 0
goal_hits: 0
2^n_hits: 0
breaks: 0
lost: 0
buddies_generated: 0/40
buddies_time_used: 0
preallocated: 0
discarded: 0
Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Reviewed-by: Ritesh Harjani <ritesh.list@gmail.com>
Link: https://lore.kernel.org/r/20210401172129.189766-4-harshadshirwadkar@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Before this patch, the function parse_options() was returning
journal_devnum and journal_ioprio variables to the caller. This patch
generalizes that interface to allow parse_options to return any parsed
options to return back to the caller. In this patch series, it gets
used to capture the value of "mb_optimize_scan=%u" mount option.
Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Reviewed-by: Ritesh Harjani <ritesh.list@gmail.com>
Link: https://lore.kernel.org/r/20210401172129.189766-3-harshadshirwadkar@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
s_mb_buddies_generated gets used later in this patch series to
determine if the cr 0 and cr 1 optimziations should be performed or
not. Currently, s_mb_buddies_generated is protected under a
spin_lock. In the allocation path, it is better if we don't depend on
the lock and instead read the value atomically. In order to do that,
we drop s_bal_lock altogether and we convert the only two protected
fields by it s_mb_buddies_generated and s_mb_generation_time to atomic
type.
Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Reviewed-by: Ritesh Harjani <ritesh.list@gmail.com>
Link: https://lore.kernel.org/r/20210401172129.189766-2-harshadshirwadkar@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Commit <50122847007> ("ext4: fix check to prevent initializing reserved
inodes") check the block group zero and prevent initializing reserved
inodes. But in some special cases, the reserved inode may not all belong
to the group zero, it may exist into the second group if we format
filesystem below.
mkfs.ext4 -b 4096 -g 8192 -N 1024 -I 4096 /dev/sda
So, it will end up triggering a false positive report of a corrupted
file system. This patch fix it by avoid check reserved inodes if no free
inode blocks will be zeroed.
Cc: stable@kernel.org
Fixes: 50122847007 ("ext4: fix check to prevent initializing reserved inodes")
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Suggested-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20210331121516.2243099-1-yi.zhang@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Building with 'make W=1' shows a harmless -Wempty-body warning:
fs/jbd2/recovery.c: In function 'fc_do_one_pass':
fs/jbd2/recovery.c:267:75: error: suggest braces around empty body in an 'if' statement [-Werror=empty-body]
267 | jbd_debug(3, "Fast commit replay failed, err = %d\n", err);
| ^
Change the empty dprintk() macros to no_printk(), which avoids this
warning and adds format string checking.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20210322102152.95684-1-arnd@kernel.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Matching names with casefolded encrypting directories requires
decrypting entries to confirm case since we are case preserving. We can
avoid needing to decrypt if our hash values don't match.
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Link: https://lore.kernel.org/r/20210319073414.1381041-3-drosen@google.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This adds support for encryption with casefolding.
Since the name on disk is case preserving, and also encrypted, we can no
longer just recompute the hash on the fly. Additionally, to avoid
leaking extra information from the hash of the unencrypted name, we use
siphash via an fscrypt v2 policy.
The hash is stored at the end of the directory entry for all entries
inside of an encrypted and casefolded directory apart from those that
deal with '.' and '..'. This way, the change is backwards compatible
with existing ext4 filesystems.
[ Changed to advertise this feature via the file:
/sys/fs/ext4/features/encrypted_casefold -- TYT ]
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Link: https://lore.kernel.org/r/20210319073414.1381041-2-drosen@google.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
| |
| |
| |
| |
| |
| |
| |
| | |
Removes braces to follow the coding style.
Signed-off-by: Milan Djurovic <mdjurovic@zohomail.com>
Link: https://lore.kernel.org/r/20210316052953.67616-1-mdjurovic@zohomail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
A number of tracepoint instances have been removed from ext4 by past
patches but the definitions of those tracepoints have not.
All instances of ext4_ext_in_cache and ext4_ext_put_in_cache were
removed by commit 69eb33dc24dc ("ext4: remove single extent cache").
ext4_get_reserved_cluster_alloc was removed by commit b6bf9171ef5c
("ext4: reduce reserved cluster count by number of allocated
clusters"). ext4_find_delalloc_range was removed by commit
7d1b1fbc95eb ("ext4: reimplement ext4_find_delay_alloc_range on extent
status tree").
All instances of ext4_direct_IO_enter and ext4_direct_IO_exit were
removed by commit 378f32bab371 ("ext4: introduce direct I/O write
using iomap infrastructure").
Signed-off-by: Eric Whitney <enwlinux@gmail.com>
Link: https://lore.kernel.org/r/20210216191634.20957-1-enwlinux@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Some members of transaction_t are allowed to be read without any lock
being held if accessed from the correct context. We used LockDoc's
findings to determine those members. Each member of them is marked
with a short comment: "no lock needed for jbd2 thread".
Signed-off-by: Alexander Lochmann <alexander.lochmann@tu-dortmund.de>
Signed-off-by: Horst Schirmeier <horst.schirmeier@tu-dortmund.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20210211171410.17984-1-alexander.lochmann@tu-dortmund.de
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Some members of transaction_t are allowed to be read without any lock
being held if consistency doesn't matter. Based on LockDoc's
findings, we extended the locking documentation of those members.
Each one of them is marked with a short comment: "no lock for quick
racy checks".
Signed-off-by: Alexander Lochmann <alexander.lochmann@tu-dortmund.de>
Signed-off-by: Horst Schirmeier <horst.schirmeier@tu-dortmund.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/ad82c7a9-a624-4ed5-5ada-a6410c44c0b3@tu-dortmund.de
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
| |
| |
| |
| |
| |
| | |
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210207190425.38107-7-chaitanya.kulkarni@wdc.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
| |
| |
| |
| |
| |
| | |
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210207190425.38107-6-chaitanya.kulkarni@wdc.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|\ \
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm
Pull dlm updates from David Teigland:
"This includes more dlm networking cleanups and improvements for making
dlm shutdowns more robust"
* tag 'dlm-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm:
fs: dlm: fix missing unlock on error in accept_from_sock()
fs: dlm: add shutdown hook
fs: dlm: flush swork on shutdown
fs: dlm: remove unaligned memory access handling
fs: dlm: check on minimum msglen size
fs: dlm: simplify writequeue handling
fs: dlm: use GFP_ZERO for page buffer
fs: dlm: change allocation limits
fs: dlm: add check if dlm is currently running
fs: dlm: add errno handling to check callback
fs: dlm: set subclass for othercon sock_mutex
fs: dlm: set connected bit after accept
fs: dlm: fix mark setting deadlock
fs: dlm: fix debugfs dump
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Add the missing unlock before return from accept_from_sock()
in the error handling case.
Fixes: 6cde210a9758 ("fs: dlm: add helper for init connection")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David Teigland <teigland@redhat.com>
|