| Commit message (Collapse) | Author | Age | Files | Lines |
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
* 'for-linus' of git://oss.sgi.com/xfs/xfs:
xfs: correctly decrement the extent buffer index in xfs_bmap_del_extent
xfs: check for valid indices in xfs_iext_get_ext and xfs_iext_idx_to_irec
xfs: fix up asserts in xfs_iflush_fork
xfs: do not do pointer arithmetic on extent records
xfs: do not use unchecked extent indices in xfs_bunmapi
xfs: do not use unchecked extent indices in xfs_bmapi
xfs: do not use unchecked extent indices in xfs_bmap_add_extent_*
xfs: remove if_lastex
xfs: remove the unused XFS_BMAPI_RSVBLOCKS flag
xfs: do not discard alloc btree blocks
xfs: add online discard support
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The code in xfs_bmap_del_extent does not correctly decrement the
extent buffer index when deleting a whole extent. Most of the time
this gets caught by checks in xfs_bmapi that work around it and
decrement it manually and thus wasn't noticed so far.
Based on an earlier patch from Lachlan McIlroy.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Lachlan McIlroy <lmcilroy@redhat.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
|
| |
| |
| |
| |
| |
| |
| |
| | |
Based on an earlier patch from Lachlan McIlroy.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Lachlan McIlroy <lmcilroy@redhat.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Remove asserts in xfs_iflush_fork that would call xfs_iext_get_ext
with a potentially invalid extent buffer index.
Based on an earlier patch from Lachlan McIlroy.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Lachlan McIlroy <lmcilroy@redhat.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
We need to call xfs_iext_get_ext for the previous extent to get a
valid pointer, and can't just do pointer arithmetics as they might
be in different pages.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Lachlan McIlroy <lmcilroy@redhat.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Make sure to only call xfs_iext_get_ext after we've validate the
extent index when moving on to the next index in xfs_bunmapi. Also
remove the old workaround for too large indices that has been
superceeded by the proper fix in xfs_bmap_del_extent.
Based on an earlier patch from Lachlan McIlroy.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Lachlan McIlroy <lmcilroy@redhat.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Make sure to only call xfs_iext_get_ext after we've validate the
extent index when moving on to the next index in xfs_bmapi.
Based on an earlier patch from Lachlan McIlroy.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Lachlan McIlroy <lmcilroy@redhat.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Make sure to only call xfs_iext_get_ext after we've validate the
extent index in the various xfs_bmap_add_extent_* helpers.
Based on an earlier patch from Lachlan McIlroy.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Lachlan McIlroy <lmcilroy@redhat.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The if_lastex field in struct xfs_ifork is only used as a temporary
index during xfs_bmapi and xfs_bunmapi. Instead of using the inode
fork to store it keep it local in the callchain. Fortunately this
is very easy as we already pass a stack copy of it down the whole
chain which can simplify be changed to be passed by reference.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The XFS_BMAPI_RSVBLOCKS is unused, and as far as I can see has
always been. Remove it to simplify the bmapi implementation and
conserve stack space.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Blocks for the allocation btree are allocated from and released to
the AGFL, and thus frequently reused. Even worse we do not have an
easy way to avoid using an AGFL block when it is discarded due to
the simple FILO list of free blocks, and thus can frequently stall
on blocks that are currently undergoing a discard.
Add a flag to the busy extent tracking structure to skip the discard
for allocation btree blocks. In normal operation these blocks are
reused frequently enough that there is no need to discard them
anyway, but if they spill over to the allocation btree as part of a
balance we "leak" blocks that we would otherwise discard. We could
fix this by adding another flag and keeping these block in the
rbtree even after they aren't busy any more so that we could discard
them when they migrate out of the AGFL. Given that this would cause
significant overhead I don't think it's worthwile for now.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Now that we have reliably tracking of deleted extents in a
transaction we can easily implement "online" discard support
which calls blkdev_issue_discard once a transaction commits.
The actual discard is a two stage operation as we first have
to mark the busy extent as not available for reuse before we
can start the actual discard. Note that we don't bother
supporting discard for the non-delaylog mode.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
|
|\ \
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (61 commits)
jbd2: Add MAINTAINERS entry
jbd2: fix a potential leak of a journal_head on an error path
ext4: teach ext4_ext_split to calculate extents efficiently
ext4: Convert ext4 to new truncate calling convention
ext4: do not normalize block requests from fallocate()
ext4: enable "punch hole" functionality
ext4: add "punch hole" flag to ext4_map_blocks()
ext4: punch out extents
ext4: add new function ext4_block_zero_page_range()
ext4: add flag to ext4_has_free_blocks
ext4: reserve inodes and feature code for 'quota' feature
ext4: add support for multiple mount protection
ext4: ensure f_bfree returned by ext4_statfs() is non-negative
ext4: protect bb_first_free in ext4_trim_all_free() with group lock
ext4: only load buddy bitmap in ext4_trim_fs() when it is needed
jbd2: Fix comment to match the code in jbd2__journal_start()
ext4: fix waiting and sending of a barrier in ext4_sync_file()
jbd2: Add function jbd2_trans_will_send_data_barrier()
jbd2: fix sending of data flush on journal commit
ext4: fix ext4_ext_fiemap_cb() to handle blocks before request range correctly
...
|
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Create a separate MAINTAINERS entry for jbd2
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
drop jh->b_jcount in error path
Signed-off-by: Ding Dinghua <dingdinghua@nrchpc.ac.cn>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Make ext4_ext_split() get extents to be moved by calculating in a statement
instead of counting in a loop.
Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Trivial conversion. Fixup one error handling case calling vmtruncate()
and remove ->truncate callback. We also fix a bug that IS_IMMUTABLE and
IS_APPEND files could not be truncated during failed writes. In fact, the
test can be completely removed as upper layers do necessary permission
checks for truncate in do_sys_[f]truncate() and may_open() anyway.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Currently, an fallocate request of size slightly larger than a power of
2 is turned into two block requests, each a power of 2, with the extra
blocks pre-allocated for future use. When an application calls
fallocate, it already has an idea about how large the file may grow so
there is usually little benefit to reserve extra blocks on the
preallocation list. This reduces disk fragmentation.
Tested: fsstress. Also verified manually that fallocat'ed files are
contiguously laid out with this change (whereas without it they begin at
power-of-2 boundaries, leaving blocks in between). CPU usage of
fallocate is not appreciably higher. In a tight fallocate loop, CPU
usage hovers between 5%-8% with this change, and 5%-7% without it.
Using a simulated file system aging program which the file system to
70%, the percentage of free extents larger than 8MB (as measured by
e2freefrag) increased from 38.8% without this change, to 69.4% with
this change.
Signed-off-by: Vivek Haldar <haldar@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This patch adds new routines: "ext4_punch_hole" "ext4_ext_punch_hole"
and "ext4_ext_check_cache"
fallocate has been modified to call ext4_punch_hole when the punch hole
flag is passed. At the moment, we only support punching holes in
extents, so this routine is pretty much a wrapper for the ext4_ext_punch_hole
routine.
The ext4_ext_punch_hole routine first completes all outstanding writes
with the associated pages, and then releases them. The unblock
aligned data is zeroed, and all blocks in between are punched out.
The ext4_ext_check_cache routine is very similar to ext4_ext_in_cache
except it accepts a ext4_ext_cache parameter instead of a ext4_extent
parameter. This routine is used by ext4_ext_punch_hole to check and
see if a block in a hole that has been cached. The ext4_ext_cache
parameter is necessary because the members ext4_extent structure are
not large enough to hold a 32 bit value. The existing
ext4_ext_in_cache routine has become a wrapper to this new function.
[ext4 punch hole patch series 5/5 v7]
Signed-off-by: Allison Henderson <achender@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Mingming Cao <cmm@us.ibm.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This patch adds a new flag to ext4_map_blocks() that specifies the
given range of blocks should be punched out. Extents are first
converted to uninitialized extents before they are punched
out. Because punching a hole may require that the extent be split, it
is possible that the splitting may need more blocks than are
available. To deal with this, use of reserved blocks are enabled to
allow the split to proceed.
The routine then returns the number of blocks successfully
punched out.
[ext4 punch hole patch series 4/5 v7]
Signed-off-by: Allison Henderson <achender@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Mingming Cao <cmm@us.ibm.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This patch modifies the truncate routines to support hole punching
Below is a brief summary of the patches changes:
- Added end param to ext_ext4_rm_leaf
This function has been modified to accept an end parameter
which enables it to punch holes in leafs instead of just
truncating them.
- Implemented the "remove head" case in the ext_remove_blocks routine
This routine is used by ext_ext4_rm_leaf to remove the tail
of an extent during a truncate. The new ext_ext4_rm_leaf
routine will now also use it to remove the head of an extent in the
case that the hole covers a region of blocks at the beginning
of an extent.
- Added "end" param to ext4_ext_remove_space routine
This function has been modified to accept a stop parameter, which
is passed through to ext4_ext_rm_leaf.
[ext4 punch hole patch series 3/5 v6]
Signed-off-by: Allison Henderson <achender@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This patch modifies the existing ext4_block_truncate_page() function
which was used by the truncate code path, and which zeroes out block
unaligned data, by adding a new length parameter, and renames it to
ext4_block_zero_page_rage(). This function can now be used to zero out the
head of a block, the tail of a block, or the middle
of a block.
The ext4_block_truncate_page() function is now a wrapper to
ext4_block_zero_page_range().
[ext4 punch hole patch series 2/5 v7]
Signed-off-by: Allison Henderson <achender@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Mingming Cao <cmm@us.ibm.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This patch adds an allocation request flag to the ext4_has_free_blocks
function which enables the use of reserved blocks. This will allow a
punch hole to proceed even if the disk is full. Punching a hole may
require additional blocks to first split the extents.
Because ext4_has_free_blocks is a low level function, the flag needs
to be passed down through several functions listed below:
ext4_ext_insert_extent
ext4_ext_create_new_leaf
ext4_ext_grow_indepth
ext4_ext_split
ext4_ext_new_meta_block
ext4_mb_new_blocks
ext4_claim_free_blocks
ext4_has_free_blocks
[ext4 punch hole patch series 1/5 v7]
Signed-off-by: Allison Henderson <achender@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Mingming Cao <cmm@us.ibm.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
I am working on patch to add quota as a built-in feature for ext4
filesystem. The implementation is based on the design given at
https://ext4.wiki.kernel.org/index.php/Design_For_1st_Class_Quota_in_Ext4.
This patch reserves the inode numbers 3 and 4 for quota purposes and
also reserves EXT4_FEATURE_RO_COMPAT_QUOTA feature code.
Signed-off-by: Aditya Kali <adityakali@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Prevent an ext4 filesystem from being mounted multiple times.
A sequence number is stored on disk and is periodically updated (every 5
seconds by default) by a mounted filesystem.
At mount time, we now wait for s_mmp_update_interval seconds to make sure
that the MMP sequence does not change.
In case of failure, the nodename, bdevname and the time at which the MMP
block was last updated is displayed.
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Johann Lombardi <johann@whamcloud.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
I found the issue that the number of free blocks went negative.
# stat -f /mnt/mp1/
File: "/mnt/mp1/"
ID: e175ccb83a872efe Namelen: 255 Type: ext2/ext3
Block size: 4096 Fundamental block size: 4096
Blocks: Total: 258022 Free: -15 Available: -13122
Inodes: Total: 65536 Free: 63029
f_bfree in struct statfs will go negative when the filesystem has
few free blocks. Because the number of dirty blocks is bigger than
the number of free blocks in the following two cases.
CASE 1:
ext4_da_writepages
mpage_da_map_and_submit
ext4_map_blocks
ext4_ext_map_blocks
ext4_mb_new_blocks
ext4_mb_diskspace_used
percpu_counter_sub(&sbi->s_freeblocks_counter, ac->ac_b_ex.fe_len);
<--- interrupt statfs systemcall --->
ext4_da_update_reserve_space
percpu_counter_sub(&sbi->s_dirtyblocks_counter,
used + ei->i_allocated_meta_blocks);
CASE 2:
ext4_write_begin
__block_write_begin
ext4_map_blocks
ext4_ext_map_blocks
ext4_mb_new_blocks
ext4_mb_diskspace_used
percpu_counter_sub(&sbi->s_freeblocks_counter, ac->ac_b_ex.fe_len);
<--- interrupt statfs systemcall --->
percpu_counter_sub(&sbi->s_dirtyblocks_counter, reserv_blks);
To avoid the issue, this patch ensures that f_bfree is non-negative.
Signed-off-by: Kazuya Mio <k-mio@sx.jp.nec.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
We should protect reading bd_info->bb_first_free with the group lock
because otherwise we might miss some free blocks. This is not a big deal
at all, but the change to do right thing is really simple, so lets do
that.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Currently we are loading buddy ext4_mb_load_buddy() for every block
group we are going through in ext4_trim_fs() in many cases just to find
out that there is not enough space to be bothered with. As Amir Goldstein
suggested we can use bb_free information directly from ext4_group_info.
This commit removes ext4_mb_load_buddy() from ext4_trim_fs() and rather
get the ext4_group_info via ext4_get_group_info() and use the bb_free
information directly from that. This avoids unnecessary call to load
buddy in the case the group does not have enough free space to trim.
Loading buddy is now moved to ext4_trim_all_free().
Tested by me with xfstests 251.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
jbd2__journal_start() returns an ERR_PTR() value rather than NULL on
failure.
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
jbd2_log_start_commit() returns 1 only when we really start a
transaction. But we also need to wait for a transaction when the
commit is already running. Fix this problem by waiting for
transaction commit unconditionally (which is just a quick check if the
transaction is already committed).
Also we have to be more careful with sending of a barrier because when
transaction is being committed in parallel to ext4_sync_file()
running, we cannot be sure that the barrier the journalling code sends
happens after we wrote all the data for fsync (note that not every
data writeout needs to trigger metadata changes thus commit of some
metadata changes can be running while other data is still written
out). So use jbd2_will_send_data_barrier() helper to detect the common
cases when we can be sure barrier will be issued by the commit code
and issue the barrier ourselves in the remaining cases.
Reported-by: Edward Goggin <egoggin@vmware.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Provide a function which returns whether a transaction with given tid
will send a flush to the filesystem device. The function will be used
by ext4 to detect whether fsync needs to send a separate flush or not.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
In data=ordered mode, it's theoretically possible (however rare) that
an inode is filed to transaction's t_inode_list and a flusher thread
writes all the data and inode is reclaimed before the transaction
starts to commit. In such a case, we could erroneously omit sending a
flush to file system device when it is different from the journal
device (because data can still be in disk cache only).
Fix the problem by setting a flag in a transaction when some inode is added
to it and then send disk flush in the commit code when the flag is set.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
To get delayed-extent information, ext4_ext_fiemap_cb() looks up
pagecache, it thus collects information starting from a page's
head block.
If blocksize < pagesize, the beginning blocks of a page may lies
before the request range. So ext4_ext_fiemap_cb() should proceed
ignoring them, because they has been handled before. If no mapped
buffer in the range is found in the 1st page, we need to look up
the 2nd page, otherwise delayed-extents after a hole will be ignored.
Without this patch, xfstests 225 will hung on ext4 with 1K block.
Reported-by: Amir Goldstein <amir73il@users.sourceforge.net>
Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
In commit c8d46e41 (ext4: Add flag to files with blocks intentionally
past EOF), if the EOFBLOCKS_FL flag is set, we call ext4_truncate()
before calling vmtruncate(). This caused any allocated but unwritten
blocks created by calling fallocate() with the FALLOC_FL_KEEP_SIZE
flag to be dropped. This was done to make to make sure that
EOFBLOCKS_FL would not be cleared while still leaving blocks past
i_size allocated. This was not necessary, since ext4_truncate()
guarantees that blocks past i_size will be dropped, even in the case
where truncate() has increased i_size before calling ext4_truncate().
So fix this by removing the EOFBLOCKS_FL special case treatment in
ext4_setattr(). In addition, use truncate_setsize() followed by a
call to ext4_truncate() instead of using vmtruncate(). This is more
efficient since it skips the call to inode_newsize_ok(), which has
been checked already by inode_change_ok(). This is also in a win in
the case where EOFBLOCKS_FL is set since it avoids calling
ext4_truncate() twice.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
t_max_wait is added in commit 8e85fb3f to indicate how long we
were waiting for new transaction to start. In commit 6d0bf005,
it is moved to another function named update_t_max_wait to
avoid a build warning. But the wrong thing is that the original
'ts' is initialized in the start of function start_this_handle
and we can calculate t_max_wait in the right way. while with
this change, ts is initialized within the function and t_max_wait
can never be calculated right.
This patch moves the initialization of ts to the original beginning
of start_this_handle and pass it to function update_t_max_wait so
that it can be calculated right and the build warning is avoided also.
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
ext4_ext_truncate() should not invoke up_write(&EXT4_I(inode)->i_data_sem)
when ext4_orphan_add() returns an error, as it hasn't performed a
down_write() yet. This trivial patch fixes this by moving the up_write()
invocation above the out_stop label.
Signed-off-by: Eric Gouriou <egouriou@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The number of hits and misses for each filesystem is exposed in
/sys/fs/ext4/<dev>/extent_cache_{hits, misses}.
Tested: fsstress, manual checks.
Signed-off-by: Vivek Haldar <haldar@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
| | |
| | |
| | |
| | |
| | |
| | | |
Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Mingming Cao <cmm@us.ibm.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
After creating an ext4 file system without a journal:
# mke2fs -t ext4 -O ^has_journal /dev/sda
# mount -t ext4 /dev/sda /test
the /proc/mounts will show:
"/dev/sda /test ext4 rw,relatime,user_xattr,acl,barrier=1,data=writeback 0 0"
which can fool users into thinking that the fs is using writeback mode.
So don't set the writeback option when the journal has not been
enabled; we don't depend on the writeback option being set, since
ext4_should_writeback_data() in ext4_jbd2.h tests to see if the
journal is not present before returning true.
Reported-by: Robin Dong <sanbai@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
We need to take reference to the s_li_request after we take a mutex,
because it might be freed since then, hence result in accessing old
already freed memory. Also we should protect the whole
ext4_remove_li_request() because ext4_li_info might be in the process of
being freed in ext4_lazyinit_thread().
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
For some reason, when we set the mount option "init_itable=0" it
behaves as we would set init_itable=20 which is not right at all.
Basically when we set it to zero we are saying to lazyinit thread not
to wait between zeroing the inode table (except of cond_resched()) so
this commit fixes that and removes the unnecessary condition. The 'n'
should be also properly used on remount.
When the n is not set at all, it means that the default miltiplier
EXT4_DEF_LI_WAIT_MULT is set instead.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reported-by: Eric Sandeen <sandeen@redhat.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
For some reason we have been waiting for lazyinit thread to start in the
ext4_run_lazyinit_thread() but it is not needed since it was jus
unnecessary complexity, so get rid of it. We can also remove li_task and
li_wait_task since it is not used anymore.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
In order to make lazyinit eat approx. 10% of io bandwidth at max, we
are sleeping between zeroing each single inode table. For that purpose
we are using timer which wakes up thread when it expires. It is set
via add_timer() and this may cause troubles in the case that thread
has been woken up earlier and in next iteration we call add_timer() on
still running timer hence hitting BUG_ON in add_timer(). We could fix
that by using mod_timer() instead however we can use
schedule_timeout_interruptible() for waiting and hence simplifying
things a lot.
This commit exchange the old "waiting mechanism" with simple
schedule_timeout_interruptible(), setting the time to sleep. Hence we
do not longer need li_wait_daemon waiting queue and others, so get rid
of it.
Addresses-Red-Hat-Bugzilla: #699708
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
In order to stabilize pages during disk writes, ext4_page_mkwrite must
wait for writeback operations to complete before making a page
writable. Furthermore, the function must return locked pages, and
recheck the writeback status if the page lock is ever dropped. The
"someone could wander in" part of this patch was suggested by Chris
Mason.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
wait_on_page_writeback already checks the writeback bit, so callers of it
needn't do that test.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Currently, if we mkfs a new ext4 volume with s_max_mnt_count set to
zero, and mount it for the first time, we will get the warning:
maximal mount count reached, running e2fsck is recommended
It is really misleading. So change the check so that it won't warn in
that case.
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This patch addresses bugs found while testing punch hole
with the fsx test. The patch corrects the number of blocks
that are zeroed out while splitting an extent, and also corrects
the return value to return the number of blocks split out, instead
of the number of blocks zeroed out.
This patch has been tested in addition to the following patches:
[Ext4 punch hole v7]
[XFS Tests Punch Hole 1/1 v2] Add Punch Hole Testing to FSX
The test ran successfully for 24 hours.
Signed-off-by: Allison Henderson <achender@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
If quota is not enabled when ext4_quota_off() is called, we must not
dereference quota file inode since it is NULL. Check properly for
this.
This fixes a bug in commit 21f976975cbe (ext4: remove unnecessary
[cm]time update of quota file), which was merged for 2.6.39-rc3.
Reported-by: Amir Goldstein <amir73il@users.sf.net>
Signed-off-by: Amir Goldstein <amir73il@users.sf.net>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Fix for a null pointer bug found while running punch hole tests
Signed-off-by: Allison Henderson <achender@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
After taking care of all group init races, all that remains is to
remove alloc_semp from ext4_allocation_context and ext4_buddy structs.
Signed-off-by: Amir Goldstein <amir73il@users.sf.net>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|