summaryrefslogtreecommitdiffstats
path: root/fs/f2fs (follow)
Commit message (Collapse)AuthorAgeFilesLines
* wrappers for ->i_mutex accessAl Viro2016-01-232-12/+12
| | | | | | | | | | | parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested}, inode_foo(inode) being mutex_foo(&inode->i_mutex). Please, use those for access to ->i_mutex; over the coming cycle ->i_mutex will become rwsem, with ->lookup() done with it held only shared. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* kmemcg: account certain kmem allocations to memcgVladimir Davydov2016-01-151-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Mark those kmem allocations that are known to be easily triggered from userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them accounted to memcg. For the list, see below: - threadinfo - task_struct - task_delay_info - pid - cred - mm_struct - vm_area_struct and vm_region (nommu) - anon_vma and anon_vma_chain - signal_struct - sighand_struct - fs_struct - files_struct - fdtable and fdtable->full_fds_bits - dentry and external_name - inode for all filesystems. This is the most tedious part, because most filesystems overwrite the alloc_inode method. The list is far from complete, so feel free to add more objects. Nevertheless, it should be close to "account everything" approach and keep most workloads within bounds. Malevolent users will be able to breach the limit, but this was possible even with the former "account everything" approach (simply because it did not account everything in fact). [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Tejun Heo <tj@kernel.org> Cc: Greg Thelen <gthelen@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge tag 'for-f2fs-4.5' of ↵Linus Torvalds2016-01-1419-725/+1214
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "This series adds two ioctls to control cached data and fragmented files. Most of the rest fixes missing error cases and bugs that we have not covered so far. Summary: Enhancements: - support an ioctl to execute online file defragmentation - support an ioctl to flush cached data - speed up shrinking of extent_cache entries - handle broken superblock - refector dirty inode management infra - revisit f2fs_map_blocks to handle more cases - reduce global lock coverage - add detecting user's idle time Major bug fixes: - fix data race condition on cached nat entries - fix error cases of volatile and atomic writes" * tag 'for-f2fs-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (87 commits) f2fs: should unset atomic flag after successful commit f2fs: fix wrong memory condition check f2fs: monitor the number of background checkpoint f2fs: detect idle time depending on user behavior f2fs: introduce time and interval facility f2fs: skip releasing nodes in chindless extent tree f2fs: use atomic type for node count in extent tree f2fs: recognize encrypted data in f2fs_fiemap f2fs: clean up f2fs_balance_fs f2fs: remove redundant calls f2fs: avoid unnecessary f2fs_balance_fs calls f2fs: check the page status filled from disk f2fs: introduce __get_node_page to reuse common code f2fs: check node id earily when readaheading node page f2fs: read isize while holding i_mutex in fiemap Revert "f2fs: check the node block address of newly allocated nid" f2fs: cover more area with nat_tree_lock f2fs: introduce max_file_blocks in sbi f2fs crypto: check CONFIG_F2FS_FS_XATTR for encrypted symlink f2fs: introduce zombie list for fast shrinking extent trees ...
| * f2fs: should unset atomic flag after successful commitJaegeuk Kim2016-01-121-1/+3
| | | | | | | | | | | | | | If there is an error during commit, we should keep the flag in order to abort it. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: fix wrong memory condition checkJaegeuk Kim2016-01-121-2/+2
| | | | | | | | | | | | | | | | This patch fixes wrong decision for avaliable_free_memory. The return valus is already set as false, so we should consider true condition below only. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: monitor the number of background checkpointJaegeuk Kim2016-01-123-2/+6
| | | | | | | | | | | | This patch adds to show the number of background checkpoint. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: detect idle time depending on user behaviorJaegeuk Kim2016-01-129-10/+38
| | | | | | | | | | | | | | This patch adds last time that user requested filesystem operations. This information is used to detect whether system is idle or not later. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: introduce time and interval facilityJaegeuk Kim2016-01-124-7/+25
| | | | | | | | | | | | This patch adds time and interval arrays to store some timing variables. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: skip releasing nodes in chindless extent treeChao Yu2016-01-081-4/+9
| | | | | | | | | | | | | | | | If there are no nodes in extent tree, let's skip releasing step to avoid any overhead of grabbing/releasing extent tree lock. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: use atomic type for node count in extent treeChao Yu2016-01-082-9/+10
| | | | | | | | | | | | | | | | | | 1. rename field in struct extent_tree from count to node_cnt for readability. 2. alter to use atomic type for node_cnt. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: recognize encrypted data in f2fs_fiemapChao Yu2016-01-081-1/+5
| | | | | | | | | | | | | | This patch fixes to teach f2fs_fiemap to recognize encrypted data. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: clean up f2fs_balance_fsJaegeuk Kim2016-01-088-38/+33
| | | | | | | | | | | | This patch adds one parameter to clean up all the callers of f2fs_balance_fs. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: remove redundant callsJaegeuk Kim2016-01-081-3/+0
| | | | | | | | | | | | This patch removes redundant calls. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: avoid unnecessary f2fs_balance_fs callsJaegeuk Kim2016-01-086-29/+30
| | | | | | | | | | | | | | Only when node page is newly dirtied, it needs to check whether we need to do f2fs_gc. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: check the page status filled from diskJaegeuk Kim2016-01-081-6/+5
| | | | | | | | | | | | After reading a page, we need to check whether there is any error. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: introduce __get_node_page to reuse common codeChao Yu2016-01-081-53/+35
| | | | | | | | | | | | | | | | | | There are duplicated code in between get_node_page and get_node_page_ra, introduce __get_node_page to includes common parts of these two, and export get_node_page and get_node_page_ra by reusing __get_node_page. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: check node id earily when readaheading node pageChao Yu2016-01-081-3/+8
| | | | | | | | | | | | | | Add node id check in ra_node_page and get_node_page_ra like get_node_page. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: read isize while holding i_mutex in fiemapFan Li2016-01-071-1/+3
| | | | | | | | | | | | | | make sure the isize we read doesn't change during the process. Signed-off-by: Fan li <fanofcode.li@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * Revert "f2fs: check the node block address of newly allocated nid"Jaegeuk Kim2016-01-071-9/+0
| | | | | | | | | | | | | | | | Original issue is fixed by: f2fs: cover more area with nat_tree_lock This reverts commit 24928634f81b1592e83b37dcd89ed45c28f12feb.
| * f2fs: cover more area with nat_tree_lockJaegeuk Kim2016-01-071-17/+12
| | | | | | | | | | | | | | | | There was a subtle bug on nat cache management which incurs wrong nid allocation or wrong block addresses when try_to_free_nats is triggered heavily. This patch enlarges the previous coverage of nat_tree_lock to avoid data race. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: introduce max_file_blocks in sbiChao Yu2016-01-043-5/+6
| | | | | | | | | | | | | | | | | | | | Introduce max_file_blocks in sbi to store max block index of file in f2fs, it could be used to avoid unneeded calculation of max block index in runtime. Signed-off-by: Chao Yu <chao2.yu@samsung.com> [Jaegeuk Kim: fix overflow of sbi->max_file_blocks] Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs crypto: check CONFIG_F2FS_FS_XATTR for encrypted symlinkChao Yu2016-01-011-0/+2
| | | | | | | | | | | | | | | | Add missed CONFIG_F2FS_FS_XATTR for encrypted symlink inode in order to avoid unneeded registry of ->{get,set,remove}xattr. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: introduce zombie list for fast shrinking extent treesJaegeuk Kim2016-01-012-29/+23
| | | | | | | | | | | | | | This patch removes refcount, and instead, adds zombie_list to shrink directly without radix tree traverse. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: monitor zombie_tree countJaegeuk Kim2016-01-012-3/+4
| | | | | | | | | | | | This patch adds an entry to show the number of zombie extent_tree. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: use IPU for fdatasyncJaegeuk Kim2015-12-311-1/+1
| | | | | | | | | | | | | | This patch fixes missing IPU condition when fdatasync is called. With this patch, fdatasync is able to avoid additional node writes for recovery. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: write pending bios when cp_error is setJaegeuk Kim2015-12-313-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When testing ioc_shutdown, put_super is able to be hanged by waiting for writebacking pages as follows. INFO: task umount:2723 blocked for more than 120 seconds. Tainted: G O 4.4.0-rc3+ #8 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. umount D ffff88000859f9d8 0 2723 2110 0x00000000 ffff88000859f9d8 0000000000000000 0000000000000000 ffffffff81e11540 ffff880078c225c0 ffff8800085a0000 ffff88007fc17440 7fffffffffffffff ffffffff818239f0 ffff88000859fb48 ffff88000859f9f0 ffffffff8182310c Call Trace: [<ffffffff818239f0>] ? bit_wait+0x50/0x50 [<ffffffff8182310c>] schedule+0x3c/0x90 [<ffffffff81827fb9>] schedule_timeout+0x2d9/0x430 [<ffffffff810e0f8f>] ? mark_held_locks+0x6f/0xa0 [<ffffffff8111614d>] ? ktime_get+0x7d/0x140 [<ffffffff818239f0>] ? bit_wait+0x50/0x50 [<ffffffff8106a655>] ? kvm_clock_get_cycles+0x25/0x30 [<ffffffff8111617c>] ? ktime_get+0xac/0x140 [<ffffffff818239f0>] ? bit_wait+0x50/0x50 [<ffffffff81822564>] io_schedule_timeout+0xa4/0x110 [<ffffffff81823a25>] bit_wait_io+0x35/0x50 [<ffffffff818235bd>] __wait_on_bit+0x5d/0x90 [<ffffffff811b9e8b>] wait_on_page_bit+0xcb/0xf0 [<ffffffff810d5f90>] ? autoremove_wake_function+0x40/0x40 [<ffffffff811cf84c>] truncate_inode_pages_range+0x4bc/0x840 [<ffffffff811cfc3d>] truncate_inode_pages_final+0x4d/0x60 [<ffffffffc023ced5>] f2fs_evict_inode+0x75/0x400 [f2fs] [<ffffffff812639bc>] evict+0xbc/0x190 [<ffffffff81263d19>] iput+0x229/0x2c0 [<ffffffffc0241885>] f2fs_put_super+0x105/0x1a0 [f2fs] [<ffffffff8124756a>] generic_shutdown_super+0x6a/0xf0 [<ffffffff812478f7>] kill_block_super+0x27/0x70 [<ffffffffc0241290>] kill_f2fs_super+0x20/0x30 [f2fs] [<ffffffff81247b03>] deactivate_locked_super+0x43/0x70 [<ffffffff81247f4c>] deactivate_super+0x5c/0x60 [<ffffffff81268d2f>] cleanup_mnt+0x3f/0x90 [<ffffffff81268dc2>] __cleanup_mnt+0x12/0x20 [<ffffffff810ac463>] task_work_run+0x73/0xa0 [<ffffffff810032ac>] exit_to_usermode_loop+0xcc/0xd0 [<ffffffff81003e7c>] syscall_return_slowpath+0xcc/0xe0 [<ffffffff81829ea2>] int_ret_from_sys_call+0x25/0x9f Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: remove f2fs_bug_on in terms of max_depthJaegeuk Kim2015-12-311-2/+8
| | | | | | | | | | | | | | There is no report on this bug_on case, but if malicious attacker changed this field intentionally, we can just reset it as a MAX value. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: fix f2fs_ioc_abort_volatile_writeJaegeuk Kim2015-12-301-3/+8
| | | | | | | | | | | | | | | | | | | | | | | | There are two rules to handle aborting volatile or atomic writes. 1. drop atomic writes - we don't need to keep any stale db data. 2. write journal data - we should keep the journal data with fsync for db recovery. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: fix to skip recovering dot dentries in a readonly fsChao Yu2015-12-301-0/+7
| | | | | | | | | | | | | | | | If filesystem is readonly, leave user message info instead of recovering inline dot inode. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: load largest extent all the timeJaegeuk Kim2015-12-303-7/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Otherwise, we can get mismatched largest extent information. One example is: 1. mount f2fs w/ extent_cache 2. make a small extent 3. umount 4. mount f2fs w/o extent_cache 5. update the largest extent 6. umount 7. mount f2fs w/ extent_cache 8. get the old extent made by #2 Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: use i_size_read to get i_sizeJaegeuk Kim2015-12-301-3/+4
| | | | | | | | | | | | We need to use i_size_read() to get inode->i_size. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: early check broken symlink length in the encrypted caseJaegeuk Kim2015-12-301-2/+8
| | | | | | | | | | | | If link is broken, its len is zero, and we don't need to move forward. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: clean up f2fs_ioc_write_checkpointChao Yu2015-12-301-9/+1
| | | | | | | | | | | | | | | | Use f2fs_sync_fs to clean up codes in f2fs_ioc_write_checkpoint. Signed-off-by: Chao Yu <chao2.yu@samsung.com> [Jaegeuk Kim: remove unused err variable] Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: add a max block check for get_data_block_bmapYunlei He2015-12-303-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds a max block check for get_data_block_bmap. Trinity test program will send a block number as parameter into ioctl_fibmap, which will be used in get_node_path(), when the block number large than f2fs max blocks, it will trigger kernel bug. Signed-off-by: Yunlei He <heyunlei@huawei.com> Signed-off-by: Xue Liu <liuxueliu.liu@huawei.com> [Jaegeuk Kim: fix missing condition, pointed by Chao Yu] Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: fix bugs and simplify codes of f2fs_fiemapFan Li2015-12-301-53/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | fix bugs: 1. len could be updated incorrectly when start+len is beyond isize. 2. If there is a hole consisting of more than two blocks, it could fail to add FIEMAP_EXTENT_LAST flag for the last extent. 3. If there is an extent beyond isize, when we search extents in a range that ends at isize, it will also return the extent beyond isize, which is outside the range. Signed-off-by: Fan li <fanofcode.li@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: let user being aware of IO errorChao Yu2015-12-306-23/+32
| | | | | | | | | | | | | | | | | | | | | | Sometimes we keep dumb when IO error occur in lower layer device, so user will not receive any error return value for some operation, but actually, the operation did not succeed. This sould be avoided, so this patch reports such kind of error to user. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: add missing f2fs_balance_fs in __recover_dot_dentriesChao Yu2015-12-301-0/+2
| | | | | | | | | | | | | | | | __recover_do_dentries will try to grab free space in storage, so fix to add missing f2fs_balance_fs here. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: declare static functionJaegeuk Kim2015-12-301-1/+1
| | | | | | | | | | | | The __f2fs_commit_super is static. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: avoid f2fs_lock_op in f2fs_write_beginJaegeuk Kim2015-12-301-8/+34
| | | | | | | | | | | | | | If f2fs_write_begin is to update data, we can bypass calling f2fs_lock_op() in order to avoid the checkpoint latency in the write syscall. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: return early when trying to read null nidJaegeuk Kim2015-12-301-0/+4
| | | | | | | | | | | | | | If get_node_page() gets zero nid, we can return early without getting a wrong page. For example, get_dnode_of_data() can try to do that. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: introduce prepare_write_begin to clean upJaegeuk Kim2015-12-301-38/+54
| | | | | | | | | | | | | | | | This patch adds prepare_write_begin to clean f2fs_write_begin. The major role of this function is to convert any inline_data and allocate or find block address. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: don't convert inline inode when inline_data option is disableChao Yu2015-12-302-4/+1
| | | | | | | | | | | | | | | | | | | | If inline_data option is disable, when truncating an inline inode with size which is not exceed maxinum inline size, we should not convert inline inode to regular one to avoid the overhead of synchronizing conversion. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: report error of do_checkpointChao Yu2015-12-306-18/+30
| | | | | | | | | | | | | | | | | | | | | | do_checkpoint and write_checkpoint can fail due to reasons like triggering in a readonly fs or encountering IO error of storage device. So it's better to report such error info to user, let user be aware of failure of doing checkpoint. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: call f2fs_balance_fs only when node was changedJaegeuk Kim2015-12-303-21/+35
| | | | | | | | | | | | | | | | If user tries to update or read data, we don't need to call f2fs_balance_fs which triggers f2fs_gc, which increases unnecessary long latency. Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: reduce covered region of sbi->cp_rwsem in f2fs_map_blocksChao Yu2015-12-301-2/+7
| | | | | | | | | | | | | | | | | | | | Only cover sbi->cp_rwsem on one dnode page's allocation and modification instead of multiple's in f2fs_map_blocks, it can reduce the covered region of cp_rwsem, then we can avoid potential long time delay for concurrent checkpointer. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: record node block allocation in dnode_of_dataJaegeuk Kim2015-12-303-0/+7
| | | | | | | | | | | | | | | | | | This patch introduces recording node block allocation in dnode_of_data. This information helps to figure out whether any node block is allocated during specific file operations. Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: avoid unnecessary f2fs_gc for dir operationsJaegeuk Kim2015-12-301-16/+18
| | | | | | | | | | | | | | | | The f2fs_balance_fs doesn't need to cover f2fs_new_inode or f2fs_find_entry works. Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: check inline_data flag at converting timeJaegeuk Kim2015-12-303-40/+29
| | | | | | | | | | | | | | We can check inode's inline_data flag when calling to convert it. Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: speed up shrinking extent tree entriesJaegeuk Kim2015-12-303-1/+16
| | | | | | | | | | | | | | | | | | If there is no candidates for shrinking slab entries, we don't need to traverse any trees at all. Reviewed-by: Chao Yu <chao2.yu@samsung.com> [Jaegeuk Kim: fix missing initialization reported by Yunlei He] Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: use atomic variable for total_extent_treeJaegeuk Kim2015-12-225-9/+12
| | | | | | | | | | | | | | It would be better to use atomic variable for total_extent_tree. Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>