summaryrefslogtreecommitdiffstats
path: root/fs/btrfs/inode-map.c (follow)
Commit message (Collapse)AuthorAgeFilesLines
* Btrfs: don't use global block reservation for inode cache truncationMiao Xie2013-05-181-2/+3
| | | | | | | | | | | It is very likely that there are lots of subvolumes/snapshots in the filesystem, so if we use global block reservation to do inode cache truncation, we may hog all the free space that is reserved in global rsv. So it is better that we do the free space reservation for inode cache truncation by ourselves. Cc: Tsutomu Itoh <t-itoh@jp.fujitsu.com> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com>
* Btrfs: don't abort the current transaction if there is no enough space for ↵Miao Xie2013-05-181-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | inode cache The filesystem with inode cache was forced to be read-only when we umounted it. Steps to reproduce: # mkfs.btrfs -f ${DEV} # mount -o inode_cache ${DEV} ${MNT} # dd if=/dev/zero of=${MNT}/file1 bs=1M count=8192 # btrfs fi syn ${MNT} # dd if=${MNT}/file1 of=/dev/null bs=1M # rm -f ${MNT}/file1 # btrfs fi syn ${MNT} # umount ${MNT} It is because there was no enough space to do inode cache truncation, and then we aborted the current transaction. But no space error is not a serious problem when we write out the inode cache, and it is safe that we just skip this step if we meet this problem. So we need not abort the current transaction. Reported-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Tested-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com>
* Btrfs: improve the noflush reservationMiao Xie2012-12-111-2/+3
| | | | | | | | | | | | | | | | In some places(such as: evicting inode), we just can not flush the reserved space of delalloc, flushing the delayed directory index and delayed inode is OK, but we don't try to flush those things and just go back when there is no enough space to be reserved. This patch fixes this problem. We defined 3 types of the flush operations: NO_FLUSH, FLUSH_LIMIT and FLUSH_ALL. If we can in the transaction, we should not flush anything, or the deadlock would happen, so use NO_FLUSH. If we flushing the reserved space of delalloc would cause deadlock, use FLUSH_LIMIT. In the other cases, FLUSH_ALL is used, and we will flush all things. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
* Btrfs: show useful info in space reservation tracepointLiu Bo2012-03-291-4/+2
| | | | | | | | o For space info, the type of space info is useful for debug. o For transaction handle, its transid is useful. Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
* btrfs: replace many BUG_ONs with proper error handlingJeff Mahoney2012-03-221-7/+12
| | | | | | | | | | | btrfs currently handles most errors with BUG_ON. This patch is a work-in- progress but aims to handle most errors other than internal logic errors and ENOMEM more gracefully. This iteration prevents most crashes but can run into lockups with the page lock on occasion when the timing "works out." Signed-off-by: Jeff Mahoney <jeffm@suse.com>
* Btrfs: fix compiler warnings on 32 bit systemsChris Mason2012-02-241-2/+4
| | | | | | | The enospc tracing code added some interesting uses of u64 pointer casts. Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: space leak tracepointsJosef Bacik2012-01-161-0/+4
| | | | | | | This in addition to a script in my btrfs-tracing tree will help track down space leaks when we're getting space left over in block groups on umount. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com>
* Btrfs: fix no reserved space for writing out inode cacheMiao Xie2011-11-111-4/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I-node cache forgets to reserve the space when writing out it. And when we do some stress test, such as synctest, it will trigger WARN_ON() in use_block_rsv(). WARNING: at fs/btrfs/extent-tree.c:5718 btrfs_alloc_free_block+0xbf/0x281 [btrfs]() ... Call Trace: [<ffffffff8104df86>] warn_slowpath_common+0x80/0x98 [<ffffffff8104dfb3>] warn_slowpath_null+0x15/0x17 [<ffffffffa0369c60>] btrfs_alloc_free_block+0xbf/0x281 [btrfs] [<ffffffff810cbcb8>] ? __set_page_dirty_nobuffers+0xfe/0x108 [<ffffffffa035c040>] __btrfs_cow_block+0x118/0x3b5 [btrfs] [<ffffffffa035c7ba>] btrfs_cow_block+0x103/0x14e [btrfs] [<ffffffffa035e4c4>] btrfs_search_slot+0x249/0x6a4 [btrfs] [<ffffffffa036d086>] btrfs_lookup_inode+0x2a/0x8a [btrfs] [<ffffffffa03788b7>] btrfs_update_inode+0xaa/0x141 [btrfs] [<ffffffffa036d7ec>] btrfs_save_ino_cache+0xea/0x202 [btrfs] [<ffffffffa03a761e>] ? btrfs_update_reloc_root+0x17e/0x197 [btrfs] [<ffffffffa0373867>] commit_fs_roots+0xaa/0x158 [btrfs] [<ffffffffa03746a6>] btrfs_commit_transaction+0x405/0x731 [btrfs] [<ffffffff810690df>] ? wake_up_bit+0x25/0x25 [<ffffffffa039d652>] ? btrfs_log_dentry_safe+0x43/0x51 [btrfs] [<ffffffffa0381c5f>] btrfs_sync_file+0x16a/0x198 [btrfs] [<ffffffff81122806>] ? mntput+0x21/0x23 [<ffffffff8112d150>] vfs_fsync_range+0x18/0x21 [<ffffffff8112d170>] vfs_fsync+0x17/0x19 [<ffffffff8112d316>] do_fsync+0x29/0x3e [<ffffffff8112d348>] sys_fsync+0xb/0xf [<ffffffff81468352>] system_call_fastpath+0x16/0x1b Sometimes it causes BUG_ON() in the reservation code of the delayed inode is triggered. So we must reserve enough space for inode cache. Note: If we can not reserve the enough space for inode cache, we will give up writing out it. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: handle enospc accounting for free space inodesJosef Bacik2011-10-191-2/+4
| | | | | | | | | Since free space inodes now use normal checksumming we need to make sure to account for their metadata use. So reserve metadata space, and then if we fail to write out the metadata we can just release it, otherwise it will be freed up when the io completes. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com>
* btrfs: add helper for fs_info->closingDavid Sterba2011-06-041-2/+1
| | | | | | | wrap checking of filesystem 'closing' flag and fix a few missing memory barriers. Signed-off-by: David Sterba <dsterba@suse.cz>
* Btrfs: add mount -o inode_cacheChris Mason2011-06-041-0/+20
| | | | | | | | This makes the inode map cache default to off until we fix the overflow problem when the free space crcs don't fit inside a single page. Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: don't save the inode cache if we are deleting this rootJosef Bacik2011-06-041-0/+5
| | | | | | | | | | | | | | With xfstest 254 I can panic the box every time with the inode number caching stuff on. This is because we clean the inodes out when we delete the subvolume, but then we write out the inode cache which adds an inode to the subvolume inode tree, and then when it gets evicted again the root gets added back on the dead roots list and is deleted again, so we have a double free. To stop this from happening just return 0 if refs is 0 (and we're not the tree root since tree root always has refs of 0). With this fix 254 no longer panics. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com> Tested-by: David Sterba <dsterba@suse.cz> Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: don't save the inode cache in non-FS rootsliubo2011-06-041-0/+6
| | | | | | | | | This adds extra checks to make sure the inode map we are caching really belongs to a FS root instead of a special relocation tree. It prevents crashes during balancing operations. Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: setup free ino caching in a more asynchronous wayLi Zefan2011-05-261-6/+22
| | | | | | | | | | | | For a filesystem that has lots of files in it, the first time we mount it with free ino caching support, it can take quite a long time to setup the caching before we can create new files. Here we fill the cache with [highest_ino, BTRFS_LAST_FREE_OBJECTID] before we start the caching thread to search through the extent tree. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Merge branch 'cleanups' of git://repo.or.cz/linux-2.6/btrfs-unstable into ↵Chris Mason2011-05-221-1/+1
|\ | | | | | | | | | | | | | | | | | | | | | | inode_numbers Conflicts: fs/btrfs/extent-tree.c fs/btrfs/free-space-cache.c fs/btrfs/inode.c fs/btrfs/tree-log.c Signed-off-by: Chris Mason <chris.mason@oracle.com>
* | Btrfs: Support reading/writing on disk free ino cacheLi Zefan2011-04-251-0/+87
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is similar to block group caching. We dedicate a special inode in fs tree to save free ino cache. At the very first time we create/delete a file after mount, the free ino cache will be loaded from disk into memory. When the fs tree is commited, the cache will be written back to disk. To keep compatibility, we check the root generation against the generation of the special inode when loading the cache, so the loading will fail if the btrfs filesystem was mounted in an older kernel before. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
* | Btrfs: Cache free inode numbers in memoryLi Zefan2011-04-251-5/+336
|/ | | | | | | | | | | | | | | | | | | | | | | | Currently btrfs stores the highest objectid of the fs tree, and it always returns (highest+1) inode number when we create a file, so inode numbers won't be reclaimed when we delete files, so we'll run out of inode numbers as we keep create/delete files in 32bits machines. This fixes it, and it works similarly to how we cache free space in block cgroups. We start a kernel thread to read the file tree. By scanning inode items, we know which chunks of inode numbers are free, and we cache them in an rb-tree. Because we are searching the commit root, we have to carefully handle the cross-transaction case. The rb-tree is a hybrid extent+bitmap tree, so if we have too many small chunks of inode numbers, we'll use bitmaps. Initially we allow 16K ram of extents, and a bitmap will be used if we exceed this threshold. The extents threshold is adjusted in runtime. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
* Btrfs: cleanup some BUG_ON()Tsutomu Itoh2011-03-281-1/+2
| | | | | | | | This patch changes some BUG_ON() to the error return. (but, most callers still use BUG_ON()) Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: do not reuse objectid of deleted snapshot/subvolYan, Zheng2009-09-211-78/+15
| | | | | | | | | | | | | The new back reference format does not allow reusing objectid of deleted snapshot/subvol. So we use ++highest_objectid to allocate objectid for new snapshot/subvol. Now we use ++highest_objectid to allocate objectid for both new inode and new snapshot/subvolume, so this patch removes 'find hole' code in btrfs_find_free_objectid. Signed-off-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: Fix a trivial warning using max() of u64 vs ULL.Joel Becker2009-04-271-1/+1
| | | | | | | | | | | A small warning popped up on ia64 because inode-map.c was comparing a u64 object id with the ULL FIRST_FREE_OBJECTID. My first thought was that all the OBJECTID constants should contain the u64 cast because btrfs code deals entirely in u64s. But then I saw how large that was, and figured I'd just fix the max() call. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: remove btrfs_init_pathJeff Mahoney2009-02-121-1/+0
| | | | | | | | | | | btrfs_init_path was initially used when the path objects were on the stack. Now all the work is done by btrfs_alloc_path and btrfs_init_path isn't required. This patch removes it, and just uses kmem_cache_zalloc to zero out the object. Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: Fix checkpatch.pl warningsChris Mason2009-01-061-1/+0
| | | | | | | There were many, most are fixed now. struct-funcs.c generates some warnings but these are bogus. Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: extent_map and data=ordered fixes for space balancingZheng Yan2008-09-261-0/+4
| | | | | | | | | | | | | | | | | | | | | * Add an EXTENT_BOUNDARY state bit to keep the writepage code from merging data extents that are in the process of being relocated. This allows us to do accounting for them properly. * The balancing code relocates data extents indepdent of the underlying inode. The extent_map code was modified to properly account for things moving around (invalidating extent_map caches in the inode). * Don't take the drop_mutex in the create_subvol ioctl. It isn't required. * Fix walking of the ordered extent list to avoid races with sys_unlink * Change the lock ordering rules. Transaction start goes outside the drop_mutex. This allows btrfs_commit_transaction to directly drop the relocation trees. Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: Update find free objectid function for orphan cleanup codeZheng Yan2008-09-251-7/+8
| | | | | | | | | | Orphan items use BTRFS_ORPHAN_OBJECTID (-5UUL) as key objectid. This affects the find free objectid functions, inode objectid can easily overflow after orphan file cleanup. --- Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: Replace the big fs_mutex with a collection of other locksChris Mason2008-09-251-0/+8
| | | | | | | | Extent alloctions are still protected by a large alloc_mutex. Objectid allocations are covered by a objectid mutex Other btree operations are protected by a lock on individual btree nodes Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Fix for btrfs_find_free_objectidYan2008-09-251-3/+1
| | | | | | | | btrfs_find_free_objectid may return a used objectid due to arithmetic underflow. This bug may happen when parameter 'root' is tree root, so it may cause serious problems when creating snapshot or sub-volume. Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: Create extent_buffer interface for large blocksizesChris Mason2008-09-251-8/+9
| | | | Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: trivial include fixupsZach Brown2007-07-111-1/+0
| | | | | | | | | | | Almost none of the files including module.h need to do so, remove them. Include sched.h in extent-tree.c to silence a warning about cond_resched() being undeclared. Signed-off-by: Zach Brown <zach.brown@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: add GPLv2Chris Mason2007-06-121-0/+18
| | | | Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: drop the inode map treeChris Mason2007-04-101-61/+4
| | | | Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: dirindex optimizationsChris Mason2007-04-051-15/+33
| | | | Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: tweak the inode-map and free extent search starts on cold mountChris Mason2007-04-041-7/+20
| | | | Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: corruptions fixedChris Mason2007-04-021-18/+25
| | | | Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: corruption hunt continuesChris Mason2007-03-301-1/+1
| | | | Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: reference counts on data extentsChris Mason2007-03-271-0/+1
| | | | Signed-off-by: Chris Mason <chris.mason@oracle.com>
* btrfs_create, btrfs_write_super, btrfs_sync_fsChris Mason2007-03-231-0/+1
| | | | Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Mountable btrfs, with readdirChris Mason2007-03-221-4/+4
| | | | Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: initial move to kernel module landChris Mason2007-03-211-4/+1
| | | | Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: Add inode map, and the start of file extent itemsChris Mason2007-03-201-0/+136
Signed-off-by: Chris Mason <chris.mason@oracle.com>