path: root/fs/btrfs/extent-tree.c
* Fix misspellings of "whether" in comments (Adam Buchbinder, 2012-11-19; 1 file, -1/+1)

    "Whether" is misspelled in various comments across the tree; this fixes them. No code changes.

    Signed-off-by: Adam Buchbinder <adam.buchbinder@gmail.com>
    Signed-off-by: Jiri Kosina <jkosina@suse.cz>
* Btrfs: don't commit instead of overcommitting (Josef Bacik, 2012-10-09; 1 file, -28/+6)

    I don't think we have the same problem that this was supposed to fix originally, since we can allocate chunks in the enospc path now. This code is causing us to constantly commit the transaction as we get close to using all of our available space in our currently allocated chunks, instead of allocating another chunk and carrying on with life, which is not nice for performance. Thanks,

    Signed-off-by: Josef Bacik <jbacik@fusionio.com>
* Btrfs: cache extent state when writing out dirty metadata pages (Josef Bacik, 2012-10-09; 1 file, -2/+3)

    Every time we write out dirty pages we search for an offset in the tree, convert the bits in the state, and then when we wait we search for the offset again and clear the bits. So for every dirty range in the io tree we are doing 4 rb searches, which is suboptimal. With this patch we only do 2 searches per cycle (modulo weird things happening). Thanks,

    Signed-off-by: Josef Bacik <jbacik@fusionio.com>
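    A rough sketch of the pattern being described (the helper signatures here are assumptions for illustration, not the patch itself): the state record found by the first search is handed back to the caller, so the later clear can start from it instead of redoing the rb-tree walk.

        struct extent_state *cached_state = NULL;

        /* pass 1: convert the bits, remembering the state we landed on */
        convert_extent_bit(tree, start, end, EXTENT_NEED_WAIT,
                           EXTENT_DIRTY, &cached_state, GFP_NOFS);

        /* ... write out the pages ... */

        /* pass 2: the cached state lets us skip the offset search */
        clear_extent_bit(tree, start, end, EXTENT_NEED_WAIT,
                         0, 0, &cached_state, GFP_NOFS);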
* Btrfs: run delayed refs first when out of space (Josef Bacik, 2012-10-09; 1 file, -10/+10)

    Running delayed refs is faster than running delalloc, so let's do that first to try to reclaim space. This makes my fs_mark test about 20% faster. Thanks,

    Signed-off-by: Josef Bacik <jbacik@fusionio.com>
* btrfs: move transaction aborts to the point of failure (David Sterba, 2012-10-09; 1 file, -21/+35)

    Call btrfs_abort_transaction as early as possible when an error condition is detected. That way the line number reported is useful, and we're no longer clueless about which error path led to the abort.

    Signed-off-by: David Sterba <dsterba@suse.cz>
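    The pattern looks roughly like this; btrfs_abort_transaction() records the file and line of its call site, so calling it at the failure point rather than several frames up makes the report actionable. A hedged sketch:

        ret = btrfs_del_item(trans, root, path);   /* any failing operation */
        if (ret) {
                /* abort here, not in the caller: the logged line number
                 * now points at the operation that actually failed */
                btrfs_abort_transaction(trans, root, ret);
                goto out;
        }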
* Btrfs: kill obsolete arguments in btrfs_wait_ordered_extents (Liu Bo, 2012-10-04; 1 file, -2/+2)

    nocow_only is now an obsolete argument.

    Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
* Btrfs: cleanup for duplicated code in find_free_extent (Liu Bo, 2012-10-04; 1 file, -4/+0)

    An 'add free space' step already appears just before this one, so we needn't redo it.

    Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
* Btrfs: remove bytes argument from do_chunk_alloc (Josef Bacik, 2012-10-01; 1 file, -15/+10)

    Everybody is just making stuff up: the argument is only used to check whether we really need to allocate a chunk, and since we call this when we already know we do, it's just a waste of space. Thanks,

    Signed-off-by: Josef Bacik <jbacik@fusionio.com>
* Btrfs: delay block group item insertion (Josef Bacik, 2012-10-01; 1 file, -67/+63)

    So we have lots of places where we try to preallocate chunks in order to make sure we have enough space as we make our allocations. This has historically meant that we're constantly tweaking when we should allocate a new chunk, and historically we have gotten this horribly wrong, so we way over-allocate either metadata or data. To keep this from happening, make the block group item insertion happen out of band at the end of a transaction, as in the sketch below. This allows us to create chunks even if we are trying to make an allocation for the extent tree. With this patch my enospc tests run faster (didn't expect this) and use the disk space more efficiently (this is what I wanted). Thanks,

    Signed-off-by: Josef Bacik <jbacik@fusionio.com>
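    A minimal sketch of the out-of-band scheme, assuming a pending list hung off the transaction handle (the list and helper names here are illustrative): new block groups are queued during the transaction and their items inserted in one pass at the end.

        /* at allocation time: just queue the new block group */
        list_add_tail(&cache->new_bg_list, &trans->new_bgs);

        /* at end of transaction: insert all pending block group items */
        struct btrfs_block_group_cache *bg, *tmp;

        list_for_each_entry_safe(bg, tmp, &trans->new_bgs, new_bg_list) {
                insert_block_group_item(trans, bg);   /* illustrative helper */
                list_del_init(&bg->new_bg_list);
        }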
* Btrfs: fix our overcommit math (Josef Bacik, 2012-10-01; 1 file, -29/+42)

    I noticed I was seeing large lags when running my torrent test in a vm on my laptop. While trying to make it lag less I noticed that our overcommit math was taking into account the number of bytes we wanted to reclaim, not the number of bytes we actually wanted to allocate, which meant we wouldn't overcommit as often. This patch fixes the overcommit math and makes shrink_delalloc() use that logic so that it will stop looping faster. We still have pretty high latency spikes, but the test now takes 3 minutes less time (about 5% faster). Thanks,

    Signed-off-by: Josef Bacik <jbacik@fusionio.com>
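    The corrected test, as a sketch (field and helper names are assumptions): the question asked is whether the bytes we want to allocate, not the bytes we hoped to reclaim, still fit within a safety margin of the unallocated pool.

        u64 used = sinfo->bytes_used + sinfo->bytes_reserved +
                   sinfo->bytes_pinned + sinfo->bytes_readonly +
                   sinfo->bytes_may_use;
        u64 avail = unallocated_device_space(fs_info);   /* illustrative */

        /* overcommit if the requested allocation fits in half the free pool */
        if (used + bytes < sinfo->total_bytes + (avail >> 1))
                return 1;   /* reservation allowed */
        return 0;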
* Btrfs: wait on async pages when shrinking delalloc (Josef Bacik, 2012-10-01; 1 file, -0/+7)

    Mitch reported a problem where you could get an ENOSPC error when untarring a kernel git tree onto a 16gb file system with compress-force=zlib. This is because compression is a huge pain: it can return from ->writepages() without having actually created any ordered extents. To get around this we check to see if the async submit counter is up, and if it is, wait until it drops to 0 before doing our normal ordered wait dance. With this patch I can now untar a kernel git tree onto a 16gb file system without getting ENOSPC errors. Thanks,

    Signed-off-by: Josef Bacik <jbacik@fusionio.com>
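    Roughly, the added wait looks like this (the counter and waitqueue names are assumptions): only once the async submit count drains to zero can we trust that every ordered extent we need to wait on actually exists.

        /* compressed writeback may still be queued; until it drains,
         * ->writepages() may not have created the ordered extents yet */
        if (atomic_read(&fs_info->async_delalloc_pages))
                wait_event(fs_info->async_submit_wait,
                           atomic_read(&fs_info->async_delalloc_pages) == 0);

        /* ... now the normal ordered-extent wait dance is meaningful ... */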
* Btrfs: fix wrong size for the reservation of snapshot creation (Miao Xie, 2012-10-01; 1 file, -3/+3)

    We should insert/update 6 items (root ref, root backref, dir item, dir index, root item and parent inode) when creating a snapshot, not 5; fix it.

    Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
* Btrfs: add a new "type" field into the block reservation structure (Miao Xie, 2012-10-01; 1 file, -3/+5)

    Sometimes we need to choose the reservation method according to the type of the block reservation, such as the reservation for delayed inode updates. Currently we identify the type just by comparing the address of the reservation variable, which is very ugly for a temporary reservation because we have to compare it against all the common reservation variables. So add a new "type" field to record the type of each reservation.

    Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
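    In outline (the enum values here are illustrative; the point is the shape of the change):

        enum btrfs_rsv_type {
                BTRFS_BLOCK_RSV_GLOBAL,
                BTRFS_BLOCK_RSV_DELALLOC,
                BTRFS_BLOCK_RSV_TRANS,
                BTRFS_BLOCK_RSV_DELINODE,   /* delayed inode updates */
                BTRFS_BLOCK_RSV_TEMP,       /* short-lived reservations */
        };

        struct btrfs_block_rsv {
                u64 size;
                u64 reserved;
                struct btrfs_space_info *space_info;
                spinlock_t lock;
                unsigned short full;
                unsigned short type;        /* the new field */
        };

        /* callers branch on the kind of reservation, not on its address */
        if (rsv->type == BTRFS_BLOCK_RSV_DELINODE)
                /* pick the delayed-inode reservation strategy */;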
* Btrfs: add hole punching (Josef Bacik, 2012-10-01; 1 file, -0/+2)

    This patch adds hole punching via fallocate. Thanks,

    Signed-off-by: Josef Bacik <jbacik@fusionio.com>
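    From userspace this surfaces through fallocate(2); a small self-contained example of punching a hole (the file name and offsets are arbitrary):

        #define _GNU_SOURCE
        #include <fcntl.h>
        #include <stdio.h>
        #include <unistd.h>
        #include <linux/falloc.h>

        int main(void)
        {
                int fd = open("testfile", O_RDWR);
                if (fd < 0) {
                        perror("open");
                        return 1;
                }
                /* PUNCH_HOLE must be paired with KEEP_SIZE: the blocks
                 * are deallocated but the file length stays the same */
                if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                              4096, 8192) < 0)
                        perror("fallocate");
                close(fd);
                return 0;
        }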
* Btrfs: do not needlessly restart the transaction for enospc (Josef Bacik, 2012-10-01; 1 file, -1/+1)

    We stop and restart a transaction every time we move to a different leaf when truncating a file. This is for enospc reasons, but really we could do a little better by working until we actually hit ENOSPC. So add a ->failfast flag to the block_rsv and set it when we do truncates; the reservation then fails as soon as the block rsv runs out of space, and at that point we can stop, restart the transaction, refill the block rsv and carry on. This makes rm'ing a file with lots of extents a bit faster. Thanks,

    Signed-off-by: Josef Bacik <jbacik@fusionio.com>
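    The flag short-circuits the reservation slow path; a sketch with illustrative names:

        ret = block_rsv_use_bytes(block_rsv, num_bytes);
        if (ret && block_rsv->failfast)
                return -ENOSPC;   /* no flushing, no retry loops: the
                                   * truncate loop commits, refills the
                                   * rsv, and picks up where it left off */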
* Btrfs: do not allocate chunks as aggressively (Josef Bacik, 2012-10-01; 1 file, -9/+3)

    Swinging this pendulum back the other way. We've been allocating chunks up to 2% of the disk no matter how much we actually have allocated. So instead fix this calculation to only allocate new chunks once more than 80% of the currently available space is allocated. Please test this, as it will likely cause all sorts of ENOSPC problems to pop up suddenly. Thanks,

    Signed-off-by: Josef Bacik <jbacik@fusionio.com>
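    The new check, sketched (helper and field names are assumptions; btrfs's div_factor(n, 8) computes 80% of n):

        static int should_alloc_chunk(struct btrfs_space_info *sinfo, u64 bytes)
        {
                u64 used = sinfo->bytes_used + sinfo->bytes_reserved +
                           sinfo->bytes_pinned + sinfo->bytes_readonly + bytes;

                /* old behavior: grow whenever total chunks < 2% of the disk;
                 * new behavior: grow only when >80% of what we have is used */
                if (used < div_factor(sinfo->total_bytes, 8))
                        return 0;
                return 1;
        }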
* Btrfs: allow delayed refs to be merged (Josef Bacik, 2012-08-28; 1 file, -0/+10)

    Daniel Blueman reported a bug with fio+balance on a ramdisk setup. Basically what happens is that balance relocates a tree block, which drops the implicit refs for all of its children and adds a full backref. Once the block is relocated we have to add the implicit refs back, so when we cow the block again we add the implicit refs for its children back. The problem comes when the original drop ref doesn't get run before we add the implicit refs back: the delayed ref code specifically prefers ADD operations over DROP, to keep us from freeing up an extent that still has references to it, so we try to add the implicit ref before it is actually removed, and we panic.

    This worked fine before because the add would have just canceled the drop out and we would have been fine. But the backref walking work needs to be able to freeze the delayed ref stuff in time, so we have an ever-increasing sequence number that gets attached to all new delayed ref updates, which keeps us from merging refs, and we run into this issue.

    So to fix this we need to merge delayed refs: every time we run a clustered ref we try to merge all of its delayed refs. The backref walking code locks the delayed ref head before processing, so if we have it locked we are safe to merge any refs inside the sequence number; if there is no sequence number we can merge all refs. Doing this not only fixes our bug but keeps the delayed ref code from adding and removing useless refs, and batches multiple refs into one search instead of one search per delayed ref, which will really help our commit times. I ran this with Daniel's test and xfstests 276 and I haven't seen any problems. Thanks,

    Reported-by: Daniel J Blueman <daniel@quora.org>
    Signed-off-by: Josef Bacik <jbacik@fusionio.com>
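    Conceptually the merge cancels equal-and-opposite updates queued on one ref head; a much-simplified sketch (the helper names are invented for illustration):

        /* try to cancel one ADD/DROP pair; caller loops until false */
        static bool merge_one_pair(struct btrfs_delayed_ref_head *head, u64 seq)
        {
                struct btrfs_delayed_ref_node *add, *drop;

                add  = find_add_ref(head);
                drop = find_matching_drop(head, add);  /* same root and parent */
                if (!add || !drop)
                        return false;
                if (seq && (add->seq > seq || drop->seq > seq))
                        return false;  /* a backref walk still needs these */

                drop_delayed_ref(head, add);   /* equal and opposite: */
                drop_delayed_ref(head, drop);  /* both can simply go away */
                return true;
        }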
* Btrfs: fix race in run_clustered_refs (Arne Jansen, 2012-08-28; 1 file, -0/+17)

    With commit d1270cd9 ("Btrfs: put back delayed refs that are too new", Arne Jansen, 2011-09-13), I added a window where the delayed ref's head->ref_mod code can diverge from the sum of the remaining refs, because we release the head->mutex in the middle. This leads to btrfs_lookup_extent_info returning wrong numbers. This patch fixes it by adjusting the head's ref_mod with each delayed ref we run.

    Signed-off-by: Arne Jansen <sensille@gmx.net>
    Signed-off-by: Chris Mason <chris.mason@fusionio.com>
* Btrfs: increase the size of the free space cache (Josef Bacik, 2012-08-28; 1 file, -8/+7)

    Arne was complaining about the space cache having mismatched generation numbers when debugging a deadlock. This is because we can run out of space in the preallocated range for the space cache if the pinned space is fairly fragmented. So just increase the amount of space we preallocate for the space cache so we can be sure to have enough. This will only really affect data ranges, since they're the only chunks that end up larger than 256MB. Thanks,

    Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    Signed-off-by: Chris Mason <chris.mason@fusionio.com>
* Btrfs: fix deadlock in wait_for_more_refs (Arne Jansen, 2012-08-28; 1 file, -56/+21)

    Commit a168650c introduced a waiting mechanism to prevent busy waiting in btrfs_run_delayed_refs. This can deadlock with btrfs_run_ordered_operations, where a tree_mod_seq is held while waiting for the io to complete, while the end_io calls btrfs_run_delayed_refs. This whole mechanism is unnecessary: if not enough runnable refs are available to satisfy count, just return, as count is more like a guideline than a strict requirement. In the case where we have to run all refs, commit transaction makes sure that no other threads are working in the transaction anymore, so we just assert here that no refs are blocked.

    Signed-off-by: Arne Jansen <sensille@gmx.net>
    Signed-off-by: Chris Mason <chris.mason@fusionio.com>
* Btrfs: unlock on error in btrfs_delalloc_reserve_metadata() (Dan Carpenter, 2012-08-28; 1 file, -1/+3)

    We should release this mutex before returning the error code.

    Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
* Btrfs: add a barrier before a waitqueue_active check (Chris Mason, 2012-07-25; 1 file, -0/+1)

    We were missing wakeups on the delayed ref waitqueue due to races on waitqueue_active.

    Signed-off-by: Chris Mason <chris.mason@fusionio.com>
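    This is the classic waitqueue_active() race: the unlocked emptiness test can be reordered against the store that changes the condition, so the waker sees an empty queue while the waiter is committing itself to sleep. The usual fix, sketched here with illustrative names:

        /* waker side */
        atomic_dec(&delayed_refs_in_flight);   /* change the condition... */
        smp_mb();                              /* ...and order it before the
                                                * unlocked queue test, pairing
                                                * with the barrier in
                                                * set_current_state() */
        if (waitqueue_active(&delayed_refs_wait))
                wake_up(&delayed_refs_wait);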
* Merge branch 'qgroup' of git://git.jan-o-sch.net/btrfs-unstable into for-linus (Chris Mason, 2012-07-25; 1 file, -9/+110)

    Conflicts:
        fs/btrfs/ioctl.c
        fs/btrfs/ioctl.h
        fs/btrfs/transaction.c
        fs/btrfs/transaction.h

    Signed-off-by: Chris Mason <chris.mason@fusionio.com>
| * Btrfs: hooks to reserve qgroup space (Arne Jansen, 2012-07-12; 1 file, -0/+12)

    Like block reserves, reserve a small piece of space on each transaction start and for delalloc. These are the hooks that can actually return EDQUOT to the user. The amount of space reserved is tracked in the transaction handle.

    Signed-off-by: Arne Jansen <sensille@gmx.net>
| * Btrfs: call the qgroup accounting functions (Jan Schmidt, 2012-07-12; 1 file, -0/+3)

    Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
| * Btrfs: qgroup implementation and prototypes (Arne Jansen, 2012-07-12; 1 file, -0/+34)

    Signed-off-by: Arne Jansen <sensille@gmx.net>
    Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
| * Btrfs: Test code to change the order of delayed-ref processing (Arne Jansen, 2012-07-10; 1 file, -0/+49)

    Normally delayed refs get processed in ascending bytenr order. This correlates in most cases to the order added. To expose dependencies on this order, we start to process the tree in the middle instead of the beginning. This code is only effective when SCRAMBLE_DELAYED_REFS is defined.

    Signed-off-by: Arne Jansen <sensille@gmx.net>
| * Btrfs: join tree mod log code with the code holding back delayed refs (Jan Schmidt, 2012-07-10; 1 file, -9/+12)

    We've got two mechanisms both required for reliable backref resolving (tree mod log and holding back delayed refs). You cannot make use of one without the other. So instead of requiring the user of this mechanism to set up both correctly, we join them into a single interface. Additionally, we stop inserting non-blockers into fs_info->tree_mod_seq_list as we did before, which was of no value.

    Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
* | Btrfs: make btrfs's allocation smoothly with preallocation (Liu Bo, 2012-07-23; 1 file, -1/+3)

    For backref walking, we've introduced the delayed ref sequence number. However, it changes our preallocation behavior. The story is that when we preallocate an extent and then mark it written piece by piece, the ideal case is that we don't need to COW the extent, which is why we use 'preallocate'. But we may not get to use the preallocation: when we check for cross refs on the extent, we may find two ref entries that have the same content except for the sequence value, and we recognize them as cross refs and COW to allocate another extent. So we end up with several pieces of space instead of a whole extent.

    Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
    Signed-off-by: Josef Bacik <jbacik@fusionio.com>
* | Btrfs: kill free_space pointer from inode structure (Li Zefan, 2012-07-23; 1 file, -12/+8)

    Inodes always allocate free space with BTRFS_BLOCK_GROUP_DATA type, which means every inode has the same BTRFS_I(inode)->free_space pointer. This shrinks struct btrfs_inode by 4 bytes (or 8 bytes on 64 bits).

    Signed-off-by: Li Zefan <lizefan@huawei.com>
* | Btrfs: add ro notification to dump_space_info (Liu Bo, 2012-07-23; 1 file, -3/+3)

    Block groups have a read-only (ro) attribute; make dump_space_info show it.

    Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
    Signed-off-by: Josef Bacik <jbacik@fusionio.com>
* | Btrfs: fix a bug of writing free space cache during balance (Liu Bo, 2012-07-23; 1 file, -3/+21)

    Here is the whole story:

    1) A free space cache consists of two parts:
       o the free space cache inode, which is special because it's stored in the root tree;
       o the free space info, which is stored as the above inode's file data.

       But we only build up another new inode and do not flush its free space info onto disk when we _clear and setup_ the free space cache, and this ends up with the block group cache's cache_state remaining DC_SETUP instead of DC_WRITTEN. Holding DC_SETUP means that we will not truncate this free space cache inode, which means the disk offset of its file extent will remain _unchanged_ at least until the next transaction finishes committing.

    2) We can set a block group read-only when we relocate it. However, if the read-only block group covers the disk offset where our free space cache inode is going to write, it will force the free space cache inode into cow_file_range() and end up hitting a BUG_ON.

    3) Given the above analysis, we fix this bug by adding the missing dirty flag.

    4) However, it's not over; there is still another case: nospace_cache. With nospace_cache we do not want to set the dirty flag; instead we just truncate the free space cache inode and bail out, setting the cache state to DC_WRITTEN. We benefit from this since it saves us another 'pre-allocation' step, which usually costs a lot.

    Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
    Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
    Signed-off-by: Josef Bacik <jbacik@fusionio.com>
* | Btrfs: do not abort transaction in prealloc case (Liu Bo, 2012-07-23; 1 file, -1/+5)

    During disk balance, we prealloc a new file extent for file data relocation, but we may fail in the 'no available space' case, and that flips btrfs into read-only. It is not necessary to bail out and abort the transaction, since we have several ways to rescue ourselves from the ENOSPC case.

    Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
    Signed-off-by: Josef Bacik <jbacik@fusionio.com>
* | Btrfs: kill root from btrfs_is_free_space_inode (Liu Bo, 2012-07-23; 1 file, -1/+1)

    Since the root can be fetched via the BTRFS_I macro directly, we can drop an argument from btrfs_is_free_space_inode().

    Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
    Signed-off-by: Josef Bacik <jbacik@fusionio.com>
* | Btrfs: rework shrink_delalloc (Josef Bacik, 2012-07-23; 1 file, -57/+24)

    So shrink_delalloc has grown all sorts of cruft over the years thanks to many reworkings of how we track enospc. What happens now as we fill up the disk is that we loop for freaking ever hoping to reclaim an arbitrary amount of metadata space; this dates from when everybody flushed at the same time, whereas now only one flusher runs at a time. So instead of trying to reclaim a huge amount of space, just try to flush a decent chunk of space and stop looping as soon as we have enough free space to satisfy our reservation. This makes xfstests 224 go much faster. Thanks,

    Signed-off-by: Josef Bacik <jbacik@fusionio.com>
* | Btrfs: change how we indicate we're adding csums (Josef Bacik, 2012-07-23; 1 file, -1/+7)

    There is weird logic I had to put in place to make sure that when we were adding csums we used the delalloc block rsv instead of the global block rsv. Part of this meant that we had to free up our transaction reservation before we ran the delayed refs, since csum deletion happens during the delayed ref work. The problem with this is that when we release a reservation we add it to the global reserve if it is not full, in order to keep us going longer before we have to force a transaction commit. By releasing our reservation before we run delayed refs, we don't get the opportunity to drain the global reserve down for the work we did, so we won't refill it as often. This isn't a problem per se; it just results in us possibly committing transactions more often, and in rare cases could cause those WARN_ON()s in use_block_rsv to pop because we ran out of space in our block rsv. This change also helps us by holding onto space while the delayed refs run, so we don't end up with as many people trying to do things at the same time, which again helps us not force commits or hit the use_block_rsv warnings. Thanks,

    Signed-off-by: Josef Bacik <jbacik@fusionio.com>
* | Btrfs: flush delayed inodes if we're short on space (Josef Bacik, 2012-07-23; 1 file, -35/+62)

    Those crazy gentoo guys have been complaining about ENOSPC errors on their portage volumes. This is because doing things like untar tends to create lots of new files, which soak up all the reservation space in the delayed inodes. Usually this gets papered over by the fact that we will try to commit the transaction; however, if this happens in the wrong spot or we choose not to commit the transaction, you will be screwed. So add the ability to explicitly flush delayed inodes to free up space. Please test this out, guys, to make sure it works, since as usual I cannot reproduce. Thanks,

    Signed-off-by: Josef Bacik <jbacik@fusionio.com>
* Btrfs: avoid waiting for delayed refs when we must not (Jan Schmidt, 2012-06-27; 1 file, -5/+6)

    We track two conditions to decide if we should sleep while waiting for more delayed refs: the number of delayed refs (num_refs) and the first entry in the list of blockers (first_seq). When we suspect staleness, we save num_refs and do one more cycle. If nothing changes, we then save first_seq for later comparison and do wait_event. We ought to save first_seq the very same moment we're saving num_refs. Otherwise we cannot be sure that nothing has changed, and we might start waiting when we shouldn't, which could lead to starvation.

    Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
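    In sketch form (the helpers are invented for illustration): the fix is to snapshot both indicators at the same instant, before the extra cycle, so the wait condition cannot already be stale when wait_event() starts.

        num_refs  = delayed_refs->num_entries;
        first_seq = first_blocker_seq(fs_info);   /* taken at the same moment */

        run_one_more_cycle();

        /* only sleep if *neither* indicator moved */
        if (delayed_refs->num_entries == num_refs)
                wait_event(fs_info->tree_mod_seq_wait,
                           first_blocker_seq(fs_info) != first_seq);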
* Merge branch 'for-chris' of git://git.jan-o-sch.net/btrfs-unstable into for-linus (Chris Mason, 2012-05-31; 1 file, -5/+5)

    Conflicts:
        fs/btrfs/ulist.h

    Signed-off-by: Chris Mason <chris.mason@oracle.com>
| * Btrfs: don't set for_cow parameter for tree block functions (Jan Schmidt, 2012-05-26; 1 file, -5/+5)

    Three callers of btrfs_free_tree_block or btrfs_alloc_tree_block passed parameter for_cow = 1. In fact, these two functions should never mark their tree modification operations as for_cow, because they can change the number of blocks referenced by a tree. Hence, we remove the extra for_cow parameter from these functions and make them pass a zero down.

    Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
* | Btrfs: convert the inode bit field to use the actual bit operations (Josef Bacik, 2012-05-30; 1 file, -5/+6)

    Miao pointed this out while I was working on an orphan problem: messing with a bitfield where different ranges are protected by different locks doesn't work out right. It turns out we've been doing this forever, with different parts of the bitfield protected by either no lock at all or different locks, which could cause all sorts of weird problems, including the issue I was hitting. So instead make a runtime_flags field that we use the normal atomic bit operations on, so we can keep our no/different locking for the different flags, and make force_compress its own field so it can be treated normally. Thanks,

    Signed-off-by: Josef Bacik <josef@redhat.com>
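    The set_bit()/test_bit() family is atomic per flag, so no shared lock is needed, unlike adjacent members of a C bitfield, which the compiler read-modify-writes as one word. A sketch (the flag name is illustrative):

        struct btrfs_inode {
                /* ... */
                unsigned long runtime_flags;   /* replaces the packed bitfield */
        };

        #define BTRFS_INODE_ORPHAN_META_RESERVED 1

        set_bit(BTRFS_INODE_ORPHAN_META_RESERVED, &inode->runtime_flags);

        if (test_and_clear_bit(BTRFS_INODE_ORPHAN_META_RESERVED,
                               &inode->runtime_flags))
                /* we were the one holding the reservation; release it */;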
* | Btrfs: cleanup: use consistent lock naming (Dan Carpenter, 2012-05-11; 1 file, -1/+1)

    It confuses Smatch that we use two names for the same lock. Plus the shorter name is nicer. This doesn't change how the code works, it's just a cleanup.

    Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
* Btrfs: avoid sleeping in verify_parent_transid while atomic (Chris Mason, 2012-05-06; 1 file, -1/+1)

    verify_parent_transid needs to lock the extent range to make sure no IO is underway, so it can safely clear the uptodate bits if our checks fail. But a few callers are using it with spinlocks held. Most of the time the generation numbers are going to match, and we don't want to switch to a blocking lock just for the error case. This adds an atomic flag to verify_parent_transid and changes it to return EAGAIN if it needs to block to properly verify things.

    Signed-off-by: Chris Mason <chris.mason@oracle.com>
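    Shape of the change, hedged (the surrounding logic is elided; treat this as a sketch rather than the function as committed):

        static int verify_parent_transid(struct extent_io_tree *io_tree,
                                         struct extent_buffer *eb,
                                         u64 parent_transid, int atomic)
        {
                if (!parent_transid ||
                    btrfs_header_generation(eb) == parent_transid)
                        return 0;       /* common case: no locking needed */

                if (atomic)
                        return -EAGAIN; /* caller holds a spinlock; it must
                                         * retry from a blocking context */

                lock_extent(io_tree, eb->start, eb->start + eb->len - 1);
                /* ... re-check; a real mismatch clears the uptodate bits ... */
                unlock_extent(io_tree, eb->start, eb->start + eb->len - 1);
                return -EIO;            /* mismatch confirmed */
        }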
* Btrfs: fix block_rsv and space_info lock ordering (Stefan Behrens, 2012-04-27; 1 file, -2/+2)

    may_commit_transaction() calls
        spin_lock(&space_info->lock);
        spin_lock(&delayed_rsv->lock);
    and update_global_block_rsv() calls
        spin_lock(&block_rsv->lock);
        spin_lock(&sinfo->lock);

    Lockdep complains about this at run time. Everywhere except in update_global_block_rsv(), the space_info lock is the outer lock; therefore the locking order in update_global_block_rsv() is changed.

    Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
    Signed-off-by: Chris Mason <chris.mason@oracle.com>
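    The fix is just to make the nesting order uniform; after it, every path takes the space_info lock first:

        spin_lock(&space_info->lock);     /* outer lock, everywhere */
        spin_lock(&block_rsv->lock);      /* inner lock */
        /* ... move bytes between the rsv and the space_info ... */
        spin_unlock(&block_rsv->lock);
        spin_unlock(&space_info->lock);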
* btrfs: don't return EINTR (Arne Jansen, 2012-04-18; 1 file, -6/+3)

    It is basically a good thing if we are interruptible when waiting for free space, but the generality in which it is implemented currently leads to system calls being interruptible that are not documented this way. For example git can't handle interrupted unlink(), leading to corrupt repos under space pressure. Instead we raise the bar to only be interruptible by SIGKILL. Thanks to David Sterba for suggesting this.

    Signed-off-by: Arne Jansen <sensille@gmx.net>
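    In kernel terms this is the difference between the two wait flavors below (the waitqueue and condition shown are illustrative):

        /* before: any signal aborts the wait, and the syscall fails -EINTR */
        ret = wait_event_interruptible(space_info->wait,
                                       space_available(space_info));

        /* after: only a fatal signal (SIGKILL) can interrupt; ordinary
         * signals are delivered after the syscall completes normally */
        ret = wait_event_killable(space_info->wait,
                                  space_available(space_info));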
* Btrfs: double unlock bug in error handling (Dan Carpenter, 2012-04-18; 1 file, -0/+2)

    The caller expects this function to return with the lock held and releases it immediately on error.

    Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
* Btrfs: use commit root when loading free space cache (Josef Bacik, 2012-04-13; 1 file, -3/+1)

    A user reported that booting his box with a btrfs root on 3.4 was way slower than on 3.3 because I removed the ideal caching code. It turns out that we don't load the free space cache if we're in a commit, for deadlock reasons; but since we're only reading the cache and it hasn't changed yet, we are safe reading the inode and free space item from the commit root. So do that, and remove all of the deadlock checks so we don't unnecessarily skip loading the free space cache. The user reported this fixed the slowness. Thanks,

    Tested-by: Calvin Walton <calvin.walton@kepstin.ca>
    Signed-off-by: Josef Bacik <josef@redhat.com>
    Signed-off-by: Chris Mason <chris.mason@oracle.com>
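    The btrfs_path flags make this a small change; a hedged sketch (the lookup call is illustrative):

        struct btrfs_path *path = btrfs_alloc_path();

        /* the cache hasn't changed in this transaction, so read the free
         * space inode and item from the last committed root: no tree
         * locks, no in-progress-commit special cases */
        path->search_commit_root = 1;
        path->skip_locking = 1;

        ret = btrfs_lookup_inode(NULL, root, path, &key, 0);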
* Btrfs: remove lock assert from get_restripe_target() (Ilya Dryomov, 2012-04-12; 1 file, -3/+2)

    This fixes a regression introduced by fc67c450. spin_is_locked() always returns 0 on UP kernels, which caused the assert in get_restripe_target() to fire on every call from btrfs_reduce_alloc_profile() on UP systems. Remove it completely for now; it's not clear whether it will be needed in the future.

    Reported-by: Bobby Powers <bobbypowers@gmail.com>
    Reported-by: Mitch Harder <mitch.harder@sabayonlinux.org>
    Tested-by: Mitch Harder <mitch.harder@sabayonlinux.org>
    Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
    Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Revert "Btrfs: increase the global block reserve estimates"Chris Mason2012-04-121-1/+1
| | | | | | | | | | This reverts commit 5500cdbe14d7435e04f66ff3cfb8ecd8b8e44ebf. We've had a number of complaints of early enospc that bisect down to this patch. We'll hae to fix the reservations differently. CC: stable@kernel.org Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: fix deadlock during allocating chunks (Liu Bo, 2012-03-29; 1 file, -0/+50)

    This deadlock comes from xfstests 251. We hold the chunk_mutex throughout the whole of a chunk allocation. But if we find that we've used up the system chunk space, we need to allocate a new system chunk, and that leads to a recursion of chunk allocation and ends up deadlocking on chunk_mutex. So instead allocate the system chunk first if we find we're near ENOSPC.

    Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
    Signed-off-by: Chris Mason <chris.mason@oracle.com>
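    The shape of the fix, as a sketch (check_system_chunk names the idea; treat the details as assumptions): top up the SYSTEM space before entering the allocation that would otherwise recurse.

        mutex_lock(&fs_info->chunk_mutex);

        /* if SYSTEM space is low, extend it now, while it is still safe:
         * doing it from inside the allocation below would re-enter chunk
         * allocation under chunk_mutex and deadlock */
        check_system_chunk(trans, extent_root, flags);

        ret = btrfs_alloc_chunk(trans, extent_root, flags);

        mutex_unlock(&fs_info->chunk_mutex);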