linux - linux

	Commit message (Collapse)	Author	Age	Files	Lines
*	Btrfs: inline csums if we're fsyncing	Josef Bacik	2012-12-17	3	-1/+21
\| \| \| \| \| \| \| \| \| \|	The tree logging stuff needs the csums to be on the ordered extents in order to log them properly, so mark that we're sync and inline the csum creation so we don't have to wait on the csumming to be done when logging extents that are still in flight. Thanks, Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
*	Btrfs: don't bother copying if we're only logging the inode	Josef Bacik	2012-12-17	1	-6/+34
\| \| \| \| \| \| \| \| \| \|	We don't copy inode items anwyay, we just copy them straight into the log from the in memory inode. So if we know we're only logging the inode, don't bother dropping anything, just try to insert it and either if it succeeds or we get EEXIST we can update the inode item in the log and carry on. Thanks, Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
*	Btrfs: only log the inode item if we can get away with it	Josef Bacik	2012-12-17	4	-2/+11
\| \| \| \| \| \| \| \| \| \|	Currently we copy all the file information into the log, inode item, the refs, xattrs etc. Except most of this doesn't change from fsync to fsync, just the inode item changes. So set a flag if an xattr changes or a link is added, and otherwise only log the inode item. Thanks, Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
*	Btrfs: rename root_times_lock to root_item_lock	Anand Jain	2012-12-17	5	-16/+16
\| \| \| \| \| \| \| \| \|	Originally root_times_lock was introduced as part of send/receive code however newly developed patch to label the subvol reused the same lock, so renaming it for a meaningful name. Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
*	btrfs: Notify udev when removing device	Lukas Czerner	2012-12-17	1	-0/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently udev does not know about the device being removed from the file system. This may result in the situation where we're unable to mount the file system by UUID or by LABEL because the by-uuid and by-label links may still point to the device which is no longer part of the btrfs file system and hence does not have any btrfs super block. It can be easily reproduced by the following: mkfs.btrfs -L bugfs /dev/loop[0-6] mount /dev/loop0 /mnt/test btrfs device delete /dev/loop0 /mnt/test umount /mnt/test mount LABEL=bugfs /mnt/test <---- this fails then see: ls -l /dev/disk/by-label/bugfs which will still point to the /dev/loop0 We did not noticed this before because libblkid would send the udev event for us when it notice that the link does not fit the reality, however it does not do that anymore and completely relies on udev information. Fix this by sending the KOBJ_CHANGE event to the bdev kobject after successful device removal. Note that this does not affect device addition, because we will open the device prior the addition from userspace and udev will notice that and reread the device afterwards. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
*	Btrfs: fix wrong return value of btrfs_truncate_page()	Miao Xie	2012-12-17	1	-2/+1
\| \| \| \| \| \| \| \| \|	ret variant may be set to 0 if we read page successfully, but it might be released before we lock it again. On this case, if we fail to allocate a new page, we will return 0, it is wrong, fix it. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
*	Btrfs: punch hole past the end of the file	Miao Xie	2012-12-17	1	-10/+12
\| \| \| \| \| \| \| \| \| \| \| \| \|	Since we can pre-allocate the space past EOF, we should be able to reclaim that space if we need. This patch implements it by removing the EOF check. Though the manual of fallocate command says we can use truncate command to reclaim the pre-allocated space which past EOF, but because truncate command changes the file size, we must run several commands to reclaim the space if we don't want to change the file size, so it is not a good choice. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
*	Btrfs: fix the page that is beyond EOF	Miao Xie	2012-12-17	1	-7/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Steps to reproduce: # mkfs.btrfs <disk> # mount <disk> <mnt> # dd if=/dev/zero of=<mnt>/<file> bs=512 seek=5 count=8 # fallocate -p -o 2048 -l 16384 <mnt>/<file> # dd if=/dev/zero of=<mnt>/<file> bs=4096 seek=3 count=8 conv=notrunc,nocreat # umount <mnt> # dmesg WARNING: at fs/btrfs/inode.c:7140 btrfs_destroy_inode+0x2eb/0x330 The reason is that we inputed a range which is beyond the end of the file. And because the end of this range was not page-aligned, we had to truncate the last page in this range, this operation is similar to a buffered file write. In other words, we reserved enough space and clear the data which was in the hole range on that page. But when we expanded that test file, write the data into the same page, we forgot that we have reserved enough space for the buffered write of that page because in most cases there is no page that is beyond the end of the file. As a result, we reserved the space twice. In fact, we needn't truncate the page if it is beyond the end of the file, just release the allocated space in that range. Fix the above problem by this way. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
*	Btrfs: fix off-by-one error of the same page check in btrfs_punch_hole()	Miao Xie	2012-12-17	1	-2/+2
\| \| \| \| \| \| \| \|	(start + len) is the start of the adjacent extent, not the end of the current extent, so we should not use it to check the hole is on the same page or not. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
*	Btrfs: fix missing reserved space release in error path of delalloc reservation	Miao Xie	2012-12-17	1	-0/+7
\| \| \| \| \| \| \| \|	We forget to release the reserved space in the error path of delalloc reservatiom, fix it. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
*	Btrfs: don't auto defrag a file when doing directIO	Miao Xie	2012-12-17	1	-3/+0
\| \| \| \| \| \| \| \|	If we runt the direct IO, we should not run auto defrag, because it may introduce buffered IO vs direcIO problem, and make direct IO slow down. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
*	Btrfs: parse parent 0 into correct value in tracepoint	Liu Bo	2012-12-17	1	-1/+2
\| \| \| \| \| \| \| \|	Value 0 is not a tree id, so besides an upper limit, a lower limit is necessary as well while parsing root types of tracepoint. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
*	Btrfs: use ctl->unit for free space calculation instead of ↵	Wang Sheng-Hui	2012-12-17	1	-11/+9
\| \| \| \| \| \| \| \| \| \|	block_group->sectorsize We should use ctl->unit for free space calculation instead of block_group->sectorsize even though for free space use_bitmap or free space cluster we only have sectorsize assigned to ctl->unit currently. Also, we can keep it consisten in code style. Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
*	Btrfs: refactor error handling to drop inode in btrfs_create()	Filipe Brandenburger	2012-12-17	1	-12/+11
\| \| \| \| \| \| \| \| \|	Refactor it by checking whether the inode has been created and needs to be dropped (drop_inode_on_err) and also if the err variable is set. That way the variable doesn't need to be set on each and every error handling block. Signed-off-by: Filipe Brandenburger <filbranden@google.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
*	Btrfs: fix permissions of empty files not affected by umask	Filipe Brandenburger	2012-12-17	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When a new file is created with btrfs_create(), the inode will initially be created with permissions 0666 and later on in btrfs_init_acl() it will be adapted to mask out the umask bits. The problem is that this change won't make it into the btrfs_inode unless there's another change to the inode (e.g. writing content changing the size or touching the file changing the mtime.) This fix adds a call to btrfs_update_inode() to btrfs_create() to make sure that the change will not get lost if the in-memory inode is flushed before other changes are made to the file. Signed-off-by: Filipe Brandenburger <filbranden@google.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
*	Btrfs: add fiemap's flag check	Tsutomu Itoh	2012-12-17	1	-0/+8
\| \| \| \| \| \| \| \| \|	When the flag not supported is specified, it is necessary to return the error to the caller. So, we add the validity check of the fiemap's flag. Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
*	Btrfs: don't add a NULL extended attribute	Liu Bo	2012-12-17	1	-0/+10
\| \| \| \| \| \| \| \|	Passing a null extended attribute value means to remove the attribute, but we don't have to add a new NULL extended attribute. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
*	Btrfs: skip adding an acl attribute if we don't have to	Liu Bo	2012-12-17	1	-0/+2
\| \| \| \| \| \| \| \|	If the acl can be exactly represented in the traditional file mode permission bits, we don't set another acl attribute. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
*	Btrfs: fix off-by-one error of the reserved size of btrfs_allocate()	Miao Xie	2012-12-17	1	-2/+2
\| \| \| \| \| \| \| \| \| \|	alloc_end is not the real end of the current extent, it is the start of the next adjoining extent. So we needn't +1 when calculating the size the space that is about to be reserved. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
*	Btrfs: use existing align macros in btrfs_allocate()	Miao Xie	2012-12-17	1	-4/+4
\| \| \| \| \| \| \| \|	The kernel developers have implemented some often-used align macros, we should use them instead of the complex code. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
*	Btrfs: fix a scrub regression in case of write errors	Stefan Behrens	2012-12-17	1	-1/+2
\| \| \| \| \| \| \| \| \| \|	This regression was introduced by the device-replace patches. Scrub immediately stops checking those disks that have write errors. This is nothing that happens in the real world, but it is wrong since scrub is the tool to detect and repair defects. Fix it. Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
*	Btrfs: fix a build warning for an unused label	Stefan Behrens	2012-12-17	1	-1/+0
\| \| \| \| \| \| \| \| \| \|	This issue was detected by the "0-DAY kernel build testing". fs/btrfs/volumes.c: In function 'btrfs_rm_device': fs/btrfs/volumes.c:1505:1: warning: label 'error_close' defined but not used [-Wunused-label] Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
*	Btrfs: fix race in check-integrity caused by usage of bitfield	Stefan Behrens	2012-12-17	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	The structure member mirror_num is modified concurrently to the structure member is_iodone. This doesn't require any locking by design, unless everything is stored in the same 32 bits of a bit field. This was the case and xfstest 284 was able to trigger false warnings from the checker code. This patch seperates the bits and fixes the race. Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
*	Btrfs: fix freeze vs auto defrag	Miao Xie	2012-12-17	1	-0/+3
\| \| \| \| \| \| \|	If we freeze the fs, the auto defragment should not run. Fix it. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
*	Btrfs: restructure btrfs_run_defrag_inodes()	Miao Xie	2012-12-17	3	-91/+109
\| \| \| \| \| \| \| \|	This patch restructure btrfs_run_defrag_inodes() and make the code of the auto defragment more readable. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
*	Btrfs: fix unprotected defragable inode insertion	Miao Xie	2012-12-17	1	-15/+55
\| \| \| \| \| \| \| \|	We forget to get the defrag lock when we re-add the defragable inode, Fix it. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
*	Btrfs: use slabs for auto defrag allocation	Miao Xie	2012-12-17	3	-5/+34
\| \| \| \| \| \| \| \| \| \|	The auto defrag allocation is in the fast path of the IO, so use slabs to improve the speed of the allocation. And besides that, it can do check for leaked objects when the module is removed. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
*	Btrfs: get write access for qgroup operations	Miao Xie	2012-12-17	1	-25/+48
\| \| \| \| \| \| \|	We need get write access for qgroup operations, or we will modify the R/O fs. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
*	Btrfs: get write access for scrub	Miao Xie	2012-12-17	1	-3/+13
\| \| \| \| \| \| \|	We need get write access for scrub, or we will modify the R/O fs. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
*	Btrfs: get write access when removing a device	Miao Xie	2012-12-17	1	-4/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Steps to reproduce: # mkfs.btrfs -d single -m single <disk0> <disk1> # mount -o ro <disk0> <mnt0> # mount -o ro <disk0> <mnt1> # mount -o remount,rw <mnt0> # umount <mnt0> # btrfs device delete <disk1> <mnt1> We can remove a device from a R/O filesystem. The reason is that we just check the R/O flag of the super block object. It is not enough, because the kernel may set the R/O flag only for the mount point. We need invoke mnt_want_write_file() to do a full check. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
*	Btrfs: get write access when doing resize fs	Miao Xie	2012-12-17	1	-2/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Steps to reproduce: # mkfs.btrfs <partition> # mount -o ro <partition> <mnt0> # mount -o ro <partition> <mnt1> # mount -o remount,rw <mnt0> # umount <mnt0> # btrfs fi resize 10g <mnt1> We re-sized a R/O filesystem. The reason is that we just check the R/O flag of the super block object. It is not enough, because the kernel may set the R/O flag only for the mount point. We need invoke mnt_want_write_file() to do a full check. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
*	Btrfs: get write access when setting the default subvolume	Miao Xie	2012-12-17	1	-12/+28
\| \| \| \| \| \| \| \|	When wen want to set the default subvolume, we must get write access, or we will change the R/O file system. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
*	Btrfs: fix wrong return value of btrfs_wait_for_commit()	Miao Xie	2012-12-17	1	-7/+8
\| \| \| \| \| \| \| \| \|	If the id of the existed transaction is more than the one we specified, it means the specified transaction was commited, so we should return 0, not EINVAL. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
*	Btrfs: don't start a new transaction when starting sync	Miao Xie	2012-12-17	2	-9/+18
\| \| \| \| \| \| \| \|	If there is no running transaction in the fs, we needn't start a new one when we want to start sync. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
*	Btrfs: pass root object into btrfs_ioctl_{start, wait}_sync()	Miao Xie	2012-12-17	1	-6/+6
\| \| \| \| \| \| \| \|	Since we have gotten the root in the caller, just pass it into btrfs_ioctl_{start, wait}_sync() directly. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
*	Btrfs: fix an while-loop of listxattr	Liu Bo	2012-12-17	1	-1/+1
\| \| \| \| \| \| \|	If we found an invalid xattr dir item, we'd better try the next one instead. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
*	Btrfs: do not warn_on io_ctl->cur in io_ctl_map_page	Wang Sheng-Hui	2012-12-17	1	-1/+0
\| \| \| \| \| \| \| \| \|	io_ctl_map_page is called by many functions in free-space-cache. In most scenarios, the ->cur is not null, e.g. io_ctl_add_entry. I think we'd better remove the warn_on here. Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
*	Btrfs: add support for device replace ioctls	Stefan Behrens	2012-12-17	2	-3/+52
\| \| \| \| \| \| \| \| \| \| \| \|	This is the commit that allows to start the device replace procedure. An ioctl() interface is added that supports starting and canceling the device replace procedure, and to retrieve the status and progress. Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
*	Btrfs: allow repair code to include target disk when searching mirrors	Stefan Behrens	2012-12-12	1	-5/+154
\| \| \| \| \| \| \| \| \| \| \| \|	Make the target disk of a running device replace operation available for reading. This is only used as a last ressort for the defect repair procedure. And it is dependent on the location of the data block to read, because during an ongoing device replace operation, the target drive is only partially filled with the filesystem data. Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
*	Btrfs: increase BTRFS_MAX_MIRRORS by one for dev replace	Stefan Behrens	2012-12-12	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	This change of the define is effective in all modes, it is required and used only in the case when a device replace procedure is running. The reason is that during an active device replace procedure, the target device of the copy operation is a mirror for the filesystem data as well that can be used to read data in order to repair read errors on other disks. Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
*	Btrfs: optionally avoid reads from device replace source drive	Stefan Behrens	2012-12-12	1	-11/+35
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It is desirable to be able to configure the device replace procedure to avoid reading the source drive (the one to be copied) whenever possible. This is useful when the number of read errors on this disk is high, because it would delay the copy procedure alot. Therefore there is an option to avoid reading from the source disk unless the repair procedure really needs to access it. The regular read req asks for mapping the block with mirror_num == 0, in this case the source disk is avoided whenever possible. The repair code selects the mirror_num explicitly (mirror_num != 0), this case is not changed by this commit. Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
*	Btrfs: changes to live filesystem are also written to replacement disk	Stefan Behrens	2012-12-12	1	-1/+49
\| \| \| \| \| \| \| \| \| \| \|	During a running dev replace operation, all write requests to the live filesystem are duplicated to also write to the target drive. Therefore btrfs_map_block() is changed to duplicate stripes that are written to the source disk of a device replace procedure to be written to the target disk as well. Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
*	Btrfs: introduce GET_READ_MIRRORS functionality for btrfs_map_block()	Stefan Behrens	2012-12-12	4	-7/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Before this commit, btrfs_map_block() was called with REQ_WRITE in order to retrieve the list of mirrors for a disk block. This needs to be changed for the device replace procedure since it makes a difference whether you are asking for read mirrors or for locations to write to. GET_READ_MIRRORS is introduced as a new interface to call btrfs_map_block(). In the current commit, the functionality is not yet changed, only the interface for GET_READ_MIRRORS is introduced and all the places that should use this new interface are adapted. The reason that REQ_WRITE cannot be abused anymore to retrieve a list of read mirrors is that during a running dev replace operation all write requests to the live filesystem are duplicated to also write to the target drive. Keep in mind that the target disk is only partially a valid copy of the source disk while the operation is ongoing. All writes go to the target disk, but not all reads would return valid data on the target disk. Therefore it is not possible anymore to abuse a REQ_WRITE interface to find valid mirrors for a REQ_READ. Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
*	Btrfs: change core code of btrfs to support the device replace operations	Stefan Behrens	2012-12-12	7	-14/+111
\| \| \| \| \| \| \| \|	This commit contains all the essential changes to the core code of Btrfs for support of the device replace procedure. Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
*	Btrfs: add new sources for device replace code	Stefan Behrens	2012-12-12	7	-1/+1069
\| \| \| \| \| \| \| \| \| \|	This adds a new file to the sources together with the header file and the changes to ioctl.h and ctree.h that are required by the new C source file. Additionally, 4 new functions are added to volume.c that deal with device creation and destruction. Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
*	Btrfs: add code to scrub to copy read data to another disk	Stefan Behrens	2012-12-12	5	-73/+851
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The device replace procedure makes use of the scrub code. The scrub code is the most efficient code to read the allocated data of a disk, i.e. it reads sequentially in order to avoid disk head movements, it skips unallocated blocks, it uses read ahead mechanisms, and it contains all the code to detect and repair defects. This commit adds code to scrub to allow the scrub code to copy read data to another disk. One goal is to be able to perform as fast as possible. Therefore the write requests are collected until huge bios are built, and the write process is decoupled from the read process with some kind of flow control, of course, in order to limit the allocated memory. The best performance on spinning disks could by reached when the head movements are avoided as much as possible. Therefore a single worker is used to interface the read process with the write process. The regular scrub operation works as fast as before, it is not negatively influenced and actually it is more or less unchanged. Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
*	Btrfs: handle errors from btrfs_map_bio() everywhere	Stefan Behrens	2012-12-12	6	-33/+65
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With the addition of the device replace procedure, it is possible for btrfs_map_bio(READ) to report an error. This happens when the specific mirror is requested which is located on the target disk, and the copy operation has not yet copied this block. Hence the block cannot be read and this error state is indicated by returning EIO. Some background information follows now. A new mirror is added while the device replace procedure is running. btrfs_get_num_copies() returns one more, and btrfs_map_bio(GET_READ_MIRROR) adds one more mirror if a disk location is involved that was already handled by the device replace copy operation. The assigned mirror num is the highest mirror number, e.g. the value 3 in case of RAID1. If btrfs_map_bio() is invoked with mirror_num == 0 (i.e., select any mirror), the copy on the target drive is never selected because that disk shall be able to perform the write requests as quickly as possible. The parallel execution of read requests would only slow down the disk copy procedure. Second case is that btrfs_map_bio() is called with mirror_num > 0. This is done from the repair code only. In this case, the highest mirror num is assigned to the target disk, since it is used last. And when this mirror is not available because the copy procedure has not yet handled this area, an error is returned. Everywhere in the code the handling of such errors is added now. Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
*	Btrfs: disallow some operations on the device replace target device	Stefan Behrens	2012-12-12	7	-18/+54
\| \| \| \| \| \| \| \|	This patch adds some code to disallow operations on the device that is used as the target for the device replace operation. Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
*	Btrfs: disallow mutually exclusive admin operations from user mode	Stefan Behrens	2012-12-12	3	-17/+40
\| \| \| \| \| \| \| \| \| \|	Btrfs admin operations that are manually started from user mode and that cannot be executed at the same time return -EINPROGRESS. A common way to enter and leave this locked section is introduced since it used to be specific to the balance operation. Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
*	Btrfs: introduce a btrfs_dev_replace_item type	Stefan Behrens	2012-12-12	2	-0/+69
\| \| \| \| \|	Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de> Signed-off-by: Chris Mason <chris.mason@fusionio.com>