linux - linux

	Commit message (Collapse)	Author	Age	Files	Lines
*	md: support updating bitmap parameters via sysfs.	NeilBrown	2009-12-14	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A new attribute directory 'bitmap' in 'md' is created which contains files for configuring the bitmap. 'location' identifies where the bitmap is, either 'none', or 'file' or 'sector offset from metadata'. Writing 'location' can create or remove a bitmap. Adding a 'file' bitmap this way is not yet supported. 'chunksize' and 'time_base' must be set before 'location' can be set. 'chunksize' can be set before creating a bitmap, but is currently always over-ridden by the bitmap superblock. 'time_base' and 'backlog' can be updated at any time. Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Andre Noll <maan@systemlinux.org>
*	md: factor out parsing of fixed-point numbers	NeilBrown	2009-12-14	1	-22/+42
\| \| \| \| \| \| \| \|	safe_delay_store can parse fixed point numbers (for fractions of a second). We will want to do that for another sysfs file soon, so factor out the code. Signed-off-by: NeilBrown <neilb@suse.de>
*	md: move offset, daemon_sleep and chunksize out of bitmap structure	NeilBrown	2009-12-14	1	-2/+5
\| \| \| \| \| \| \|	... and into bitmap_info. These are all configuration parameters that need to be set before the bitmap is created. Signed-off-by: NeilBrown <neilb@suse.de>
*	md: collect bitmap-specific fields into one structure.	NeilBrown	2009-12-14	1	-37/+42
\| \| \| \| \| \| \| \|	In preparation for making bitmap fields configurable via sysfs, start tidying up by making a single structure to contain the configuration fields. Signed-off-by: NeilBrown <neilb@suse.de>
*	md: support barrier requests on all personalities.	NeilBrown	2009-12-14	1	-2/+103
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously barriers were only supported on RAID1. This is because other levels requires synchronisation across all devices and so needed a different approach. Here is that approach. When a barrier arrives, we send a zero-length barrier to every active device. When that completes - and if the original request was not empty - we submit the barrier request itself (with the barrier flag cleared) and then submit a fresh load of zero length barriers. The barrier request itself is asynchronous, but any subsequent request will block until the barrier completes. The reason for clearing the barrier flag is that a barrier request is allowed to fail. If we pass a non-empty barrier through a striping raid level it is conceivable that part of it could succeed and part could fail. That would be way too hard to deal with. So if the first run of zero length barriers succeed, we assume all is sufficiently well that we send the request and ignore errors in the second run of barriers. RAID5 needs extra care as write requests may not have been submitted to the underlying devices yet. So we flush the stripe cache before proceeding with the barrier. Note that the second set of zero-length barriers are submitted immediately after the original request is submitted. Thus when a personality finds mddev->barrier to be set during make_request, it should not return from make_request until the corresponding per-device request(s) have been queued. That will be done in later patches. Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Andre Noll <maan@systemlinux.org>
*	md: don't reset curr_resync_completed after an interrupted resync	NeilBrown	2009-12-14	1	-1/+3
\| \| \| \| \| \| \| \| \|	If a resync/recovery/check/repair is interrupted for some reason, it can be useful to know exactly where it got up to. So in that case, do not clear curr_resync_completed. Initialise it when starting a resync/recovery/... instead. Signed-off-by: NeilBrown <neilb@suse.de>
*	md: adjust resync_min usefully when resync aborts.	NeilBrown	2009-12-14	1	-3/+7
\| \| \| \| \| \| \| \| \| \|	When a 'check' or 'repair' finished we should clear resync_min so that a future check/repair will cover the whole array (by default). However if it is interrupted, we should update resync_min to where we got up to, so that when the check/repair continues it just does the remainder of the array. Signed-off-by: NeilBrown <neilb@suse.de>
*	md/bitmap: protect against bitmap removal while being updated.	NeilBrown	2009-12-14	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A write intent bitmap can be removed from an array while the array is active. When this happens, all IO is suspended and flushed before the bitmap is removed. However it is possible that bitmap_daemon_work is still running to clear old bits from the bitmap. If it is, it can dereference the bitmap after it has been freed. So introduce a new mutex to protect bitmap_daemon_work and get it before destroying a bitmap. This is suitable for any current -stable kernel. Signed-off-by: NeilBrown <neilb@suse.de> Cc: stable@kernel.org
*	Merge git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/sysctl-2.6	Linus Torvalds	2009-12-08	1	-9/+5
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/sysctl-2.6: (43 commits) security/tomoyo: Remove now unnecessary handling of security_sysctl. security/tomoyo: Add a special case to handle accesses through the internal proc mount. sysctl: Drop & in front of every proc_handler. sysctl: Remove CTL_NONE and CTL_UNNUMBERED sysctl: kill dead ctl_handler definitions. sysctl: Remove the last of the generic binary sysctl support sysctl net: Remove unused binary sysctl code sysctl security/tomoyo: Don't look at ctl_name sysctl arm: Remove binary sysctl support sysctl x86: Remove dead binary sysctl support sysctl sh: Remove dead binary sysctl support sysctl powerpc: Remove dead binary sysctl support sysctl ia64: Remove dead binary sysctl support sysctl s390: Remove dead sysctl binary support sysctl frv: Remove dead binary sysctl support sysctl mips/lasat: Remove dead binary sysctl support sysctl drivers: Remove dead binary sysctl support sysctl crypto: Remove dead binary sysctl support sysctl security/keys: Remove dead binary sysctl support sysctl kernel: Remove binary sysctl logic ...
\| *	sysctl: Drop & in front of every proc_handler.	Eric W. Biederman	2009-11-18	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For consistency drop & in front of every proc_handler. Explicity taking the address is unnecessary and it prevents optimizations like stubbing the proc_handlers to NULL. Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Joe Perches <joe@perches.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
\| *	Merge commit 'v2.6.32-rc7'	Eric W. Biederman	2009-11-17	1	-2/+3
\| \|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Resolve the conflict between v2.6.32-rc7 where dn_def_dev_handler gets a small bug fix and the sysctl tree where I am removing all sysctl strategy routines.
\| * \|	sysctl drivers: Remove dead binary sysctl support	Eric W. Biederman	2009-11-12	1	-7/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Now that sys_sysctl is a wrapper around /proc/sys all of the binary sysctl support elsewhere in the tree is dead code. Cc: Jens Axboe <axboe@kernel.dk> Cc: Corey Minyard <minyard@acm.org> Cc: Greg Kroah-Hartman <gregkh@suse.de> Cc: Matt Mackall <mpm@selenic.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Neil Brown <neilb@suse.de> Cc: "James E.J. Bottomley" <James.Bottomley@suse.de> Acked-by: Clemens Ladisch <clemens@ladisch.de> for drivers/char/hpet.c Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
* \| \|	md: allow v0.91 metadata to record devices as being active but not in-sync.	NeilBrown	2009-11-13	1	-6/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a combination that didn't really make sense before. However when a reshape is converting e.g. raid5 -> raid6, the extra device is not fully in-sync, but is certainly active and contains important data. So allow that start to be meaningful and in particular get the 'recovery_offset' value (which is needed for any non-in-sync active device) from the reshape_position. Signed-off-by: NeilBrown <neilb@suse.de>
* \| \|	md: factor out updating of 'recovery_offset'.	NeilBrown	2009-11-12	1	-2/+8
\| \|/ \|/\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Each device has its own 'recovery_offset' showing how far recovery has progressed on the device. As the only real significance of this is that fact that it can be stored in the metadata and recovered at restart, and as only 1.x metadata can do this, we were only updating 'recovery_offset' to 'curr_resync_completed' when updating v1.x metadata. But this is wrong, and we will shortly make limited use of this field in v0.90 metadata. So move the update into common code. Signed-off-by: NeilBrown <neilb@suse.de>
* \|	md: don't clear endpoint for resync when resync is interrupted.	NeilBrown	2009-11-06	1	-2/+3
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If a 'sync_max' has been set (via sysfs), it is wrong to clear it until a resync (or reshape or recovery ...) actually reached that point. So if a resync is interrupted (e.g. by device failure), leave 'resync_max' unchanged. This is particularly important for 'reshape' operations that do not change the size of the array. For such operations mdadm needs to monitor the reshape taking rolling backups of the section being reshaped. If resync_max gets cleared, the reshape can get ahead of mdadm and then the backups that mdadm creates are useless. This is suitable for 2.6.31.y stable kernels. Cc: stable@kernel.org Signed-off-by: NeilBrown <neilb@suse.de>
*	md: Fix handling of raid5 array which is being reshaped to fewer devices.	NeilBrown	2009-10-16	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When a raid5 (or raid6) array is being reshaped to have fewer devices, conf->raid_disks is the latter and hence smaller number of devices. However sometimes we want to use a number which is the total number of currently required devices - the larger of the 'old' and 'new' sizes. Before we implemented reducing the number of devices, this was always 'new' i.e. ->raid_disks. Now we need max(raid_disks, previous_raid_disks) in those places. This particularly affects assembling an array that was shutdown while in the middle of a reshape to fewer devices. md.c needs a similar fix when interpreting the md metadata. Signed-off-by: NeilBrown <neilb@suse.de>
*	md: report device as congested when suspended	NeilBrown	2009-09-23	1	-0/+6
\| \| \| \| \| \| \|	This should writeback from coming when the device is temporarily suspended. Signed-off-by: NeilBrown <neilb@suse.de>
*	md: Improve name of threads created by md_register_thread	NeilBrown	2009-09-23	1	-3/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The management thread for raid4,5,6 arrays are all called mdX_raid5, independent of the actual raid level, which is wrong and can be confusion. So change md_register_thread to use the name from the personality unless no alternate name (like 'resync' or 'reshape') is given. This is simpler and more correct. Cc: Jinzc <zhenchengjin@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de>
*	md: remove sparse waring "symbol xxx shadows an earlier one"	NeilBrown	2009-09-23	1	-5/+5
\| \| \| \| \| \| \|	Rename some variable and remove some duplicate definitions to avoid there warnings. None of them are actual errors. Signed-off-by: NeilBrown <neilb@suse.de>
*	const: make block_device_operations const	Alexey Dobriyan	2009-09-22	1	-2/+2
\| \| \| \| \| \|	Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	Fix new incorrect error return from do_md_stop.	NeilBrown	2009-08-18	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \|	Recent commit c8c00a6915a2e3d10416e8bdd3138429beb96210 changed the exit paths in do_md_stop and was not quite careful enough. There is one path were 'err' now needs to be cleared but it isn't. So setting an array to readonly (with mdadm --readonly) will work, but will incorrectly report and error: ENXIO. Signed-off-by: NeilBrown <neilb@suse.de>
*	md: allow upper limit for resync/reshape to be set when array is read-only	NeilBrown	2009-08-13	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Normally we only allow the upper limit for a reshape to be decreased when the array not performing a sync/recovery/reshape, otherwise there could be races. But if an array is part-way through a reshape when it is assembled the reshape is started immediately leaving no window to set an upper bound. If the array is started read-only, the reshape will be suspended until the array becomes writable, so that provides a window during which it is perfectly safe to reduce the upper limit of a reshape. So: allow the upper limit (sync_max) to be reduced even if the reshape thread is running, as long as the array is still read-only. Signed-off-by: NeilBrown <neilb@suse.de>
*	md: never advance 'events' counter by more than 1.	NeilBrown	2009-08-13	1	-8/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When assembling arrays, md allows two devices to have different event counts as long as the difference is only '1'. This is to cope with a system failure between updating the metadata on two difference devices. However there are currently times when we update the event count by 2. This was done to keep the event count even when the array is clean and odd when it is dirty, which allows us to avoid writing common update to spare devices and so allow those spares to go to sleep. This is bad for the above reason. So change it to never increase by two. This means that the alignment between 'odd/even' and 'clean/dirty' might take a little longer to attain, but that is only a small cost. The spares will get a few more updates but that will still be spared (;-) most updates and can still go to sleep. Prior to this patch there was a small chance that after a crash an array would fail to assemble due to the overly large event count mismatch. Signed-off-by: NeilBrown <neilb@suse.de>
*	Remove deadlock potential in md_open	NeilBrown	2009-08-10	1	-8/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A recent commit: commit 449aad3e25358812c43afc60918c5ad3819488e7 introduced the possibility of an A-B/B-A deadlock between bd_mutex and reconfig_mutex. __blkdev_get holds bd_mutex while calling md_open which takes reconfig_mutex, do_md_run is always called with reconfig_mutex held, and it now takes bd_mutex in the call the revalidate_disk. This potential deadlock was not caught by lockdep due to the use of mutex_lock_interruptible_nexted which was introduced by commit d63a5a74dee87883fda6b7d170244acaac5b05e8 do avoid a warning of an impossible deadlock. It is quite possible to split reconfig_mutex in to two locks. One protects the array data structures while it is being reconfigured, the other ensures that an array is never even partially open while it is being deactivated. In particular, the second lock prevents an open from completing between the time when do_md_stop checks if there are any active opens, and the time when the array is either set read-only, or when ->pers is set to NULL. So we can be certain that no IO is in flight as the array is being destroyed. So create a new lock, open_mutex, just to ensure exclusion between 'open' and 'stop'. This avoids the deadlock and also avoids the lockdep warning mentioned in commit d63a5a74d Reported-by: "Mike Snitzer" <snitzer@gmail.com> Reported-by: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: NeilBrown <neilb@suse.de>
*	md: Use revalidate_disk to effect changes in size of device.	NeilBrown	2009-08-03	1	-23/+5
\| \| \| \| \| \| \| \| \| \| \| \|	As revalidate_disk calls check_disk_size_change, it will cause any capacity change of a gendisk to be propagated to the blockdev inode. So use that instead of mucking about with locks and i_size_write. Also add a call to revalidate_disk in do_md_run and a few other places where the gendisk capacity is changed. Signed-off-by: NeilBrown <neilb@suse.de>
*	md: Handle growth of v1.x metadata correctly.	NeilBrown	2009-08-03	1	-1/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The v1.x metadata does not have a fixed size and can grow when devices are added. If it grows enough to require an extra sector of storage, we need to update the 'sb_size' to match. Without this, md can write out an incomplete superblock with a bad checksum, which will be rejected when trying to re-assemble the array. Cc: stable@kernel.org Signed-off-by: NeilBrown <neilb@suse.de>
*	md: avoid array overflow with bad v1.x metadata	NeilBrown	2009-08-03	1	-1/+6
\| \| \| \| \| \| \| \|	We trust the 'desc_nr' field in v1.x metadata enough to use it as an index in an array. This isn't really safe. So range-check the value first. Signed-off-by: NeilBrown <neilb@suse.de>
*	md: when a level change reduces the number of devices, remove the excess.	NeilBrown	2009-08-03	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \|	When an array is changed from RAID6 to RAID5, fewer drives are needed. So any device that is made superfluous by the level conversion must be marked as not-active. For the RAID6->RAID5 conversion, this will be a drive which only has 'Q' blocks on it. Cc: stable@kernel.org Signed-off-by: NeilBrown <neilb@suse.de>
*	md: Push down data integrity code to personalities.	Andre Noll	2009-08-03	1	-30/+64
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch replaces md_integrity_check() by two new public functions: md_integrity_register() and md_integrity_add_rdev() which are both personality-independent. md_integrity_register() is called from the ->run and ->hot_remove methods of all personalities that support data integrity. The function iterates over the component devices of the array and determines if all active devices are integrity capable and if their profiles match. If this is the case, the common profile is registered for the mddev via blk_integrity_register(). The second new function, md_integrity_add_rdev() is called from the ->hot_add_disk methods, i.e. whenever a new device is being added to a raid array. If the new device does not support data integrity, or has a profile different from the one already registered, data integrity for the mddev is disabled. For raid0 and linear, only the call to md_integrity_register() from the ->run method is necessary. Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: NeilBrown <neilb@suse.de>
*	Remove multiple KERN_ prefixes from printk formats	Joe Perches	2009-07-08	1	-8/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 5fd29d6ccbc98884569d6f3105aeca70858b3e0f ("printk: clean up handling of log-levels and newlines") changed printk semantics. printk lines with multiple KERN_<level> prefixes are no longer emitted as before the patch. <level> is now included in the output on each additional use. Remove all uses of multiple KERN_<level>s in formats. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	md: use interruptible wait when duration is controlled by userspace.	NeilBrown	2009-07-01	1	-4/+10
\| \| \| \| \| \| \| \| \| \| \|	User space can set various limits on an md array so that resync waits when it gets to a certain point, or so that I/O is blocked for a short while. When md is waiting against one of these limit, it should use an interruptible wait so as not to add to the load average, and so are not to trigger a warning if the wait goes on for too long. Signed-off-by: NeilBrown <neilb@suse.de>
*	md: tidy up error paths in md_alloc	NeilBrown	2009-07-01	1	-20/+18
\| \| \| \| \| \| \| \| \|	As the recent bug in md_alloc showed, having a single exit path for unlocking and putting is a good idea. So restructure md_alloc to have a single mutex_unlock and mddev_put, and use gotos where necessary. Found-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de>
*	md: fix error path when duplicate name is found on md device creation.	NeilBrown	2009-07-01	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \|	When an md device is created by name (rather than number) we need to check that the name is not already in use. If this check finds a duplicate, we return an error without dropping the lock or freeing the newly create mddev. This patch fixes that. Cc: stable@kernel.org Found-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de>
*	md: avoid dereferencing NULL pointer when accessing suspend_* sysfs attributes.	NeilBrown	2009-07-01	1	-2/+4
\| \| \| \| \| \| \| \| \| \|	If we try to modify one of the md/ sysfs files suspend_lo or suspend_hi when the array is not active, we dereference a NULL. Protect against that. Cc: stable@kernel.org Signed-off-by: NeilBrown <neilb@suse.de>
*	md: Move check for bitmap presence to personality code.	Andre Noll	2009-06-18	1	-22/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If the superblock of a component device indicates the presence of a bitmap but the corresponding raid personality does not support bitmaps (raid0, linear, multipath, faulty), then something is seriously wrong and we'd better refuse to run such an array. Currently, this check is performed while the superblocks are examined, i.e. before entering personality code. Therefore the generic md layer must know which raid levels support bitmaps and which do not. This patch avoids this layer violation without adding identical code to various personalities. This is accomplished by introducing a new public function to md.c, md_check_no_bitmap(), which replaces the hard-coded checks in the superblock loading functions. A call to md_check_no_bitmap() is added to the ->run method of each personality which does not support bitmaps and assembly is aborted if at least one component device contains a bitmap. Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: NeilBrown <neilb@suse.de>
*	md: remove chunksize rounding from common code.	NeilBrown	2009-06-18	1	-49/+3
\| \| \| \| \| \| \| \| \| \| \| \| \|	It is easiest to round sizes to multiples of chunk size in the personality code for those personalities which care. Those personalities now do the rounding, so we can remove that function from common code. Also remove the upper bound on the size of a chunk, and the lower bound on the size of a device (1 chunk), neither of which really buy us anything. Signed-off-by: NeilBrown <neilb@suse.de>
*	md: move assignment of ->utime so that it never gets skipped.	NeilBrown	2009-06-18	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Currently the assignment to utime gets skipped for 'external' metadata. So move it to the top of the function so that it always gets effected. This is of largely cosmetic interest. Nothing actually depends on ->utime being right for external arrays. "mdadm --monitor" does use it for 0.90 and 1.x arrays, but with mdadm-3.0, this is not important for external metadata. Signed-off-by: NeilBrown <neilb@suse.de>
*	md: Push down reconstruction log message to personality code.	Andre Noll	2009-06-18	1	-9/+0
\| \| \| \| \| \| \| \| \| \| \| \|	Currently, the md layer checks in analyze_sbs() if the raid level supports reconstruction (mddev->level >= 1) and if reconstruction is in progress (mddev->recovery_cp != MaxSector). Move that printk into the personality code of those raid levels that care (levels 1, 4, 5, 6, 10). Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: NeilBrown <neilb@suse.de>
*	md: merge reconfig and check_reshape methods.	NeilBrown	2009-06-18	1	-7/+8
\| \| \| \| \| \| \| \| \| \| \| \|	The difference between these two methods is artificial. Both check that a pending reshape is valid, and perform any aspect of it that can be done immediately. 'reconfig' handles chunk size and layout. 'check_reshape' handles raid_disks. So make them just one method. Signed-off-by: NeilBrown <neilb@suse.de>
*	md: remove unnecessary arguments from ->reconfig method.	NeilBrown	2009-06-18	1	-6/+17
\| \| \| \| \| \| \| \| \| \|	Passing the new layout and chunksize as args is not necessary as the mddev has fields for new_check and new_layout. This is preparation for combining the check_reshape and reconfig methods Signed-off-by: NeilBrown <neilb@suse.de>
*	md: Convert mddev->new_chunk to sectors.	Andre Noll	2009-06-18	1	-14/+15
\| \| \| \| \| \| \| \| \| \|	A straight-forward conversion which gets rid of some multiplications/divisions/shifts. The patch also introduces a couple of new ones, most of which are due to conf->chunk_size still being represented in bytes. This will be cleaned up in subsequent patches. Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: NeilBrown <neilb@suse.de>
*	md: Make mddev->chunk_size sector-based.	Andre Noll	2009-06-18	1	-25/+26
\| \| \| \| \| \| \| \| \| \| \| \| \|	This patch renames the chunk_size field to chunk_sectors with the implied change of semantics. Since is_power_of_2(chunk_size) = is_power_of_2(chunk_sectors << 9) = is_power_of_2(chunk_sectors) these bits don't need an adjustment for the shift. Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: NeilBrown <neilb@suse.de>
*	md: prepare for non-power-of-two chunk sizes	raz ben yehuda	2009-06-16	1	-14/+15
\| \| \| \| \| \| \| \| \| \|	Remove chunk size check from md as this is now performed in the run function in each personality. Replace chunk size power 2 code calculations by a regular division. Signed-off-by: raziebe@gmail.com Signed-off-by: NeilBrown <neilb@suse.de>
*	Merge branch 'for-2.6.31' of git://git.kernel.dk/linux-2.6-block	Linus Torvalds	2009-06-11	1	-1/+1
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* 'for-2.6.31' of git://git.kernel.dk/linux-2.6-block: (153 commits) block: add request clone interface (v2) floppy: fix hibernation ramdisk: remove long-deprecated "ramdisk=" boot-time parameter fs/bio.c: add missing __user annotation block: prevent possible io_context->refcount overflow Add serial number support for virtio_blk, V4a block: Add missing bounce_pfn stacking and fix comments Revert "block: Fix bounce limit setting in DM" cciss: decode unit attention in SCSI error handling code cciss: Remove no longer needed sendcmd reject processing code cciss: change SCSI error handling routines to work with interrupts enabled. cciss: separate error processing and command retrying code in sendcmd_withirq_core() cciss: factor out fix target status processing code from sendcmd functions cciss: simplify interface of sendcmd() and sendcmd_withirq() cciss: factor out core of sendcmd_withirq() for use by SCSI error handling code cciss: Use schedule_timeout_uninterruptible in SCSI error handling code block: needs to set the residual length of a bidi request Revert "block: implement blkdev_readpages" block: Fix bounce limit setting in DM Removed reference to non-existing file Documentation/PCI/PCI-DMA-mapping.txt ... Manually fix conflicts with tracing updates in: block/blk-sysfs.c drivers/ide/ide-atapi.c drivers/ide/ide-cd.c drivers/ide/ide-floppy.c drivers/ide/ide-tape.c include/trace/events/block.h kernel/trace/blktrace.c
\| *	block: Do away with the notion of hardsect_size	Martin K. Petersen	2009-05-22	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Until now we have had a 1:1 mapping between storage device physical block size and the logical block sized used when addressing the device. With SATA 4KB drives coming out that will no longer be the case. The sector size will be 4KB but the logical block size will remain 512-bytes. Hence we need to distinguish between the physical block size and the logical ditto. This patch renames hardsect_size to logical_block_size. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* \|	md: don't use locked_ioctl.	NeilBrown	2009-05-26	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	md has no need for the BKL - it does its own locking. So md_ioctl doesn't need to be a locked_ioctl. Signed-off-by: NeilBrown <neilb@suse.de>
* \|	md: don't update curr_resync_completed without also updating reshape_position.	NeilBrown	2009-05-26	1	-6/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In order for the metadata to always be consistent, we mustn't updated curr_resync_completed without also updating reshape_position. The reshape code updates both at the same time. However since commit 97e4f42d62badb0f9fbc27c013e89bc1336a03bc the common md_do_sync will sometimes update curr_resync_completed but is not in a position to update reshape_position. So if MD_RECOVERY_RESHAPE is set (indicating that a reshape is happening, so reshape_position might change), don't update curr_resync_completed in md_do_sync, leave it to the per-personality reshape code. Signed-off-by: NeilBrown <neilb@suse.de>
* \|	md: export 'frozen' resync state through sysfs	NeilBrown	2009-05-26	1	-2/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The md resync engine has a 'frozen' state which ensures that no resync/recovery. This is used to avoid races. Export this state through the 'sync_action' sysfs attribute so that user-space can benefit and also avoid some races. Signed-off-by: NeilBrown <neilb@suse.de>
* \|	md: improve errno return when setting array_size	NeilBrown	2009-05-26	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Instead of always returns EINVAL if anything goes wrong when setting the array size, add the option of E2BIG if the size requested is too large. This makes it easier for user-space to be sure what went wrong. Signed-off-by: NeilBrown <neilb@suse.de>
* \|	md: always update level / chunk_size / layout when writing v1.x metadata.	NeilBrown	2009-05-26	1	-0/+3
\|/ \| \| \| \| \| \| \| \|	We previously didn't update these fields when writing the metadata because they could never change. They can now, so we better write them. v0.90 metadata always updated these fields. Signed-off-by: NeilBrown <neilb@suse.de>