summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* quota: Convert __DQUOT_PARANOIA symbol to standard config optionJan Kara2010-04-202-9/+15
| | | | | | | | | | | | Make __DQUOT_PARANOIA define from the old days a standard config option and turn it off by default. This gets rid of a quota warning about writes before quota is turned on for systems with ext4 root filesystem. Currently there's no way to legally solve this because /etc/mtab has to be written before quota is turned on on most systems. Signed-off-by: Jan Kara <jack@suse.cz>
* quota: Fix possible dq_flags corruptionAndrew Perepechko2010-04-121-6/+6
| | | | | | | | | | dq_flags are modified non-atomically in do_set_dqblk via __set_bit calls and atomically for example in mark_dquot_dirty or clear_dquot_dirty. Hence a change done by an atomic operation can be overwritten by a change done by a non-atomic one. Fix the problem by using atomic bitops even in do_set_dqblk. Signed-off-by: Andrew Perepechko <andrew.perepechko@sun.com> Signed-off-by: Jan Kara <jack@suse.cz>
* quota: Hide warnings about writes to the filesystem before quota was turned onJan Kara2010-04-121-0/+6
| | | | | | | | | | For a root filesystem write to the filesystem before quota is turned on happens regularly and there's no way around it because of writes to syslog, /etc/mtab, and similar. So the warning is rather pointless for ordinary users. It's still useful during development so we just hide the warning behind __DQUOT_PARANOIA config option. Signed-off-by: Jan Kara <jack@suse.cz>
* ext3: symlink must be handled via filesystem specific operationDmitry Monakhov2010-04-121-0/+2
| | | | | | | | generic setattr implementation is no longer responsible for quota transfer so synlinks must be handled via ext3_setattr. Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: Jan Kara <jack@suse.cz>
* ext2: symlink must be handled via filesystem specific operationDmitry Monakhov2010-04-121-0/+2
| | | | | | | | generic setattr implementation is no longer responsible for quota transfer so synlinks must be handled via ext2_setattr. Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: Jan Kara <jack@suse.cz>
* Merge branch 'for-linus' of ↵Linus Torvalds2010-04-094-7/+7
|\ | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: IB/mlx4: Check correct variable for allocation failure RDMA/nes: Correct cap.max_inline_data assignment in nes_query_qp() RDMA/cm: Set num_paths when manually assigning path records IB/cm: Fix device_create() return value check
| *---. Merge branches 'cma', 'misc', 'mlx4' and 'nes' into for-linusRoland Dreier2010-04-093-7/+6
| |\ \ \
| | | | * RDMA/nes: Correct cap.max_inline_data assignment in nes_query_qp()Chien Tung2010-04-071-5/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | cap.max_inline_data is incorrectly set in init_attr instead of attr. Set it in attr so subsequent init_attr.cap assignment will get the correct value. Signed-off-by: Chien Tung <chien.tin.tung@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| | | * | IB/mlx4: Check correct variable for allocation failureDan Carpenter2010-04-071-1/+1
| | | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | The intent here is to check the "mfrpl->mapped_page_list" allocation. We checked "mfrpl->ibfrpl.page_list" earlier. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| | * / IB/cm: Fix device_create() return value checkJani Nikula2010-03-311-1/+1
| | |/ | | | | | | | | | | | | | | | | | | Use IS_ERR() instead of comparing to NULL. Signed-off-by: Jani Nikula <ext-jani.1.nikula@nokia.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * / RDMA/cm: Set num_paths when manually assigning path recordsSean Hefty2010-04-071-0/+1
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | When manually assigning the path records to use for a connection, save the number of paths that were set. Otherwise, checks against num_path will show 0, even though path record data is available. This was discovered by manually setting the path records from user space, then querying the kernel to see if the correct path records were assigned, only to discover that the kernel returned 0 path records to the query. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* | Merge branch 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6Linus Torvalds2010-04-098-26/+55
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6: [S390] Update default configuration. [S390] nss: add missing .previous statement to asm function [S390] increase default size of vmalloc area [S390] s390: disable change bit override [S390] fix io_return critical section cleanup [S390] sclp_async: potential buffer overflow [S390] arch/s390/kernel: Add missing unlock
| * | [S390] Update default configuration.Martin Schwidefsky2010-04-091-10/+30
| | | | | | | | | | | | Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | [S390] nss: add missing .previous statement to asm functionHeiko Carstens2010-04-091-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The savesys_ipl_nss asm function is put into the .init.text section however it is missing a ".previous" section which would restore the previous section. Luckily all functions in early.c are init functions so it doesn't matter currently. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | [S390] increase default size of vmalloc areaMartin Schwidefsky2010-04-091-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The default size of the vmalloc area is currently 1 GB. The memory resource controller uses about 10 MB of vmalloc space per gigabyte of memory. That turns a system with more than ~100 GB memory unbootable with the default vmalloc size. It costs us nothing to increase the default size to some more adequate value, e.g. 128 GB. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | [S390] s390: disable change bit overrideChristian Borntraeger2010-04-091-8/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 6a985c6194017de2c062916ad1cd00dee0302c40 ([S390] s390: use change recording override for kernel mapping) deactivated the change bit recording for the kernel mapping to improve the performance. This works most of the time, but there are cases (e.g. kernel runs in home space, futex atomic compare xcmg) where we modify user memory with the kernel mapping instead of the user mapping. Instead of fixing these cases, this patch just deactivates change bit override to avoid future problems with other kernel code that might use the kernel mapping for user memory. CC: stable@kernel.org Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | [S390] fix io_return critical section cleanupMartin Schwidefsky2010-04-092-2/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a machine check interrupts the io interrupt handler on one of the instructions between io_return and io_leave the critical section cleanup code will move the return psw to io_work_loop. By doing that the switch from the asynchronous interrupt stack to the process stack is skipped. If e.g. TIF_NEED_RESCHED is set things break because the scheduler is called with the asynchronous interrupts stack. Moving the psw back to io_return instead fixes the problem. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | [S390] sclp_async: potential buffer overflowDan Carpenter2010-04-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | "len" hasn't been properly range checked so we shouldn't use it as an array offset. This can only be written to by root but it would still be annoying to accidentally write more than 3 characters and corrupt your memory. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | [S390] arch/s390/kernel: Add missing unlockJulia Lawall2010-04-091-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the default case the lock is not unlocked. The return is converted to a goto, to share the unlock at the end of the function. A simplified version of the semantic patch that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @r exists@ expression E1; identifier f; @@ f (...) { <+... * spin_lock_irq (E1,...); ... when != E1 * return ...; ...+> } // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* | | Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-blockLinus Torvalds2010-04-0934-178/+475
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-linus' of git://git.kernel.dk/linux-2.6-block: (34 commits) cfq-iosched: Fix the incorrect timeslice accounting with forced_dispatch loop: Update mtime when writing using aops block: expose the statistics in blkio.time and blkio.sectors for the root cgroup backing-dev: Handle class_create() failure Block: Fix block/elevator.c elevator_get() off-by-one error drbd: lc_element_by_index() never returns NULL cciss: unlock on error path cfq-iosched: Do not merge queues of BE and IDLE classes cfq-iosched: Add additional blktrace log messages in CFQ for easier debugging i2o: Remove the dangerous kobj_to_i2o_device macro block: remove 16 bytes of padding from struct request on 64bits cfq-iosched: fix a kbuild regression block: make CONFIG_BLK_CGROUP visible Remove GENHD_FL_DRIVERFS block: Export max number of segments and max segment size in sysfs block: Finalize conversion of block limits functions block: Fix overrun in lcm() and move it to lib vfs: improve writeback_inodes_wb() paride: fix off-by-one test drbd: fix al-to-on-disk-bitmap for 4k logical_block_size ...
| * | | cfq-iosched: Fix the incorrect timeslice accounting with forced_dispatchDivyesh Shah2010-04-091-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When CFQ dispatches requests forcefully due to a barrier or changing iosched, it runs through all cfqq's dispatching requests and then expires each queue. However, it does not activate a cfqq before flushing its IOs resulting in using stale values for computing slice_used. This patch fixes it by calling activate queue before flushing reuqests from each queue. This is useful mostly for barrier requests because when the iosched is changing it really doesnt matter if we have incorrect accounting since we're going to break down all structures anyway. We also now expire the current timeslice before moving on with the dispatch to accurately account slice used for that cfqq. Signed-off-by: Divyesh Shah<dpshah@google.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | loop: Update mtime when writing using aopsNikanth Karthikesan2010-04-081-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Update mtime when writing to backing filesystem using the address space operations write_begin and write_end. Signed-off-by: Nikanth Karthikesan <knikanth@suse.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | block: expose the statistics in blkio.time and blkio.sectors for the root cgroupRicky Benitez2010-04-051-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, the io statistics for the root cgroup are maintained, but they are not shown because the device information is not available at the point that the root blkio cgroup is created. This patch updates the device information when the statistics are updated so that the statistics become visible. Signed-off-by: Ricky Benitez <rickyb@google.com> Acked-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | backing-dev: Handle class_create() failureAnton Blanchard2010-04-021-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I hit this when we had a bug in IDR for a few days. Basically sysfs would fail to create new inodes since it uses an IDR and therefore class_create would fail. While we are unlikely to see this fail we may as well handle it instead of oopsing. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | Block: Fix block/elevator.c elevator_get() off-by-one errorwzt.wzt@gmail.com2010-04-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | elevator_get() not check the name length, if the name length > sizeof(elv), elv will miss the '\0'. And elv buffer will be replace "-iosched" as something like aaaaaaaaa, then call request_module() can load an not trust module. Signed-off-by: Zhitong Wang <zhitong.wangzt@alibaba-inc.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | drbd: lc_element_by_index() never returns NULLPhilipp Reisner2010-04-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | cciss: unlock on error pathDan Carpenter2010-04-021-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We take the spin_lock again in fail_all_cmds() so we need to unlock here. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | cfq-iosched: Do not merge queues of BE and IDLE classesDivyesh Shah2010-03-251-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Even if they are found to be co-operating. The prio_trees do not have any IDLE cfqqs on them. cfq_close_cooperator() is called from cfq_select_queue() and cfq_completed_request(). The latter ensures that the close cooperator code does not get invoked if the current cfqq is of class IDLE but the former doesn't seem to have any such checks. So an IDLE cfqq may get merged with a BE cfqq from the same group which should be avoided. Signed-off-by: Divyesh Shah<dpshah@google.com> Acked-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | cfq-iosched: Add additional blktrace log messages in CFQ for easier debuggingDivyesh Shah2010-03-251-3/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These have helped us debug some issues we've noticed in earlier IO controller versions and should be useful now as well. The extra logging covers: - idling behavior. Since there are so many conditions based on which we decide to idle or not, this patch adds a log message for some conditions that we've found useful. - workload slices and current prio and workload type Changelog from v1: o moved log message from cfq_set_active_queue() to __cfq_set_active_queue() o changed queue_count to st->count Signed-off-by: Divyesh Shah<dpshah@google.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | i2o: Remove the dangerous kobj_to_i2o_device macroFerenc Wagner2010-03-241-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This macro worked only when applied to variables named 'kobj'. While this could have been fixed by simply renaming the macro argument, a more type-safe replacement by an inline function would be preferred. However, nobody uses this macro, so it's simpler to just remove it. Signed-off-by: Ferenc Wagner <wferi@niif.hu> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | block: remove 16 bytes of padding from struct request on 64bitsRichard Kennedy2010-03-191-5/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove alignment padding to shrink struct request from 336 to 320 bytes so needing one fewer cacheline and therefore removing 48 bytes from struct request_queue. Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | Merge branch 'master' into for-linusJens Axboe2010-03-191756-49229/+63811
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: block/Kconfig Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | | cfq-iosched: fix a kbuild regressionShaohua Li2010-03-191-5/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Alex Shi reported a kbuild regression which is about 10% performance lost. He bisected to this commit: 3dde36ddea3e07dd025c4c1ba47edec91606fec0. The reason is cfqq_close() can't find close cooperator. Restoring cfq_rq_close()'s threshold to original value makes the regression go away. Since for_preempt parameter isn't used anymore, this patch deletes it. Reported-by: Alex Shi <alex.shi@intel.com> Signed-off-by: Shaohua Li <shaohua.li@intel.com> Acked-by: Corrado Zoccolo <czoccolo@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | | block: make CONFIG_BLK_CGROUP visibleLi Zefan2010-03-161-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make the config visible, so we can choose from CONFIG_BLK_CGROUP=y and CONFIG_BLK_CGROUP=m when CONFIG_IOSCHED_CFQ=m. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | | Remove GENHD_FL_DRIVERFSNeilBrown2010-03-162-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This flag is not used, so best discarded. Signed-off-by: NeilBrown <neilb@suse.de> -- Hi Jens, I came across this recently - these are the only two occurances of "GENHD_FL_DRIVERFS" in the kernel, so it cannot be needed. NeilBrown Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | | block: Export max number of segments and max segment size in sysfsMartin K. Petersen2010-03-151-0/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These two values are useful when debugging issues surrounding maximum I/O size. Put them in sysfs with the rest of the queue limits. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | | block: Finalize conversion of block limits functionsMartin K. Petersen2010-03-153-28/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove compatibility wrappers and update remaining drivers. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | | block: Fix overrun in lcm() and move it to libMartin K. Petersen2010-03-154-11/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | lcm() was defined to take integer-sized arguments. The supplied arguments are multiplied, however, causing us to overflow given sufficiently large input. That in turn led to incorrect optimal I/O size reporting in some cases (RAID over RAID). Switch lcm() over to unsigned long similar to gcd() and move the function from blk-settings.c to lib. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | | vfs: improve writeback_inodes_wb()Edward Shishkin2010-03-122-60/+76
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Do not pin/unpin superblock for every inode in writeback_inodes_wb(), pin it for the whole group of inodes which belong to the same superblock and call writeback_sb_inodes() handler for them. Signed-off-by: Edward Shishkin <edward.shishkin@gmail.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | | paride: fix off-by-one testRoel Kluin2010-03-123-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With `while (j++ < PX_SPIN)' j reaches PX_SPIN + 1 after the loop. This is probably unlikely to produce a problem. Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | | drbd: fix al-to-on-disk-bitmap for 4k logical_block_sizeLars Ellenberg2010-03-111-4/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Up to now, applying the in-core activity-log to the on-disk bitmap did not care for logical_block_size. On logical_block_size != 512 byte, this very likely results in misalligned block access and spurious "io errors". We now simply always submit aligned whole 4k blocks, fixing this for logical block sizes of 512, 1024, 2048 and 4096. For even larger logical block sizes, this won't work. But I'm not aware of devices with such properties being available. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
| * | | | drbd: Renamed overwrite_peer to primary_forcePhilipp Reisner2010-03-112-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
| * | | | drbd: Forcing primary should also work for Consistent disks [Bugz 266]Philipp Reisner2010-03-111-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Up to now this only worked for Outdated and Inconsistent disks, that it did not worked for Consistent disks was an inconsistent omission. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
| * | | | drbd: Make sure we do not send state updates during an empty resync [Bugz 271]Philipp Reisner2010-03-111-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a race condition that existed for ages. The previous commit reduces the window, this one closes it. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
| * | | | drbd: Reduce the time an empty resync takes usuallyPhilipp Reisner2010-03-113-3/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This mitigates changes introduced with commit: http://git.drbd.org/?p=drbd-8.3.git;a=commit;h=4b6803a3276652da3737 Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
| * | | | drbd: add missing drbd command names to avoid <NULL> in error messagesLars Ellenberg2010-03-111-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | cmdname() should map command number to its human readable representation. The string table was incomplete, though. Maybe rather do a switch() block, and let the compiler help us to keep it complete? Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
| * | | | drbd_disconnect: grab meta.socket mutex as wellLars Ellenberg2010-03-112-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes a race and potential kernel panic if e.g. the worker was just about to send a few P_RS_IS_IN_SYNC via the meta socket for checksum based resync, while the receiver destroys the sockets in drbd_disconnect. To make sure no-one is using the meta socket, it is not enough to stop the asender... Grab the meta socket mutex before destroying it. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
| * | | | fix unit of rs_same_csums accountingLars Ellenberg2010-03-111-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Depending on resync request size, we need to account for more than one bit. Impact: cosmetic If SyncTarget reported correctly 100% equal checksums, the SyncSource usually reported 12% equal checksums instead, because it only counted requests, we typically do 32k resync requests, and the bitmap granularity is still 4k. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
| * | | | drbd: fix broken state change after split-brain attach while connectedLars Ellenberg2010-03-111-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Situation: we have diverging data sets, i.e. we had a split brain somewhen, but currently are connected, one node diskless. Then we try to attach that disk, figure it is consistent, but has a diverging data set, we refuse to attach. This led to strange state changes: 22:18:35 bb drbd1: peer( Unknown -> Primary ) conn( WFReportParams -> Connected) pdsk( DUnknown -> UpToDate ) 22:19:30 bb drbd1: disk( Diskless -> Attaching ) 22:19:30 bb drbd1: disk( Attaching -> Negotiating ) 22:19:30 bb drbd1: drbd_sync_handshake: 22:19:30 bb drbd1: self 97BF25798B9D5222:F33D1F62ADE698DD:4269796F9D027C83:AC45D8B5C3C1BF93 bits:19449 flags:0 22:19:30 bb drbd1: peer 280DFB6E125465D3:F33D1F62ADE698DC:4269796F9D027C82:AC45D8B5C3C1BF93 bits:2575806 flags:0 22:19:30 bb drbd1: uuid_compare()=100 by rule 90 22:19:30 bb drbd1: Split-Brain detected, dropping connection! 22:19:30 bb drbd1: disk( Negotiating -> Diskless ) while the other side says: 22:19:30 aa drbd1: Split-Brain detected, dropping connection! 22:19:30 aa drbd1: Disk attach process on the peer node was aborted. 22:19:30 aa drbd1: conn( Connected -> TOO_LARGE ) pdsk( Diskless -> Consistent ) This should be fixed now. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
| * | | | drbd: fix NULL pointer dereference on 4k hard sect sizeLars Ellenberg2010-03-111-19/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | we still don't support 4k 'physical' sectors 'natively', but use a read-modify-write workaround. And we even tried to use the extra page before we allocated it :( Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>