summaryrefslogtreecommitdiffstats
path: root/drivers/md/bitmap.h (follow)
Commit message (Collapse)AuthorAgeFilesLines
* md: rename some drivers/md/ files to have an "md-" prefixMike Snitzer2017-10-171-277/+0
| | | | | | | | | | | | | | | Motivated by the desire to illiminate the imprecise nature of DM-specific patches being unnecessarily sent to both the MD maintainer and mailing-list. Which is born out of the fact that DM files also reside in drivers/md/ Now all MD-specific files in drivers/md/ start with either "raid" or "md-" and the MAINTAINERS file has been updated accordingly. Shaohua: don't change module name Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Shaohua Li <shli@fb.com>
* md: move bitmap_destroy to the beginning of __md_stopGuoqing Jiang2017-03-171-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since we have switched to sync way to handle METADATA_UPDATED msg for md-cluster, then process_metadata_update is depended on mddev->thread->wqueue. With the new change, clustered raid could possible hang if array received a METADATA_UPDATED msg after array unregistered mddev->thread, so we need to stop clustered raid (bitmap_destroy -> bitmap_free -> md_cluster_stop) earlier than unregister thread (mddev_detach -> md_unregister_thread). And this change should be safe for non-clustered raid since all writes are stopped before the destroy. Also in md_run, we activate the personality (pers->run()) before activating the bitmap (bitmap_create()). So it is pleasingly symmetric to stop the bitmap (bitmap_destroy()) before stopping the personality (__md_stop() calls pers->free()), we achieve this by move bitmap_destroy to the beginning of __md_stop. But we don't want to break the codes for waiting behind IO as Shaohua mentioned, so introduce bitmap_wait_behind_writes to call the codes, and call the new fun in both mddev_detach and bitmap_destroy, then we will not break original behind IO code and also fit the new condition well. Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: Shaohua Li <shli@fb.com>
* md-cluster: introduce cluster_check_sync_sizeGuoqing Jiang2017-03-171-0/+2
| | | | | | | | | | | | | | | | | | | | | Support resize is a little complex for clustered raid, since we need to ensure all the nodes share the same knowledge about the size of raid. We achieve the goal by check the sync_size which is in each node's bitmap, we can only change the capacity after cluster_check_sync_size returns 0. Also, get_bitmap_from_slot is added to get a slot's bitmap. And we exported some funcs since they are used in cluster_check_sync_size(). We can also reuse get_bitmap_from_slot to remove redundant code existed in bitmap_copy_from_slot. Reviewed-by: NeilBrown <neilb@suse.com> Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: Shaohua Li <shli@fb.com>
* md-cluster: sync bitmap when node received RESYNCING msgGuoqing Jiang2016-05-041-0/+3
| | | | | | | | | | | | | | | | If the node received RESYNCING message which means another node will perform resync with the area, then we don't want to do it again in another node. Let's set RESYNC_MASK and clear NEEDED_MASK for the region from old-low to new-low which has finished syncing, and the region from old-hi to new-hi is about to syncing, bitmap_sync_with_cluste is introduced for the purpose. Reviewed-by: NeilBrown <neilb@suse.com> Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: Shaohua Li <shli@fb.com>
* md: fix typos for stipeGuoqing Jiang2016-03-141-2/+2
| | | | | Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: Shaohua Li <shli@fb.com>
* md-cluster: Use a small window for resyncGoldwyn Rodrigues2015-10-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | Suspending the entire device for resync could take too long. Resync in small chunks. cluster's resync window (32M) is maintained in r1conf as cluster_sync_low and cluster_sync_high and processed in raid1's sync_request(). If the current resync is outside the cluster resync window: 1. Set the cluster_sync_low to curr_resync_completed. 2. Check if the sync will fit in the new window, if not issue a wait_barrier() and set cluster_sync_low to sector_nr. 3. Set cluster_sync_high to cluster_sync_low + resync_window. 4. Send a message to all nodes so they may add it in their suspension list. bitmap_cond_end_sync is modified to allow to force a sync inorder to get the curr_resync_completed uptodate with the sector passed. Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: NeilBrown <neilb@suse.de>
* md: Increment version for clustered bitmapsGoldwyn Rodrigues2015-10-121-0/+2
| | | | | | | | | | | | | | Add BITMAP_MAJOR_CLUSTERED as 5, in order to prevent older kernels to assemble a clustered device. In order to maximize compatibility, the major version is set to BITMAP_MAJOR_CLUSTERED *only* if the bitmap is clustered. Added MD_FEATURE_CLUSTERED in order to return error for older kernels which would assemble MD even if the bitmap is corrupted. Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: NeilBrown <neilb@suse.com>
* md-cluster: re-add capabilitiesGoldwyn Rodrigues2015-04-211-1/+1
| | | | | | | | | | | | | | | | When "re-add" is writted to /sys/block/mdXX/md/dev-YYY/state, the clustered md: 1. Sends RE_ADD message with the desc_nr. Nodes receiving the message clear the Faulty bit in their respective rdev->flags. 2. The node initiating re-add, gathers the bitmaps of all nodes and copies them into the local bitmap. It does not clear the bitmap from which it is copying. 3. Initiating node schedules a md recovery to sync the devices. Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: NeilBrown <neilb@suse.de>
* Copy set bits from another slotGoldwyn Rodrigues2015-02-231-0/+2
| | | | | | | | | | | | | | bitmap_copy_from_slot reads the bitmap from the slot mentioned. It then copies the set bits to the node local bitmap. This is helper function for the resync operation on node failure. bitmap_set_memory_bits() currently assumes it is only run at startup and that they bitmap is currently empty. So if it finds that a region is already marked as dirty, it won't mark it dirty again. Change bitmap_set_memory_bits() to always set the NEEDED_MASK bit if 'needed' is set. Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
* bitmap_create returns bitmap pointerGoldwyn Rodrigues2015-02-231-1/+1
| | | | | | This is done to have multiple bitmaps open at the same time. Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
* Use separate bitmaps for each nodes in the clusterGoldwyn Rodrigues2015-02-231-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | On-disk format: 0 4k 8k 12k ------------------------------------------------------------------- | idle | md super | bm super [0] + bits | | bm bits[0, contd] | bm super[1] + bits | bm bits[1, contd] | | bm super[2] + bits | bm bits [2, contd] | bm super[3] + bits | | bm bits [3, contd] | | | Bitmap super has a field nodes, which defines the maximum number of nodes the device can use. While reading the bitmap super, if the cluster finds out that the number of nodes is > 0: 1. Requests the md-cluster module. 2. Calls md_cluster_ops->join(), which sets up clustering such as joining DLM lockspace. Since the first time, the first bitmap is read. After the call to the cluster_setup, the bitmap offset is adjusted and the superblock is re-read. This also ensures the bitmap is read the bitmap lock (when bitmap lock is introduced in later patches) Questions: 1. cluster name is repeated in all bitmap supers. Is that okay? Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
* Add node recovery callbacksGoldwyn Rodrigues2015-02-231-2/+2
| | | | | | | | | | | | | | | | | DLM offers callbacks when a node fails and the lock remastery is performed: 1. recover_prep: called when DLM discovers a node is down 2. recover_slot: called when DLM identifies the node and recovery can start 3. recover_done: called when all nodes have completed recover_slot recover_slot() and recover_done() are also called when the node joins initially in order to inform the node with its slot number. These slot numbers start from one, so we deduct one to make it start with zero which the cluster-md code uses. Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
* Add number of nodes to bitmap structure for clusteringGoldwyn Rodrigues2015-02-231-1/+2
| | | | Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
* kernfs: s/sysfs_dirent/kernfs_node/ and rename its friends accordinglyTejun Heo2013-12-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | kernfs has just been separated out from sysfs and we're already in full conflict mode. Nothing can make the situation any worse. Let's take the chance to name things properly. This patch performs the following renames. * s/sysfs_elem_dir/kernfs_elem_dir/ * s/sysfs_elem_symlink/kernfs_elem_symlink/ * s/sysfs_elem_attr/kernfs_elem_file/ * s/sysfs_dirent/kernfs_node/ * s/sd/kn/ in kernfs proper * s/parent_sd/parent/ * s/target_sd/target/ * s/dir_sd/parent/ * s/to_sysfs_dirent()/rb_to_kn()/ * misc renames of local vars when they conflict with the above Because md, mic and gpio dig into sysfs details, this patch ends up modifying them. All are sysfs_dirent renames and trivial. While we can avoid these by introducing a dummy wrapping struct sysfs_dirent around kernfs_node, given the limited usage outside kernfs and sysfs proper, I don't think such workaround is called for. This patch is strictly rename only and doesn't introduce any functional difference. - mic / gpio renames were missing. Spotted by kbuild test robot. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Neil Brown <neilb@suse.de> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Ashutosh Dixit <ashutosh.dixit@intel.com> Cc: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* md/bitmap: record the space available for the bitmap in the superblock.NeilBrown2012-05-221-1/+3
| | | | | | | | | | Now that bitmaps can grow and shrink it is best if we record how much space is available. This means that when we reduce the size of the bitmap we won't "lose" the space for late when we might want to increase the size of the bitmap again. Signed-off-by: NeilBrown <neilb@suse.de>
* md/bitmap: add bitmap_resize function to allow bitmap resizing.NeilBrown2012-05-221-0/+3
| | | | | | | | | | | | | | | | This function will allocate the new data structures and copy bits across from old to new, allowing for the possibility that the chunksize has changed. Use the same function for performing the initial allocation of the structures. This improves test coverage. When bitmap_resize is used to resize an existing bitmap, it only copies '1' bits in, not '0' bits. So when allocating the bitmap, ensure everything is initialised to ZERO. Signed-off-by: NeilBrown <neilb@suse.de>
* md/bitmap: create a 'struct bitmap_counts' substructure of 'struct bitmap'NeilBrown2012-05-221-10/+13
| | | | | | | | | | The new "struct bitmap_counts" contains all the fields that are related to counting the number of active writes in each bitmap chunk. Having this separate will make it easier to change the chunksize or overall size of a bitmap atomically. Signed-off-by: NeilBrown <neilb@suse.de>
* md/bitmap: use set_bit, test_bit, etc for operation on bitmap->flags.NeilBrown2012-05-221-3/+3
| | | | | | | | | | We currently use '&' and '|' which isn't the norm in the kernel and doesn't allow easy atomicity. So change to bit numbers and {set,clear,test}_bit. This allows us to remove a spinlock/unlock (which was dubious anyway) and some other simplifications. Signed-off-by: NeilBrown <neilb@suse.de>
* md/bitmap: store bytes in file rather than just in last page.NeilBrown2012-05-221-1/+1
| | | | | | | This number is more generally useful, and bytes-in-last-page is easily extracted from it. Signed-off-by: NeilBrown <neilb@suse.de>
* md/bitmap: move some fields of 'struct bitmap' into a 'storage' substruct.NeilBrown2012-05-221-6/+11
| | | | | | | | | This new 'struct bitmap_storage' reflects the external storage of the bitmap. Having this clearly defined will make it easier to change the storage used while the array is active. Signed-off-by: NeilBrown <neilb@suse.de>
* md/bitmap: disentangle two different 'pending' flags.NeilBrown2012-05-221-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are two different 'pending' concepts in the handling of the write intent bitmap. Firstly, a 'page' from the bitmap (which container PAGE_SIZE*8 bits) may have changes (bits cleared) that should be written in due course. There is no hurry for these and the page will transition from PENDING to NEEDWRITE and will then be written, though if it ever becomes DIRTY it will be written much sooner and PENDING will be cleared. Secondly, a page of counters - which contains PAGE_SIZE/2 counters, one for each bit, can usefully have a 'pending' flag which indicates if any of the counters are low (2 or 1) and ready to be processed by bitmap_daemon_work(). If this flag is clear we can skip the whole page. These two concepts are currently combined in the bitmap-file flag. This causes a tighter connection between the counters and the bitmap file than I would like - as I want to add some flexibility to the bitmap file. So introduce a new flag with the page-of-counters, and rewrite bitmap_daemon_work() so that it handles the two different 'pending' concepts separately. This also allows us to clear BITMAP_PAGE_PENDING when we write out a dirty page, which may occasionally reduce the number of times we write a page. Signed-off-by: NeilBrown <neilb@suse.de>
* md/bitmap: fix calculation of 'chunks' - missing shift.NeilBrown2012-05-041-3/+0
| | | | | | | | | | | | | | | | commit 61a0d80c "md/bitmap: discard CHUNK_BLOCK_SHIFT macro" replaced CHUNK_BLOCK_RATIO() by the same text that was replacing CHUNK_BLOCK_SHIFT() - which is clearly wrong. The result is that 'chunks' is often too small by 1, which can sometimes result in a crash (not sure how). So use the correct replacement, and get rid of CHUNK_BLOCK_RATIO which is no longe used. Reported-by: Karl Newman <siliconfiend@gmail.com> Tested-by: Karl Newman <siliconfiend@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de>
* md/bitmap: discard CHUNK_BLOCK_SHIFT macroNeilBrown2012-03-191-2/+1
| | | | | | | | Be redefining ->chunkshift as the shift from sectors to chunks rather than bytes to chunks, we can just use "bitmap->chunkshift" which is shorter than the macro call, and less indirect. Signed-off-by: NeilBrown <neilb@suse.de>
* md/bitmap: move printing of bitmap status to bitmap.cNeilBrown2012-03-191-0/+1
| | | | | | | The part of /proc/mdstat which describes the bitmap should really be generated by code in bitmap.c. So move it there. Signed-off-by: NeilBrown <neilb@suse.de>
* md/bitmap: remove some unused noise from bitmap.hNeilBrown2012-03-191-18/+0
| | | | Signed-off-by: NeilBrown <neilb@suse.de>
* md: remove typedefs: mddev_t -> struct mddevNeilBrown2011-10-111-6/+6
| | | | | | Having mddev_t and 'struct mddev_s' is ugly and not preferred Signed-off-by: NeilBrown <neilb@suse.de>
* MD bitmap: Revert DM dirty log hooksJonathan Brassow2011-07-271-5/+0
| | | | | | | | | | | | | | Revert most of commit e384e58549a2e9a83071ad80280c1a9053cfd84c md/bitmap: prepare for storing write-intent-bitmap via dm-dirty-log. MD should not need to use DM's dirty log - we decided to use md's bitmaps instead. Keeping the DIV_ROUND_UP clean-ups that were part of commit e384e58549a2e9a83071ad80280c1a9053cfd84c, however. Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: NeilBrown <neilb@suse.de>
* md/bitmap: remove unused fields from struct bitmapNamhyung Kim2011-06-091-10/+0
| | | | | | | | | Get rid of ->syncchunk and ->counter_bits since they're never used. Also discard COUNTER_BYTE_RATIO which is unused. Signed-off-by: Namhyung Kim <namhyung@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de>
* Fix common misspellingsLucas De Marchi2011-03-311-1/+1
| | | | | | Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
* md: use sector_t in bitmap_get_counterNeilBrown2010-10-281-2/+2
| | | | | | | | | | bitmap_get_counter returns the number of sectors covered by the counter in a pass-by-reference variable. In some cases this can be very large, so make it a sector_t for safety. Signed-off-by: NeilBrown <neilb@suse.de>
* md/bitmap: separate out loading a bitmap from initialising the structures.NeilBrown2010-07-261-0/+1
| | | | | | | | | | dm makes this distinction between ->ctr and ->resume, so we need to too. Also get the new bitmap_load to clear out the bitmap first, as this is most consistent with the dm suspend/resume approach Signed-off-by: NeilBrown <neilb@suse.de>
* md/bitmap: prepare for storing write-intent-bitmap via dm-dirty-log.NeilBrown2010-07-261-0/+5
| | | | | | | | | | | | | | This allows md/raid5 to fully work as a dm target. Normally md uses a 'filemap' which contains a list of pages of bits each of which may be written separately. dm-log uses and all-or-nothing approach to writing the log, so when using a dm-log, ->filemap is NULL and the flags normally stored in filemap_attr are stored in ->logattrs instead. Signed-off-by: NeilBrown <neilb@suse.de>
* md/raid1: delay reads that could overtake behind-writes.NeilBrown2010-05-181-0/+1
| | | | | | | | | | | | | When a raid1 array is configured to support write-behind on some devices, it normally only reads from other devices. If all devices are write-behind (because the rest have failed) it is possible for a read request to be serviced before a behind-write request, which would appear as data corruption. So when forced to read from a WriteMostly device, wait for any write-behind to complete, and don't start any more behind-writes. Signed-off-by: NeilBrown <neilb@suse.de>
* md: expose max value of behind writes counterPaul Clements2010-05-181-0/+1
| | | | | | | | | | | | | Keep track of the maximum number of concurrent write-behind requests for an md array and exposed this number in sysfs at md/bitmap/max_backlog_used Writing any value to this file will clear it. This allows userspace to be involved in tuning bitmap/backlog. Signed-off-by: Paul Clements <paul.clements@steeleye.com> Signed-off-by: NeilBrown <neilb@suse.de>
* md: Support write-intent bitmaps with externally managed metadata.NeilBrown2009-12-141-10/+1
| | | | | | | | | | | In this case, the metadata needs to not be in the same sector as the bitmap. md will not read/write any bitmap metadata. Config must be done via sysfs and when a recovery makes the array non-degraded again, writing 'true' to 'bitmap/can_clear' will allow bits in the bitmap to be cleared again. Signed-off-by: NeilBrown <neilb@suse.de>
* md: move offset, daemon_sleep and chunksize out of bitmap structureNeilBrown2009-12-141-5/+1
| | | | | | | ... and into bitmap_info. These are all configuration parameters that need to be set before the bitmap is created. Signed-off-by: NeilBrown <neilb@suse.de>
* md/bitmap: protect against bitmap removal while being updated.NeilBrown2009-12-141-1/+1
| | | | | | | | | | | | | | | | | | A write intent bitmap can be removed from an array while the array is active. When this happens, all IO is suspended and flushed before the bitmap is removed. However it is possible that bitmap_daemon_work is still running to clear old bits from the bitmap. If it is, it can dereference the bitmap after it has been freed. So introduce a new mutex to protect bitmap_daemon_work and get it before destroying a bitmap. This is suitable for any current -stable kernel. Signed-off-by: NeilBrown <neilb@suse.de> Cc: stable@kernel.org
* md: move headers out of include/linux/raid/Christoph Hellwig2009-03-311-0/+288
Move the headers with the local structures for the disciplines and bitmap.h into drivers/md/ so that they are more easily grepable for hacking and not far away. md.h is left where it is for now as there are some uses from the outside. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: NeilBrown <neilb@suse.de>