linux - linux

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	block: Correct handling of bottom device misaligment	Martin K. Petersen	2010-01-11	1	-4/+13
\| \| \| \| \| \| \| \| \| \| \| \|	The top device misalignment flag would not be set if the added bottom device was already misaligned as opposed to causing a stacking failure. Also massage the reporting so that an error is only returned if adding the bottom device caused the misalignment. I.e. don't return an error if the top is already flagged as misaligned. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	block: Fix incorrect alignment offset reporting and update documentation	Martin K. Petersen	2009-12-29	1	-11/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	queue_sector_alignment_offset returned the wrong value which caused partitions to report an incorrect alignment_offset. Since offset alignment calculation is needed several places it has been split into a separate helper function. The topology stacking function has been updated accordingly. Furthermore, comments have been added to clarify how the stacking function works. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Tested-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	block: Fix topology stacking for data and discard alignment	Martin K. Petersen	2009-12-21	1	-37/+50
\| \| \| \| \| \| \| \| \| \|	The stacking code incorrectly scaled up the data offset in some cases causing misaligned devices to report alignment. Rewrite the stacking algorithm to remedy this and apply the same alignment principles to the discard handling. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	block: temporarily disable discard granularity	Jens Axboe	2009-12-16	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 86b37281411cf1e9bc0a6b5406c45edb7bd9ea5d adds a check for misaligned stacking offsets, but it's buggy since the defaults are 0. Hence all dm devices that pass in a non-zero starting offset will be marked as misaligned amd dm will complain. A real fix is coming, in the mean time disable the discard granularity check so that users don't worry about dm reporting about misaligned devices. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	block: Allow devices to indicate whether discarded blocks are zeroed	Martin K. Petersen	2009-12-03	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The discard ioctl is used by mkfs utilities to clear a block device prior to putting metadata down. However, not all devices return zeroed blocks after a discard. Some drives return stale data, potentially containing old superblocks. It is therefore important to know whether discarded blocks are properly zeroed. Both ATA and SCSI drives have configuration bits that indicate whether zeroes are returned after a discard operation. Implement a block level interface that allows this information to be bubbled up the stack and queried via a new block device ioctl. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	block: jiffies fixes	Randy Dunlap	2009-11-11	1	-1/+2
\| \| \| \| \| \| \| \| \|	Use HZ-independent calculation of milliseconds. Add jiffies.h where it was missing since functions or macros from it are used. Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	block: Expose discard granularity	Martin K. Petersen	2009-11-10	1	-10/+36
\| \| \| \| \| \| \| \| \| \|	While SSDs track block usage on a per-sector basis, RAID arrays often have allocation blocks that are bigger. Allow the discard granularity and alignment to be set and teach the topology stacking logic how to handle them. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	blk-settings: fix function parameter kernel-doc notation	Randy Dunlap	2009-10-12	1	-1/+1
\| \| \| \| \| \| \|	Fix kernel-doc notation in blk-settings.c::blk_queue_max_discard_sectors(). Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	block: allow large discard requests	Christoph Hellwig	2009-10-01	1	-0/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently we set the bio size to the byte equivalent of the blocks to be trimmed when submitting the initial DISCARD ioctl. That means it is subject to the max_hw_sectors limitation of the HBA which is much lower than the size of a DISCARD request we can support. Add a separate max_discard_sectors tunable to limit the size for discard requests. We limit the max discard request size in bytes to 32bit as that is the limit for bio->bi_size. This could be much larger if we had a way to pass that information through the block layer. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	block: use normal I/O path for discard requests	Christoph Hellwig	2009-10-01	1	-17/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	prepare_discard_fn() was being called in a place where memory allocation was effectively impossible. This makes it inappropriate for all but the most trivial translations of Linux's DISCARD operation to the block command set. Additionally adding a payload there makes the ownership of the bio backing unclear as it's now allocated by the device driver and not the submitter as usual. It is replaced with QUEUE_FLAG_DISCARD which is used to indicate whether the queue supports discard operations or not. blkdev_issue_discard now allocates a one-page, sector-length payload which is the right thing for the common ATA and SCSI implementations. The mtd implementation of prepare_discard_fn() is replaced with simply checking for the request being a discard. Largely based on a previous patch from Matthew Wilcox <matthew@wil.cx> which did the prepare_discard_fn but not the different payload allocation yet. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	block: Do not clamp max_hw_sectors for stacking devices	Martin K. Petersen	2009-10-01	1	-1/+2
\| \| \| \| \| \| \| \| \|	Stacking devices do not have an inherent max_hw_sector limit. Set the default to INT_MAX so we are bounded only by capabilities of the underlying storage. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	block: Set max_sectors correctly for stacking devices	Martin K. Petersen	2009-10-01	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \|	The topology changes unintentionally caused SAFE_MAX_SECTORS to be set for stacking devices. Set the default limit to BLK_DEF_MAX_SECTORS and provide SAFE_MAX_SECTORS in blk_queue_make_request() for legacy hw drivers that depend on the old behavior. Acked-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	block: Optimal I/O limit wrapper	Martin K. Petersen	2009-09-14	1	-1/+20
\| \| \| \| \| \| \| \| \|	Implement blk_limits_io_opt() and make blk_queue_io_opt() a wrapper around it. DM needs this to avoid poking at the queue_limits directly. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	block: Update topology documentation	Martin K. Petersen	2009-08-01	1	-6/+13
\| \| \| \| \| \| \| \|	Update topology comments and sysfs documentation based upon discussions with Neil Brown. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	block: Stack optimal I/O size	Martin K. Petersen	2009-08-01	1	-0/+11
\| \| \| \| \| \| \| \| \|	When stacking block devices ensure that optimal I/O size is scaled accordingly. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	block: Add a wrapper for setting minimum request size without a queue	Martin K. Petersen	2009-08-01	1	-7/+24
\| \| \| \| \| \| \| \|	Introduce blk_limits_io_min() and make blk_queue_io_min() call it. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	block: Make blk_queue_stack_limits use the new stacking interface	Martin K. Petersen	2009-08-01	1	-21/+1
\| \| \| \| \| \| \| \| \|	blk_queue_stack_limits() has been superceded by blk_stack_limits() and disk_stack_limits(). Wrap the function call for now, we'll deprecate it later. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	block: always assign default lock to queues	Jens Axboe	2009-07-28	1	-0/+7
\| \| \| \| \| \| \| \| \|	Move the assignment of a default lock below blk_init_queue() to blk_queue_make_request(), so we also get to set the default lock for ->make_request_fn() based drivers. This is important since the queue flag locking requires a lock to be in place. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	Fix kernel-doc parameter name typo in blk-settings.c:	Randy Dunlap	2009-06-19	1	-1/+1
\| \| \| \| \| \| \| \|	Warning(block/blk-settings.c:108): No description found for parameter 'lim' Warning(block/blk-settings.c:108): Excess function parameter 'limits' description in 'blk_set_default_limits' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	block: Fix bounce_pfn setting	Martin K. Petersen	2009-06-18	1	-1/+1
\| \| \| \| \| \| \| \|	Correct stacking bounce_pfn limit setting and prevent warnings on 32-bit. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	block: Introduce helper to reset queue limits to default values	Martin K. Petersen	2009-06-16	1	-6/+27
\| \| \| \| \| \| \| \| \| \| \|	DM reuses the request queue when swapping in a new device table Introduce blk_set_default_limits() which can be used to reset the the queue_limits prior to stacking devices. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Acked-by: Alasdair G Kergon <agk@redhat.com> Acked-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	block: don't overwrite bdi->state after bdi_init() has been run	Jens Axboe	2009-06-16	1	-4/+0
\| \| \| \| \| \|	Move the defaults to where we do the init of the backing_dev_info. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	block: fix kernel-doc in recent block/ changes	Randy Dunlap	2009-06-12	1	-3/+3
\| \| \| \| \| \| \|	Fix kernel-doc warnings in recently changed block/ source code. Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	block: Add missing bounce_pfn stacking and fix comments	Martin K. Petersen	2009-06-09	1	-2/+3
\| \| \| \| \| \| \| \| \| \|	DM no longer needs to set limits explicitly when calling blk_stack_limits. Let the latter automatically deal with bounce_pfn scaling. Fix kerneldoc variable names. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	Revert "block: Fix bounce limit setting in DM"	Jens Axboe	2009-06-09	1	-17/+0
\| \| \| \| \| \| \| \|	This reverts commit a05c0205ba031c01bba33a21bf0a35920eb64833. DM doesn't need to access the bounce_pfn directly. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	block: Fix bounce limit setting in DM	Martin K. Petersen	2009-06-03	1	-0/+17
\| \| \| \| \| \| \| \| \| \|	blk_queue_bounce_limit() is more than a wrapper about the request queue limits.bounce_pfn variable. Introduce blk_queue_bounce_pfn() which can be called by stacking drivers that wish to set the bounce limit explicitly. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	block: export blk_stack_limits()	Mike Snitzer	2009-05-28	1	-0/+1
\| \| \| \| \| \| \| \|	DM needs to use blk_stack_limits(), so it needs to be exported. Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	block: Export I/O topology for block devices and partitions	Martin K. Petersen	2009-05-22	1	-0/+186
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	To support devices with physical block sizes bigger than 512 bytes we need to ensure proper alignment. This patch adds support for exposing I/O topology characteristics as devices are stacked. logical_block_size is the smallest unit the device can address. physical_block_size indicates the smallest I/O the device can write without incurring a read-modify-write penalty. The io_min parameter is the smallest preferred I/O size reported by the device. In many cases this is the same as the physical block size. However, the io_min parameter can be scaled up when stacking (RAID5 chunk size > physical block size). The io_opt characteristic indicates the optimal I/O size reported by the device. This is usually the stripe width for arrays. The alignment_offset parameter indicates the number of bytes the start of the device/partition is offset from the device's natural alignment. Partition tools and MD/DM utilities can use this to pad their offsets so filesystems start on proper boundaries. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	block: Move queue limits to an embedded struct	Martin K. Petersen	2009-05-22	1	-21/+34
\| \| \| \| \| \| \| \|	To accommodate stacking drivers that do not have an associated request queue we're moving the limits to a separate, embedded structure. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	block: Use accessor functions for queue limits	Martin K. Petersen	2009-05-22	1	-3/+12
\| \| \| \| \| \| \| \|	Convert all external users of queue limits to using wrapper functions instead of poking the request queue variables directly. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	block: Do away with the notion of hardsect_size	Martin K. Petersen	2009-05-22	1	-11/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Until now we have had a 1:1 mapping between storage device physical block size and the logical block sized used when addressing the device. With SATA 4KB drives coming out that will no longer be the case. The sector size will be 4KB but the logical block size will remain 512-bytes. Hence we need to distinguish between the physical block size and the logical ditto. This patch renames hardsect_size to logical_block_size. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	block: fix queue bounce limit setting	Tejun Heo	2009-04-22	1	-9/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Impact: don't set GFP_DMA in q->bounce_gfp unnecessarily All DMA address limits are expressed in terms of the last addressable unit (byte or page) instead of one plus that. However, when determining bounce_gfp for 64bit machines in blk_queue_bounce_limit(), it compares the specified limit against 0x100000000UL to determine whether it's below 4G ending up falsely setting GFP_DMA in q->bounce_gfp. As DMA zone is very small on x86_64, this makes larger SG_IO transfers very eager to trigger OOM killer. Fix it. While at it, rename the parameter to @dma_mask for clarity and convert comment to proper winged style. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	pata_artop: typo	Alan Cox	2009-04-07	1	-1/+1
\| \| \| \| \| \| \| \|	Fix a typo (this was in the original patch but was not merged when the code fixes were for some reason) Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
*	block: use min_not_zero in blk_queue_stack_limits	FUJITA Tomonori	2008-12-29	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	zero is invalid for max_phys_segments, max_hw_segments, and max_segment_size. It's better to use use min_not_zero instead of min. min() works though (because the commit 0e435ac makes sure that these values are set to the default values, non zero, if a queue is initialized properly). With this patch, blk_queue_stack_limits does the almost same thing that dm's combine_restrictions_low() does. I think that it's easy to remove dm's combine_restrictions_low. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	block: fix setting of max_segment_size and seg_boundary mask	Milan Broz	2008-12-03	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix setting of max_segment_size and seg_boundary mask for stacked md/dm devices. When stacking devices (LVM over MD over SCSI) some of the request queue parameters are not set up correctly in some cases by default, namely max_segment_size and and seg_boundary mask. If you create MD device over SCSI, these attributes are zeroed. Problem become when there is over this mapping next device-mapper mapping - queue attributes are set in DM this way: request_queue max_segment_size seg_boundary_mask SCSI 65536 0xffffffff MD RAID1 0 0 LVM 65536 -1 (64bit) Unfortunately bio_add_page (resp. bio_phys_segments) calculates number of physical segments according to these parameters. During the generic_make_request() is segment cout recalculated and can increase bio->bi_phys_segments count over the allowed limit. (After bio_clone() in stack operation.) Thi is specially problem in CCISS driver, where it produce OOPS here BUG_ON(creq->nr_phys_segments > MAXSGENTRIES); (MAXSEGENTRIES is 31 by default.) Sometimes even this command is enough to cause oops: dd iflag=direct if=/dev/<vg>/<lv> of=/dev/null bs=128000 count=10 This command generates bios with 250 sectors, allocated in 32 4k-pages (last page uses only 1024 bytes). For LVM layer, it allocates bio with 31 segments (still OK for CCISS), unfortunatelly on lower layer it is recalculated to 32 segments and this violates CCISS restriction and triggers BUG_ON(). The patch tries to fix it by: * initializing attributes above in queue request constructor blk_queue_make_request() * make sure that blk_queue_stack_limits() inherits setting (DM uses its own function to set the limits because it blk_queue_stack_limits() was introduced later. It should probably switch to use generic stack limit function too.) * sets the default seg_boundary value in one place (blkdev.h) * use this mask as default in DM (instead of -1, which differs in 64bit) Bugs related to this: https://bugzilla.redhat.com/show_bug.cgi?id=471639 http://bugzilla.kernel.org/show_bug.cgi?id=8672 Signed-off-by: Milan Broz <mbroz@redhat.com> Reviewed-by: Alasdair G Kergon <agk@redhat.com> Cc: Neil Brown <neilb@suse.de> Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Tejun Heo <htejun@gmail.com> Cc: Mike Miller <mike.miller@hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	block: move q->unplug_work initialization	Peter Zijlstra	2008-10-17	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \| \|	modprobe loop; rmmod loop effectively creates a blk_queue and destroys it which results in q->unplug_work being canceled without it ever being initialized. Therefore, move the initialization of q->unplug_work from blk_queue_make_request() to blk_alloc_queue*(). Reported-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	block: add lld busy state exporting interface	Kiyoshi Ueda	2008-10-09	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds an new interface, blk_lld_busy(), to check lld's busy state from the block layer. blk_lld_busy() calls down into low-level drivers for the checking if the drivers set q->lld_busy_fn() using blk_queue_lld_busy(). This resolves a performance problem on request stacking devices below. Some drivers like scsi mid layer stop dispatching request when they detect busy state on its low-level device like host/target/device. It allows other requests to stay in the I/O scheduler's queue for a chance of merging. Request stacking drivers like request-based dm should follow the same logic. However, there is no generic interface for the stacked device to check if the underlying device(s) are busy. If the request stacking driver dispatches and submits requests to the busy underlying device, the requests will stay in the underlying device's queue without a chance of merging. This causes performance problem on burst I/O load. With this patch, busy state of the underlying device is exported via q->lld_busy_fn(). So the request stacking driver can check it and stop dispatching requests if busy. The underlying device driver must return the busy state appropriately: 1: when the device driver can't process requests immediately. 0: when the device driver can process requests immediately, including abnormal situations where the device driver needs to kill all requests. Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	block: unify request timeout handling	Jens Axboe	2008-10-09	1	-0/+12
\| \| \| \| \| \| \| \| \| \| \| \| \|	Right now SCSI and others do their own command timeout handling. Move those bits to the block layer. Instead of having a timer per command, we try to be a bit more clever and simply have one per-queue. This avoids the overhead of having to tear down and setup a timer for each command, so it will result in a lot less timer fiddling. Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	block: kmalloc args reversed, small function definition fixes	Harvey Harrison	2008-10-09	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Noticed by sparse: block/blk-softirq.c:156:12: warning: symbol 'blk_softirq_init' was not declared. Should it be static? block/genhd.c:583:28: warning: function 'bdget_disk' with external linkage has definition block/genhd.c:659:17: warning: incorrect type in argument 1 (different base types) block/genhd.c:659:17: expected unsigned int [unsigned] [usertype] size block/genhd.c:659:17: got restricted gfp_t block/genhd.c:659:29: warning: incorrect type in argument 2 (different base types) block/genhd.c:659:29: expected restricted gfp_t [usertype] flags block/genhd.c:659:29: got unsigned int block: kmalloc args reversed Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	block: add support for IO CPU affinity	Jens Axboe	2008-10-09	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds support for controlling the IO completion CPU of either all requests on a queue, or on a per-request basis. We export a sysfs variable (rq_affinity) which, if set, migrates completions of requests to the CPU that originally submitted it. A bio helper (bio_set_completion_cpu()) is also added, so that queuers can ask for completion on that specific CPU. In testing, this has been show to cut the system time by as much as 20-40% on synthetic workloads where CPU affinity is desired. This requires a little help from the architecture, so it'll only work as designed for archs that are using the new generic smp helper infrastructure. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	Add some block/ source files to the kernel-api docbook. Fix kernel-doc ↵	Randy Dunlap	2008-10-09	1	-4/+4
\| \| \| \| \| \| \|	notation in them as needed. Fix changed function parameter names. Fix typos/spellos. In comments, change REQ_SPECIAL to REQ_TYPE_SPECIAL and REQ_BLOCK_PC to REQ_TYPE_BLOCK_PC. Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	Add 'discard' request handling	David Woodhouse	2008-10-09	1	-0/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some block devices benefit from a hint that they can forget the contents of certain sectors. Add basic support for this to the block core, along with a 'blkdev_issue_discard()' helper function which issues such requests. The caller doesn't get to provide an end_io functio, since blkdev_issue_discard() will automatically split the request up into multiple bios if appropriate. Neither does the function wait for completion -- it's expected that callers won't care about when, or even _if_, the request completes. It's only a hint to the device anyway. By definition, the file system doesn't _care_ about these sectors any more. [With feedback from OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> and Jens Axboe <jens.axboe@oracle.com] Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	block: add blk_queue_update_dma_pad	FUJITA Tomonori	2008-07-04	1	-4/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This adds blk_queue_update_dma_pad to prevent LLDs from overwriting the dma pad mask wrongly (we added blk_queue_update_dma_alignment due to the same reason). This also converts libata to use blk_queue_update_dma_pad instead of blk_queue_dma_pad. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Tejun Heo <htejun@gmail.com> Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	Remove blkdev warning triggered by using md	Neil Brown	2008-05-15	1	-1/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As setting and clearing queue flags now requires that we hold a spinlock on the queue, and as blk_queue_stack_limits is called without that lock, get the lock inside blk_queue_stack_limits. For blk_queue_stack_limits to be able to find the right lock, each md personality needs to set q->queue_lock to point to the appropriate lock. Those personalities which didn't previously use a spin_lock, us q->__queue_lock. So always initialise that lock when allocated. With this in place, setting/clearing of the QUEUE_FLAG_PLUGGED bit will no longer cause warnings as it will be clear that the proper lock is held. Thanks to Dan Williams for review and fixing the silly bugs. Signed-off-by: NeilBrown <neilb@suse.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Alistair John Strachan <alistair@devzero.co.uk> Cc: Nick Piggin <npiggin@suse.de> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Jacek Luczak <difrost.kernel@gmail.com> Cc: Prakash Punnoor <prakash@punnoor.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	block: remove remaining __FUNCTION__ occurrences	Harvey Harrison	2008-05-01	1	-10/+10
\| \| \| \| \| \| \| \| \|	__FUNCTION__ is gcc specific, use __func__ Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	block: make queue flags non-atomic	Nick Piggin	2008-04-29	1	-1/+1
\| \| \| \| \| \| \|	We can save some atomic ops in the IO path, if we clearly define the rules of how to modify the queue flags. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	unexport blk_max_pfn	Adrian Bunk	2008-04-29	1	-1/+0
\| \| \| \| \| \| \|	blk_max_pfn can now be unexported. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	Fix bounce setting for 64-bit	Andrea Arcangeli	2008-04-02	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Looking a bit closer into this regression the reason this can't be right is that dma_addr common default is BLK_BOUNCE_HIGH and most machines have less than 4G. So if you do: if (b_pfn <= (min_t(u64, 0xffffffff, BLK_BOUNCE_HIGH) >> PAGE_SHIFT)) dma = 1 that will translate to: if (BLK_BOUNCE_HIGH <= BLK_BOUNCE_HIGH) dma = 1 So for 99% of hardware this will trigger unnecessary GFP_DMA allocations and isa pooling operations. Also note how the 32bit code still does b_pfn < blk_max_low_pfn. I guess this is what you were looking after. I didn't verify but as far as I can tell, this will stop the regression with isa dma operations at boot for 99% of blkdev/memory combinations out there and I guess this fixes the setups with >4G of ram and 32bit pci cards as well (this also retains symmetry with the 32bit code). Signed-off-by: Andrea Arcangeli <andrea@qumranet.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	block: remove extern on function definition	Harvey Harrison	2008-03-04	1	-1/+1
\| \| \| \| \| \| \| \|	Intoduced between 2.6.25-rc2 and -rc3 block/blk-settings.c:319:12: warning: function 'blk_queue_dma_drain' with external linkage has definition Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
*	Fix DMA access of block device in 64-bit kernel on some non-x86 systems with ↵	Yang Shi	2008-03-04	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	4GB or upper 4GB memory For some non-x86 systems with 4GB or upper 4GB memory, we need increase the range of addresses that can be used for direct DMA in 64-bit kernel. Signed-off-by: Yang Shi <yang.shi@windriver.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>