summaryrefslogtreecommitdiffstats
path: root/drivers/dma/iop-adma.c (follow)
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'dmaengine' into async-tx-nextDan Williams2009-09-091-4/+5
|\ | | | | | | | | | | | | | | Conflicts: crypto/async_tx/async_xor.c drivers/dma/ioat/dma_v2.h drivers/dma/ioat/pci.c drivers/md/raid5.c
| * iop-adma: implement a private tx_listDan Williams2009-09-091-4/+5
| | | | | | | | | | | | | | | | Drop iop-adma's use of tx_list from struct dma_async_tx_descriptor in preparation for removal of this field. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* | Merge branch 'iop-raid6' into async-tx-nextDan Williams2009-09-091-44/+393
|\ \
| * | iop-adma: P+Q self testDan Williams2009-08-301-1/+181
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Even though the intent is to extend dmatest with P+Q tests there is still value in having an always-on sanity check to prevent an unintentionally broken driver from registering. This depends on raid6_pq.ko for verification, the side effect being that PQ capable channels will fail to register when raid6 is disabled. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
| * | iop-adma: P+Q support for iop13xx adma enginesDan Williams2009-08-301-34/+197
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | iop33x support is not included because that engine is a bit more awkward to handle in that it can either be in xor mode or pq mode. The dmaengine/async_tx layers currently only comprehend static capabilities. Note iop13xx does not support hardware PQ continuation so the driver must handle the DMA_PREP_CONTINUE flag for operations across > 16 sources. From the comment for dma_maxpq: /* When an engine does not support native continuation we need 3 extra * source slots to reuse P and Q with the following coefficients: * 1/ {00} * P : remove P from Q', but use it as a source for P' * 2/ {01} * Q : use Q to continue Q' calculation * 3/ {00} * Q : subtract Q from P' to cancel (2) */ Signed-off-by: Dan Williams <dan.j.williams@intel.com>
| * | iop-adma: fix lockdep false positiveDan Williams2009-08-301-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | lockdep correctly identifies a potential recursive locking case for iop_chan->lock, but in the dependency submission case we expect that the same class will be acquired for both the parent dependency and the child channel. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
| * | iop-adma: cleanup iop_adma_run_tx_complete_actionsDan Williams2009-08-301-9/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace 'desc->async_tx.' with 'tx->' [ Impact: pure cleanup ] Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* | | dmaengine: cleanup unused transaction typesDan Williams2009-09-091-4/+1
|/ / | | | | | | | | | | | | No drivers currently implement these operation types, so they can be deleted. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* | async_tx: add support for asynchronous GF multiplicationDan Williams2009-08-301-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Based on an original patch by Yuri Tikhonov ] This adds support for doing asynchronous GF multiplication by adding two additional functions to the async_tx API: async_gen_syndrome() does simultaneous XOR and Galois field multiplication of sources. async_syndrome_val() validates the given source buffers against known P and Q values. When a request is made to run async_pq against more than the hardware maximum number of supported sources we need to reuse the previous generated P and Q values as sources into the next operation. Care must be taken to remove Q from P' and P from Q'. For example to perform a 5 source pq op with hardware that only supports 4 sources at a time the following approach is taken: p, q = PQ(src0, src1, src2, src3, COEF({01}, {02}, {04}, {08})) p', q' = PQ(p, q, q, src4, COEF({00}, {01}, {00}, {10})) p' = p + q + q + src4 = p + src4 q' = {00}*p + {01}*q + {00}*q + {10}*src4 = q + {10}*src4 Note: 4 is the minimum acceptable maxpq otherwise we punt to synchronous-software path. The DMA_PREP_CONTINUE flag indicates to the driver to reuse p and q as sources (in the above manner) and fill the remaining slots up to maxpq with the new sources/coefficients. Note1: Some devices have native support for P+Q continuation and can skip this extra work. Devices with this capability can advertise it with dma_set_maxpq. It is up to each driver how to handle the DMA_PREP_CONTINUE flag. Note2: The api supports disabling the generation of P when generating Q, this is ignored by the synchronous path but is implemented by some dma devices to save unnecessary writes. In this case the continuation algorithm is simplified to only reuse Q as a source. Cc: H. Peter Anvin <hpa@zytor.com> Cc: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: Yuri Tikhonov <yur@emcraft.com> Signed-off-by: Ilya Yanok <yanok@emcraft.com> Reviewed-by: Andre Noll <maan@systemlinux.org> Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* | async_tx: rename zero_sum to valDan Williams2009-04-081-19/+19
|/ | | | | | | | | | | | | 'zero_sum' does not properly describe the operation of generating parity and checking that it validates against an existing buffer. Change the name of the operation to 'val' (for 'validate'). This is in anticipation of the p+q case where it is a requirement to identify the target parity buffers separately from the source buffers, because the target parity buffers will not have corresponding pq coefficients. Reviewed-by: Andre Noll <maan@systemlinux.org> Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* dmaengine: initialize tx_list in dma_async_tx_descriptor_initDan Williams2009-03-251-1/+0
| | | | | | | | Centralize this common initialization (and one case where ipu_idmac is duplicating ->chan initialization). Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* Merge branch 'fixes' of ↵Linus Torvalds2009-03-081-8/+8
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx: dmatest: fix use after free in dmatest_exit ipu_idmac: fix spinlock type iop-adma, mv_xor: fix mem leak on self-test setup failure fsldma: fix off by one in dma_halt I/OAT: fail self-test if callback test reaches timeout I/OAT: update driver version and copyright dates I/OAT: list usage cleanup I/OAT: set tcp_dma_copybreak to 256k for I/OAT ver.3 I/OAT: cancel watchdog before dma remove I/OAT: fail initialization on zero channels detection I/OAT: do not set DCACTRL_CMPL_WRITE_ENABLE for I/OAT ver.3 I/OAT: add verification for proper APICID_TAG_MAP setting by BIOS dmaengine: update kerneldoc
| * iop-adma, mv_xor: fix mem leak on self-test setup failureRoel Kluin2009-03-051-8/+8
| | | | | | | | | | | | | | | | | | iop_adma_zero_sum_self_test has the brackets in the wrong place for the setup failure deallocation path. This error was duplicated in mv_xor_xor_self_test. Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* | [ARM] fix lots of ARM __devexit sillynessRussell King2009-03-031-1/+1
|/ | | | | | | | | | | | `iop_adma_remove' referenced in section `.data' of drivers/built-in.o: defined in discarded section `.devexit.text' of drivers/built-in.o `mv_xor_remove' referenced in section `.data' of drivers/built-in.o: defined in discarded section `.devexit.text' of drivers/built-in.o `mv64xxx_i2c_unmap_regs' referenced in section `.devinit.text' of drivers/built-in.o: defined in discarded section `.devexit.text' of drivers/built-in.o `mv64xxx_i2c_remove' referenced in section `.data' of drivers/built-in.o: defined in discarded section `.devexit.text' of drivers/built-in.o `orion_nand_remove' referenced in section `.data' of drivers/built-in.o: defined in discarded section `.devexit.text' of drivers/built-in.o `pxafb_remove' referenced in section `.data' of drivers/built-in.o: defined in discarded section `.devexit.text' of drivers/built-in.o Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* iop-adma: enable module removalDan Williams2009-01-061-4/+0
| | | | | | Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* iop-adma: kill debug BUG_ONDan Williams2009-01-061-2/+0
| | | | | | | | | This BUG_ON caught problems in early development but now it is in the way as it invalidly triggers when trying to remove the module. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* iop-adma: let devm do its job, don't duplicate freeDan Williams2009-01-061-13/+0
| | | | | | | | No need to free stuff that the devm infrastructure will take care of... Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* dmaengine: remove 'bigref' infrastructureDan Williams2009-01-061-1/+0
| | | | | | | | | | Reference counting is done at the module level so clients need not worry that a channel will leave while they are actively using dmaengine. Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* dmaengine: kill struct dma_client and supporting infrastructureDan Williams2009-01-061-4/+3
| | | | | | | | | | | | All users have been converted to either the general-purpose allocator, dma_find_channel, or dma_request_channel. Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* dmaengine: remove dependency on async_txDan Williams2009-01-061-2/+1
| | | | | | | | | | | | | | async_tx.ko is a consumer of dma channels. A circular dependency arises if modules in drivers/dma rely on common code in async_tx.ko. It prevents either module from being unloaded. Move dma_wait_for_async_tx and async_tx_run_dependencies to dmaeninge.o where they should have been from the beginning. Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* async_xor: dma_map destination DMA_BIDIRECTIONALDan Williams2008-12-081-3/+13
| | | | | | | | | | | | | | Mapping the destination multiple times is a misuse of the dma-api. Since the destination may be reused as a source, ensure that it is only mapped once and that it is mapped bidirectionally. This appears to add ugliness on the unmap side in that it always reads back the destination address from the descriptor, but gcc can determine that dma_unmap is a nop and not emit the code that calculates its arguments. Cc: <stable@kernel.org> Cc: Saeed Bishara <saeed@marvell.com> Acked-by: Yuri Tikhonov <yur@emcraft.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* iop-adma: use iop_paranoia() for debug BUG_ONsDan Williams2008-11-111-1/+1
| | | | | | | | Now that the critical read back to flush the next descriptor address is fixed we can downgrade some BUG_ONs that need only be enabled when testing changes to the driver. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* iop-adma: add a dummy read to flush next descriptor updateDan Williams2008-11-111-4/+5
| | | | | | | | | | | The current dummy read references the wrong address allowing the next descriptor address update to linger in the store buffer and get passed by an 'append' event. This issue was uncovered by the change from strongly-ordered to device memory for the adma registers. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* [ARM] Move include/asm-arm/arch-* to arch/arm/*/include/machRussell King2008-08-071-1/+1
| | | | | | This just leaves include/asm-arm/plat-* to deal with. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* iop_adma: document how to calculate the minimum descriptor pool sizeDan Williams2008-07-181-1/+10
| | | | Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* iop_adma: directly reclaim descriptors on allocation failureDan Williams2008-07-181-2/+2
| | | | | | | | Force callers that trigger an "out of descriptors" condition to run the cleanup loop directly. Alleviates the requirement to have soft-irqs enabled when polling for a descriptor in async_xor. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* dmaengine: add DMA_COMPL_SKIP_{SRC,DEST}_UNMAP flags to control dma unmapDan Williams2008-07-081-11/+18
| | | | | | | | | | | | | | In some cases client code may need the dma-driver to skip the unmap of source and/or destination buffers. Setting these flags indicates to the driver to skip the unmap step. In this regard async_xor is currently broken in that it allows the destination buffer to be unmapped while an operation is still in progress, i.e. when the number of sources exceeds the hardware channel's maximum (fixed in a subsequent patch). Acked-by: Saeed Bishara <saeed@marvell.com> Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Acked-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* dmaengine: Add dma_client parameter to device_alloc_chan_resourcesHaavard Skinnemoen2008-07-081-3/+4
| | | | | | | | | | | | | | | | A DMA controller capable of doing slave transfers may need to know a few things about the slave when preparing the channel. We don't want to add this information to struct dma_channel since the channel hasn't yet been bound to a client at this point. Instead, pass a reference to the client requesting the channel to the driver's device_alloc_chan_resources hook so that it can pick the necessary information from the dma_client struct by itself. [dan.j.williams@intel.com: fixed up fsldma and mv_xor] Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* iop-adma: fix platform driver hotplug/coldplugKay Sievers2008-07-081-0/+2
| | | | | | | | | | | | Since 43cc71eed1250755986da4c0f9898f9a635cb3bf, the platform modalias is prefixed with "platform:". Add MODULE_ALIAS() to most of the hotpluggable platform drivers, to re-enable auto loading. Cc: <stable@kernel.org> Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* iop-adma: fixup some kzalloc/memset confusionsChristophe Jaillet2008-05-201-4/+2
| | | | | | | | | | | | | | 1) Remove an explicit memset(.., 0, ...) to a variable allocated with kzalloc (i.e. 'dest'). 2) Allocate 'src' with kmalloc instead of kzalloc as all elements of the 'src' buffer are initialized in a 'for(...)' loop just after. 3) remove useless 'sizeof(u8)', which always returns 1, when computing the size of the memory to be allocated. Signed-off-by: Christophe Jaillet <christophe.jaillet@wanadoo.fr> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* dmaengine: ack to flags: make use of the unused bits in the 'ack' fieldDan Williams2008-04-171-18/+21
| | | | | | | | | | | | | | | | | | 'ack' is currently a simple integer that flags whether or not a client is done touching fields in the given descriptor. It is effectively just a single bit of information. Converting this to a flags parameter allows the other bits to be put to use to control completion actions, like dma-unmap, and capture results, like xor-zero-sum == 0. Changes are one of: 1/ convert all open-coded ->ack manipulations to use async_tx_ack and async_tx_test_ack. 2/ set the ack bit at prep time where possible 3/ make drivers store the flags at prep time 4/ add flags to the device_prep_dma_interrupt prototype Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* iop-adma: remove the workaround for missed interrupts on iop3xxDan Williams2008-04-171-5/+0
| | | | | | This workaround was covering the dependency submission bug in async_tx. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* async_tx: kill ->device_dependency_addedDan Williams2008-04-171-7/+0
| | | | | | | | | | DMA drivers no longer need to be notified of dependency submission events as async_tx_run_dependencies and async_tx_channel_switch will handle the scheduling and execution of dependent operations. [sfr@canb.auug.org.au: extend this for fsldma] Acked-by: Shannon Nelson <shannon.nelson@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* async_tx: fix multiple dependency submissionDan Williams2008-04-171-4/+5
| | | | | | | | | | Shrink struct dma_async_tx_descriptor and introduce async_tx_channel_switch to properly inject a channel switch interrupt in the descriptor stream. This simplifies the locking model as drivers no longer need to handle dma_async_tx_descriptor.lock. Acked-by: Shannon Nelson <shannon.nelson@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* iop-adma.c: replace remaining __FUNCTION__ occurrencesHarvey Harrison2008-03-131-16/+16
| | | | | | | __FUNCTION__ is gcc-specific, use __func__ Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* async_tx: replace 'int_en' with operation preparation flagsDan Williams2008-02-061-10/+10
| | | | | | | | | | | Pass a full set of flags to drivers' per-operation 'prep' routines. Currently the only flag passed is DMA_PREP_INTERRUPT. The expectation is that arch-specific async_tx_find_channel() implementations can exploit this capability to find the best channel for an operation. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Shannon Nelson <shannon.nelson@intel.com> Reviewed-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
* async_tx: kill tx_set_src and tx_set_dest methodsDan Williams2008-02-061-81/+43
| | | | | | | | | | | | | | | | | | | | | | The tx_set_src and tx_set_dest methods were originally implemented to allow an array of addresses to be passed down from async_xor to the dmaengine driver while minimizing stack overhead. Removing these methods allows drivers to have all transaction parameters available at 'prep' time, saves two function pointers in struct dma_async_tx_descriptor, and reduces the number of indirect branches.. A consequence of moving this data to the 'prep' routine is that multi-source routines like async_xor need temporary storage to convert an array of linear addresses into an array of dma addresses. In order to keep the same stack footprint of the previous implementation the input array is reused as storage for the dma addresses. This requires that sizeof(dma_addr_t) be less than or equal to sizeof(void *). As a consequence CONFIG_DMADEVICES now depends on !CONFIG_HIGHMEM64G. It also requires that drivers be able to make descriptor resources available when the 'prep' routine is polled. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Shannon Nelson <shannon.nelson@intel.com>
* iop-adma: use LIST_HEAD instead of LIST_HEAD_INITDenis Cheng2008-02-061-1/+1
| | | | | | | | these three list_head are all local variables, but can also use LIST_HEAD. Signed-off-by: Denis Cheng <crquan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* Remove "unsafe" from module structRusty Russell2007-10-171-5/+4
| | | | | | | | | | | | | | | | | | | | Adrian Bunk points out that "unsafe" was used to mark modules touched by the deprecated MOD_INC_USE_COUNT interface, which has long gone. It's time to remove the member from the module structure, as well. If you want a module which can't unload, don't register an exit function. (Vlad Yasevich says SCTP is now safe to unload, so just remove the __unsafe there). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Shannon Nelson <shannon.nelson@intel.com> Acked-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com> Cc: Sridhar Samudrala <sri@us.ibm.com> Cc: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* dmaengine: driver for the iop32x, iop33x, and iop13xx raid enginesDan Williams2007-07-131-0/+1467
The Intel(R) IOP series of i/o processors integrate an Xscale core with raid acceleration engines. The capabilities per platform are: iop219: (2) copy engines iop321: (2) copy engines (1) xor and block fill engine iop33x: (2) copy and crc32c engines (1) xor, xor zero sum, pq, pq zero sum, and block fill engine iop34x (iop13xx): (2) copy, crc32c, xor, xor zero sum, and block fill engines (1) copy, crc32c, xor, xor zero sum, pq, pq zero sum, and block fill engine The driver supports the features of the async_tx api: * asynchronous notification of operation completion * implicit (interupt triggered) handling of inter-channel transaction dependencies The driver adapts to the platform it is running by two methods. 1/ #include <asm/arch/adma.h> which defines the hardware specific iop_chan_* and iop_desc_* routines as a series of static inline functions 2/ The private platform data attached to the platform_device defines the capabilities of the channels 20070626: Callbacks are run in a tasklet. Given the recent discussion on LKML about killing tasklets in favor of workqueues I did a quick conversion of the driver. Raid5 resync performance dropped from 50MB/s to 30MB/s, so the tasklet implementation remains until a generic softirq interface is available. Changelog: * fixed a slot allocation bug in do_iop13xx_adma_xor that caused too few slots to be requested eventually leading to data corruption * enabled the slot allocation routine to attempt to free slots before returning -ENOMEM * switched the cleanup routine to solely use the software chain and the status register to determine if a descriptor is complete. This is necessary to support other IOP engines that do not have status writeback capability * make the driver iop generic * modified the allocation routines to understand allocating a group of slots for a single operation * added a null xor initialization operation for the xor only channel on iop3xx * support xor operations on buffers larger than the hardware maximum * split the do_* routines into separate prep, src/dest set, submit stages * added async_tx support (dependent operations initiation at cleanup time) * simplified group handling * added interrupt support (callbacks via tasklets) * brought the pending depth inline with ioat (i.e. 4 descriptors) * drop dma mapping methods, suggested by Chris Leech * don't use inline in C files, Adrian Bunk * remove static tasklet declarations * make iop_adma_alloc_slots easier to read and remove chances for a corrupted descriptor chain * fix locking bug in iop_adma_alloc_chan_resources, Benjamin Herrenschmidt * convert capabilities over to dma_cap_mask_t * fixup sparse warnings * add descriptor flush before iop_chan_enable * checkpatch.pl fixes * gpl v2 only correction * move set_src, set_dest, submit to async_tx methods * move group_list and phys to async_tx Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Dan Williams <dan.j.williams@intel.com>