linux - linux

	Commit message	Author	Age	Files	Lines
*	[IOAT]: Remove redundant struct member to avoid descriptor cache miss	Shannon Nelson	2007-08-15	1	-2/+1
	The layout for struct ioat_desc_sw is non-optimal and causes an extra cache hit for every descriptor processed. By tightening up the struct layout and removing one item, we pull in the fields that get used in the speedpath and get a little better performance.

	Before:
	-------
	struct ioat_desc_sw {
		struct ioat_dma_descriptor * hw;          /*     0     8 */
		struct list_head           node;          /*     8    16 */
		int                        tx_cnt;        /*    24     4 */

		/* XXX 4 bytes hole, try to pack */

		dma_addr_t                 src;           /*    32     8 */
		__u32                      src_len;       /*    40     4 */

		/* XXX 4 bytes hole, try to pack */

		dma_addr_t                 dst;           /*    48     8 */
		__u32                      dst_len;       /*    56     4 */

		/* XXX 4 bytes hole, try to pack */

		/* --- cacheline 1 boundary (64 bytes) --- */
		struct dma_async_tx_descriptor async_tx;  /*    64   144 */
		/* --- cacheline 3 boundary (192 bytes) was 16 bytes ago --- */

		/* size: 208, cachelines: 4 */
		/* sum members: 196, holes: 3, sum holes: 12 */
		/* last cacheline: 16 bytes */
	};	/* definitions: 1 */

	After:
	------
	struct ioat_desc_sw {
		struct ioat_dma_descriptor * hw;          /*     0     8 */
		struct list_head           node;          /*     8    16 */
		int                        tx_cnt;        /*    24     4 */
		__u32                      len;           /*    28     4 */
		dma_addr_t                 src;           /*    32     8 */
		dma_addr_t                 dst;           /*    40     8 */
		struct dma_async_tx_descriptor async_tx;  /*    48   144 */
		/* --- cacheline 3 boundary (192 bytes) --- */

		/* size: 192, cachelines: 3 */
	};	/* definitions: 1 */

	Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
	Signed-off-by: David S. Miller <davem@davemloft.net>
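	The layout arithmetic in the commit above can be checked with a small user-space sketch. It is not the driver's code: the kernel types are replaced by stand-ins (a 64-bit dma_addr_t, a two-pointer list_head, and an opaque 144-byte dma_async_tx_descriptor), the _before/_after struct names and the cachelines() helper are hypothetical, and the sizes only match on a 64-bit LP64 build such as x86-64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for kernel types (assumptions, sized for a 64-bit LP64 build). */
typedef uint64_t dma_addr_t;
typedef uint32_t __u32;
struct list_head { struct list_head *next, *prev; };      /* 16 bytes */
struct ioat_dma_descriptor;                               /* opaque here */
struct dma_async_tx_descriptor { uint64_t opaque[18]; };  /* 144 bytes */

/* Layout before the change: each 4-byte member is followed by a 4-byte hole. */
struct ioat_desc_sw_before {
	struct ioat_dma_descriptor *hw;
	struct list_head node;
	int tx_cnt;
	dma_addr_t src;
	__u32 src_len;
	dma_addr_t dst;
	__u32 dst_len;
	struct dma_async_tx_descriptor async_tx;
};

/* Layout after the change: one length field, packed next to tx_cnt. */
struct ioat_desc_sw_after {
	struct ioat_dma_descriptor *hw;
	struct list_head node;
	int tx_cnt;
	__u32 len;
	dma_addr_t src;
	dma_addr_t dst;
	struct dma_async_tx_descriptor async_tx;
};

/* Number of 64-byte cache lines needed to hold an object of the given size. */
static size_t cachelines(size_t bytes)
{
	return (bytes + 63) / 64;
}
```

	With these stand-ins, sizeof and offsetof reproduce the offsets listed above: async_tx moves from offset 64 to 48, and the struct shrinks from four cache lines to three.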
*	dmaengine: make clients responsible for managing channels	Dan Williams	2007-07-13	1	-3/+0
	The current implementation assumes that a channel will only be used by one client at a time. In order to enable channel sharing the dmaengine core is changed to a model where clients subscribe to channel-available-events. Instead of tracking how many channels a client wants and how many it has received the core just broadcasts the available channels and lets the clients optionally take a reference. The core learns about the clients' needs at dma_event_callback time.

	In support of multiple operation types, clients can specify a capability mask to only be notified of channels that satisfy a certain set of capabilities.

	Changelog:
	* removed DMA_TX_ARRAY_INIT, no longer needed
	* dma_client_chan_free -> dma_chan_release: switch to global reference counting only at device unregistration time, before it was also happening at client unregistration time
	* clients now return dma_state_client to dmaengine (ack, dup, nak)
	* checkpatch.pl fixes
	* fixup merge with git-ioat

	Cc: Chris Leech <christopher.leech@intel.com>
	Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
	Signed-off-by: Dan Williams <dan.j.williams@intel.com>
	Acked-by: David S. Miller <davem@davemloft.net>
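	The broadcast model described in the commit above might look roughly like the following user-space toy. Every toy_* name is hypothetical and none of this is the dmaengine API. The meaning given to each return value is also an assumption for illustration: ACK takes a reference, DUP declines this channel, NAK asks for no further channels.

```c
#include <assert.h>
#include <stddef.h>

/* Toy capability bits and client answers (hypothetical names). */
enum { TOY_CAP_MEMCPY = 1u << 0, TOY_CAP_XOR = 1u << 1 };
enum toy_state_client { TOY_ACK, TOY_DUP, TOY_NAK };

struct toy_chan {
	unsigned caps;   /* operations this channel supports */
	int refcount;    /* references taken by clients */
};

struct toy_client {
	unsigned cap_mask;   /* only notify about channels with all these caps */
	enum toy_state_client (*event_callback)(struct toy_client *client,
						struct toy_chan *chan);
	int channels_held;
	int channels_wanted;
};

/* Example callback: take channels until the client has as many as it wants. */
static enum toy_state_client toy_take_if_needed(struct toy_client *client,
						struct toy_chan *chan)
{
	(void)chan;
	if (client->channels_held >= client->channels_wanted)
		return TOY_NAK;
	client->channels_held++;
	return TOY_ACK;
}

/*
 * The core does not track what each client wants. It offers every channel
 * that satisfies the client's capability mask and learns the client's needs
 * from the callback's return value.
 */
static void toy_broadcast(struct toy_client *clients, size_t nclients,
			  struct toy_chan *chans, size_t nchans)
{
	for (size_t i = 0; i < nclients; i++) {
		struct toy_client *c = &clients[i];

		for (size_t j = 0; j < nchans; j++) {
			enum toy_state_client state;

			if ((chans[j].caps & c->cap_mask) != c->cap_mask)
				continue;   /* channel lacks a required capability */
			state = c->event_callback(c, &chans[j]);
			if (state == TOY_ACK)
				chans[j].refcount++;   /* channel may now be shared */
			else if (state == TOY_NAK)
				break;   /* client wants no more channels */
		}
	}
}
```

	Because a reference is taken per acknowledging client, two clients can end up holding the same channel, which is the sharing the commit message sets out to enable.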
*	dmaengine: refactor dmaengine around dma_async_tx_descriptor	Dan Williams	2007-07-13	1	-7/+6
	The current dmaengine interface defines multiple routines per operation, i.e. dma_async_memcpy_buf_to_buf, dma_async_memcpy_buf_to_page etc. Adding more operation types (xor, crc, etc) to this model would result in an unmanageable number of method permutations.

		Are we really going to add a set of hooks for each DMA engine whizbang feature?
			- Jeff Garzik

	The descriptor creation process is refactored using the new common dma_async_tx_descriptor structure. Instead of per driver do_<operation>_<dest>_to_<src> methods, drivers integrate dma_async_tx_descriptor into their private software descriptor and then define a 'prep' routine per operation. The prep routine allocates a descriptor and ensures that the tx_set_src, tx_set_dest, tx_submit routines are valid. Descriptor creation and submission becomes:

		struct dma_device *dev;
		struct dma_chan *chan;
		struct dma_async_tx_descriptor *tx;

		tx = dev->device_prep_dma_<operation>(chan, len, int_flag)
		tx->tx_set_src(dma_addr_t, tx, index /* for multi-source ops */)
		tx->tx_set_dest(dma_addr_t, tx, index)
		tx->tx_submit(tx)

	In addition to the refactoring, dma_async_tx_descriptor also lays the groundwork for defining cross-channel-operation dependencies, and a callback facility for asynchronous notification of operation completion.

	Changelog:
	* drop dma mapping methods, suggested by Chris Leech
	* fix ioat_dma_dependency_added, also caught by Andrew Morton
	* fix dma_sync_wait, change from Andrew Morton
	* uninline large functions, change from Andrew Morton
	* add tx->callback = NULL to dmaengine calls to interoperate with async_tx calls
	* hookup ioat_tx_submit
	* convert channel capabilities to a 'cpumask_t like' bitmap
	* removed DMA_TX_ARRAY_INIT, no longer needed
	* checkpatch.pl fixes
	* make set_src, set_dest, and tx_submit descriptor specific methods
	* fixup git-ioat merge
	* move group_list and phys to dma_async_tx_descriptor

	Cc: Jeff Garzik <jeff@garzik.org>
	Cc: Chris Leech <christopher.leech@intel.com>
	Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
	Signed-off-by: Dan Williams <dan.j.williams@intel.com>
	Acked-by: David S. Miller <davem@davemloft.net>
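	The prep, set_src, set_dest, submit sequence from the commit above can be modeled in user space as follows. This is a sketch under stated assumptions, not the kernel interface: all toy_* types and functions are hypothetical, addresses are plain pointers cast to an integer type instead of real DMA addresses, and "submission" is a synchronous memcpy rather than a hardware operation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef uintptr_t toy_dma_addr_t;   /* stand-in for dma_addr_t */

struct toy_chan { int submitted; };

/* Common descriptor: per-descriptor methods, as in the refactored model. */
struct toy_tx {
	struct toy_chan *chan;
	toy_dma_addr_t src, dest;
	size_t len;
	void (*tx_set_src)(toy_dma_addr_t addr, struct toy_tx *tx, int index);
	void (*tx_set_dest)(toy_dma_addr_t addr, struct toy_tx *tx, int index);
	int (*tx_submit)(struct toy_tx *tx);
};

/* One 'prep' routine per operation instead of one routine per permutation. */
struct toy_device {
	struct toy_tx *(*device_prep_dma_memcpy)(struct toy_chan *chan,
						 size_t len, int int_en);
};

static void toy_set_src(toy_dma_addr_t addr, struct toy_tx *tx, int index)
{
	(void)index;   /* a memcpy has a single source */
	tx->src = addr;
}

static void toy_set_dest(toy_dma_addr_t addr, struct toy_tx *tx, int index)
{
	(void)index;
	tx->dest = addr;
}

static int toy_submit(struct toy_tx *tx)
{
	if (!tx->src || !tx->dest)
		return -1;   /* source or destination was never set */
	memcpy((void *)tx->dest, (const void *)tx->src, tx->len);
	tx->chan->submitted++;
	return 0;
}

/* Allocates a descriptor and makes sure all three methods are valid. */
static struct toy_tx *toy_prep_memcpy(struct toy_chan *chan, size_t len,
				      int int_en)
{
	struct toy_tx *tx = calloc(1, sizeof(*tx));

	(void)int_en;   /* the toy never raises a completion interrupt */
	if (!tx)
		return NULL;
	tx->chan = chan;
	tx->len = len;
	tx->tx_set_src = toy_set_src;
	tx->tx_set_dest = toy_set_dest;
	tx->tx_submit = toy_submit;
	return tx;
}

/* Caller side: the same four steps for any operation type. */
static int toy_copy(struct toy_device *dev, struct toy_chan *chan,
		    void *dest, const void *src, size_t len)
{
	struct toy_tx *tx = dev->device_prep_dma_memcpy(chan, len, 0);
	int ret;

	if (!tx)
		return -1;
	tx->tx_set_src((toy_dma_addr_t)src, tx, 0);
	tx->tx_set_dest((toy_dma_addr_t)dest, tx, 0);
	ret = tx->tx_submit(tx);
	free(tx);
	return ret;
}
```

	Adding another operation in this model means adding one more prep routine to the device structure, while the caller-side sequence stays the same.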
*	[PATCH] drivers/dma trivial annotations	Al Viro	2006-10-11	1	-2/+2
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
	Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[I/OAT]: Move PCI_DEVICE_ID_INTEL_IOAT to linux/pci_ids.h	David S. Miller	2006-06-18	1	-2/+1
	Signed-off-by: David S. Miller <davem@davemloft.net>
*	[I/OAT]: Driver for the Intel(R) I/OAT DMA engine	Chris Leech	2006-06-18	1	-0/+126
	Adds a new ioatdma driver

	Signed-off-by: Chris Leech <christopher.leech@intel.com>
	Signed-off-by: David S. Miller <davem@davemloft.net>