summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* dmanegine: ioatdma: remove function ptrs in ioatdma_deviceDave Jiang2015-08-173-32/+13
| | | | | | | | | Since we are a "single" device driver now we no longer require the function pointers in ioatdma_device. Remove. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Acked-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
* dmaengine: ioatdma: removal of dma_v3.c and relevant ioat3 referencesDave Jiang2015-08-176-617/+487
| | | | | | | | | Moving the relevant functions to their respective .c files and removal of dma_v3.c file. Also removed various ioat3 references when appropriate. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Acked-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
* dmaengine: ioatdma: move dma prep functions to single locationDave Jiang2015-08-176-749/+769
| | | | | | | | | Move all DMA descriptor prepping functions to prep.c file. Fixup all broken bits caused by the move. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Acked-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
* dmaengine: ioatdma: move all the init routinesDave Jiang2015-08-176-1372/+1375
| | | | | | | | | Moving all the init routines to init.c and fixup anything broken during the move. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Acked-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
* dmaengine: ioatdma: move all sysfs related codeDave Jiang2015-08-173-109/+136
| | | | | | | | Move and fixup all sysfs related bits to sysfs.c file. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Acked-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
* dmaengine: ioatdma: remove dma_v2.*Dave Jiang2015-08-178-889/+816
| | | | | | | | | Clean out dma_v2 and remove ioat2 calls since we are moving everything to just ioat. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Acked-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
* dmaengine: ioatdma: fixup ioatdma_device namingsDave Jiang2015-08-175-191/+195
| | | | | | | | | | | | Changing the variable names for ioatdma_device to be consistently named ioat_dma instead of device/dma in order to avoid confusion and distinct from struct device. This will clearly indicate that it is an ioatdma_device. This also make all the naming consistent that the dma device is ioat_dma and all the channels are ioat_chan. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Acked-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
* dmaengine: ioatdma: clean up local dma channel data structureDave Jiang2015-08-175-475/+470
| | | | | | | | | | | Kill the common ioatdma channel structure and everything that is not dma_chan to be ioat_dma_chan. Since we don't have to worry about v1 and v2 ioatdma anymore this makes it much cleaner and obvious for maintenance. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Acked-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
* dmaengine: ioatdma: remove ioatdma v2 registrationDave Jiang2015-08-175-428/+1
| | | | | | | | Removal of support for ioatdma v2 device support. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Acked-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
* dmaengine: ioatdma: remove ioat1 specific codeDave Jiang2015-08-174-892/+2
| | | | | | | | Cleaning up of ioat1 specific code as it is no longer supported Signed-off-by: Dave Jiang <dave.jiang@intel.com> Acked-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
* dmaengine: ioatdma: deprecating and removal of old ioatdma devicesDave Jiang2015-08-172-14/+0
| | | | | | | | | | Removal of any devices that are ioatdma pre-3.0. This is the first step in attempting to clean up the ioatdma driver and remove hw no longer supported. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Acked-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
* dmaengine: ioatdma: fix u16 overflow in cleanupAllen Hubbe2015-08-171-1/+1
| | | | | | | | | | | | If the allocation order is 16, then the u16 count will overflow and wrap to zero when assigned the value 1 << 16. Change the type of 'total_descs' to int, so that it is large enough to store a value equal or greater than 1 << 16. Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com> Acked-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
* dmaengine: ioatdma: fix u16 overflow in reshapeAllen Hubbe2015-08-171-1/+1
| | | | | | | | | | | | | | If the allocation order is 16, then the u16 index will overflow and wrap to zero instead of being equal or greater than 1 << 16. The loop condition will always be true, and the loop will run until all the memory resources are depleted. Change the type of index 'i' to u32, so that it is large enough to store a value equal or greater than 1 << 16. Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com> Acked-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
* dmaengine: ipu: Consolidate duplicated irq handlersThomas Gleixner2015-08-061-42/+4
| | | | | | | | | | | The functions irq_irq_err and ipu_irq_fn are identical plus/minus the comments. Remove one. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Vinod Koul <vinod.koul@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: dmaengine@vger.kernel.org Signed-off-by: Vinod Koul <vinod.koul@intel.com>
* dmaengine: ipu: Prepare irq handlers for irq argument removalThomas Gleixner2015-08-061-2/+4
| | | | | | | | | | | | | | | | The irq argument of most interrupt flow handlers is unused or merily used instead of a local variable. The handlers which need the irq argument can retrieve the irq number from the irq descriptor. Search and update was done with coccinelle and the invaluable help of Julia Lawall. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Julia Lawall <Julia.Lawall@lip6.fr> Cc: Vinod Koul <vinod.koul@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: dmaengine@vger.kernel.org Signed-off-by: Vinod Koul <vinod.koul@intel.com>
* dmaengine: xdmac: Add scatter gathered memset supportMaxime Ripard2015-08-061-1/+165
| | | | | | | | | The XDMAC also supports memset operations over discontiguous areas. Add the necessary logic to support this. Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com> Acked-by: Ludovic Desroches <ludovic.desroches@atmel.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
* dmaengine: Add scatter-gathered memsetMaxime Ripard2015-08-061-0/+5
| | | | | | | | | | | | | | | | | The current API allows the driver to accelerate memset by using the DMA controller. However, it does so over a contiguous memory area, which might proves inefficient when you have to do it over a non-contiguous yet repititive pattern, since you have to create a number of descriptors and then submit each other. Add a memset operation going over a scatter list to handle such cases in a single call. Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com> Acked-by: Ludovic Desroches <ludovic.desroches@atmel.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
* dmaengine: Add an enum for the dmaengine alignment constraintsMaxime Ripard2015-08-0512-21/+32
| | | | | | | | | | | | | | Most drivers need to set constraints on the buffer alignment for async tx operations. However, even though it is documented, some drivers either use a defined constant that is not matching what the alignment variable expects (like DMA_BUSWIDTH_* constants) or fill the alignment in bytes instead of power of two. Add a new enum for these alignments that matches what the framework expects, and convert the drivers to it. Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
* dmaengine: shdma: Make dummy shdma_chan_filter() always return falseGeert Uytterhoeven2015-08-051-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If CONFIG_SH_DMAE_BASE (which is required for DMA engine support for legacy SH, SH/R-Mobile, and R-Car Gen1, but not for R-Car Gen2) is not enabled, but CONFIG_RCAR_DMAC (for R-Car Gen2 DMA engine support) is, and the DTS doesn't provide a "dmas" property for a device, dma_request_slave_channel_compat() incorrectly succeeds, and returns a DMA channel. However, when trying to use that DMA channel later, it fails with: rcar-dmac e6700000.dma-controller: rcar_dmac_prep_slave_sg: bad parameter: len=1, id=-22 (Fortunately most drivers can handle this failure, and fall back to PIO) The reason for this is that a NULL legacy filter function is used, which actually means "all channels are OK", not "do not match". If CONFIG_SH_DMAE_BASE is enabled (like in shmobile_defconfig, which supports other SoCs besides R-Car Gen2), shdma_chan_filter() correctly returns false, as no available channel on R-Car Gen2 matches a shdma-base channel. If the DTS does provide a "dmas" property, dma_request_slave_channel() succeeds, and legacy filter-based matching is not used. To fix this, change shdma_chan_filter from being NULL to a dummy function that always returns false, like is done on other platforms. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
* dmaengine: imx-sdma: Add device to device supportShengjiu Wang2015-08-051-12/+140
| | | | | | | | | | | This patch adds DEV_TO_DEV support for i.MX SDMA driver to support data transfer between two peripheral FIFOs. The per_2_per script requires two peripheral addresses and two DMA requests, and it need to check the src addr and dst addr is in the SPBA bus space or in the AIPS bus space. Signed-off-by: Shengjiu Wang <shengjiu.wang@freescale.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
* dmaengine: ipu: Use irq_desc_get_xxx() to avoid redundant lookup of irq_descJiang Liu2015-07-162-3/+3
| | | | | | | | | | | | | | | Use irq_desc_get_xxx() to avoid redundant lookup of irq_desc while we already have a pointer to corresponding irq_desc. This is also a preparation for the removal of the 'irq' argument from interrupt flow handlers. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Vinod Koul <vinod.koul@intel.com> Cc: dmaengine@vger.kernel.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
* dmaengine: ipu: Consolidate chained IRQ handler install/removeThomas Gleixner2015-07-161-8/+4
| | | | | | | | | | | | | | | | Chained irq handlers usually set up handler data as well. We now have a function to set both under irq_desc->lock. Replace the two calls with one. Search and conversion was done with coccinelle. Reported-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Julia Lawall <Julia.Lawall@lip6.fr> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Vinod Koul <vinod.koul@intel.com> Cc: dmaengine@vger.kernel.org Signed-off-by: Vinod Koul <vinod.koul@intel.com>
* dmaengine: hsu: remove excessive lockAndy Shevchenko2015-07-162-36/+4
| | | | | | | | | All hardware accesses are done under virtual channel lock. That's why specific channel lock is excessive and can be removed safely. This has been tested on Intel Medfield and Merrifield. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
* dmaengine: ioatdma: Ignore IOAT devices under hotplug-capable PCI host bridgeJiang Liu2015-07-161-0/+34
| | | | | | | | | | | | | | | | | | | | | The dmaengine core assumes that async DMA devices will only be removed when they not used anymore, or it assumes dma_async_device_unregister() will only be called by dma driver exit routines. But this assumption is not true for the IOAT driver, which calls dma_async_device_unregister() from ioat_remove(). So current IOAT driver doesn't support device hot-removal because it may cause system crash to hot-remove an inuse IOAT device. To support CPU socket hot-removal, all PCI devices, including IOAT devices embedded in the socket, will be hot-removed. The idea solution is to enhance the dmaengine core and IOAT driver to support hot-removal, but that's too hard. This patch implements a hack to disable IOAT devices under hotplug-capable CPU socket so it won't break socket hot-removal. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
* dmaengine: ioatdma: Set non RAID channels to be private capableDave Jiang2015-07-071-0/+3
| | | | | | | | | | This allows claiming of non-RAID channels as a private channel. This prevents breakage of MDRAID using the IOATDMA channels via async_tx but also allows agents such as NTB to claim channels exclusively for its usages. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
* dmaengine: Use Pointer xt after NULL check.Maninder Singh2015-07-071-3/+3
| | | | | | | | | | | | | Removing static analysis error:- Possible null pointer dereference: xt Because currently xt is dereferenced before NULL check, Thus Use it after NULL Check. Signed-off-by: Maninder Singh <maninder1.s@samsung.com> Reviewed-by: Vaneet Narang <v.narang@samsung.com> Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
* dmaengine: imx-dma: Check for clk_prepare_enable() errorFabio Estevam2015-07-071-9/+14
| | | | | | | | | | clk_prepare_enable() may fail, so we should better check its return value and propagate it in the case of error. While at it, change the label 'err' to a more descriptive naming. Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
* dmaengine: Remove remaining FSF mailing addressesJarkko Nikula2015-07-062-8/+0
| | | | | | | | Commit 3b62286d0ef7 ("dmaengine: Remove FSF mailing addresses") left Free Software Foundation mailing address still in two files. Remove it now. Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
* Linux 4.2-rc1v4.2-rc1Linus Torvalds2015-07-051-2/+2
|
* Merge tag 'platform-drivers-x86-v4.2-2' of ↵Linus Torvalds2015-07-057-45/+1004
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.infradead.org/users/dvhart/linux-platform-drivers-x86 Pull late x86 platform driver updates from Darren Hart: "The following came in a bit later and I wanted them to bake in next a few more days before submitting, thus the second pull. A new intel_pmc_ipc driver, a symmetrical allocation and free fix in dell-laptop, a couple minor fixes, and some updated documentation in the dell-laptop comments. intel_pmc_ipc: - Add Intel Apollo Lake PMC IPC driver tc1100-wmi: - Delete an unnecessary check before the function call "kfree" dell-laptop: - Fix allocating & freeing SMI buffer page - Show info about WiGig and UWB in debugfs - Update information about wireless control" * tag 'platform-drivers-x86-v4.2-2' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86: intel_pmc_ipc: Add Intel Apollo Lake PMC IPC driver tc1100-wmi: Delete an unnecessary check before the function call "kfree" dell-laptop: Fix allocating & freeing SMI buffer page dell-laptop: Show info about WiGig and UWB in debugfs dell-laptop: Update information about wireless control
| * intel_pmc_ipc: Add Intel Apollo Lake PMC IPC driverqipeng.zha2015-06-305-0/+864
| | | | | | | | | | | | | | | | | | | | | | This driver provides support for PMC control on Apollo Lake platforms. The PMC is an ARC processor which defines some IPC commands for communication with other entities in the CPU. Signed-off-by: qipeng.zha <qipeng.zha@intel.com> [fengguang.wu@intel.com: Fix Sparse and Cocinelle warnings] Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Darren Hart <dvhart@linux.intel.com>
| * tc1100-wmi: Delete an unnecessary check before the function call "kfree"Markus Elfring2015-06-271-1/+1
| | | | | | | | | | | | | | | | | | | | The kfree() function tests whether its argument is NULL and then returns immediately. Thus the test around the call is not needed. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Darren Hart <dvhart@linux.intel.com>
| * dell-laptop: Fix allocating & freeing SMI buffer pagePali Rohár2015-06-251-5/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit fix kernel crash when probing for rfkill devices in dell-laptop driver failed. Function free_page() was incorrectly used on struct page * instead of virtual address of SMI buffer. This commit also simplify allocating page for SMI buffer by using __get_free_page() function instead of sequential call of functions alloc_page() and page_address(). Signed-off-by: Pali Rohár <pali.rohar@gmail.com> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: stable@vger.kernel.org Signed-off-by: Darren Hart <dvhart@linux.intel.com>
| * dell-laptop: Show info about WiGig and UWB in debugfsPali Rohár2015-06-221-0/+17
| | | | | | | | | | | | | | | | This commit show additional information about rfkill state in debugfs based on newly released documentation by Dell. Signed-off-by: Pali Rohár <pali.rohar@gmail.com> Signed-off-by: Darren Hart <dvhart@linux.intel.com>
| * dell-laptop: Update information about wireless controlPali Rohár2015-06-221-39/+119
| | | | | | | | | | | | | | | | Make sure that all existing SMBIOS calls for wireless control are properly documented. This commit also add new documentation released by Dell. Signed-off-by: Pali Rohár <pali.rohar@gmail.com> Signed-off-by: Darren Hart <dvhart@linux.intel.com>
* | Merge branch 'for-linus' of ↵Linus Torvalds2015-07-0599-553/+784
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull more vfs updates from Al Viro: "Assorted VFS fixes and related cleanups (IMO the most interesting in that part are f_path-related things and Eric's descriptor-related stuff). UFS regression fixes (it got broken last cycle). 9P fixes. fs-cache series, DAX patches, Jan's file_remove_suid() work" [ I'd say this is much more than "fixes and related cleanups". The file_table locking rule change by Eric Dumazet is a rather big and fundamental update even if the patch isn't huge. - Linus ] * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (49 commits) 9p: cope with bogus responses from server in p9_client_{read,write} p9_client_write(): avoid double p9_free_req() 9p: forgetting to cancel request on interrupted zero-copy RPC dax: bdev_direct_access() may sleep block: Add support for DAX reads/writes to block devices dax: Use copy_from_iter_nocache dax: Add block size note to documentation fs/file.c: __fget() and dup2() atomicity rules fs/file.c: don't acquire files->file_lock in fd_install() fs:super:get_anon_bdev: fix race condition could cause dev exceed its upper limitation vfs: avoid creation of inode number 0 in get_next_ino namei: make set_root_rcu() return void make simple_positive() public ufs: use dir_pages instead of ufs_dir_pages() pagemap.h: move dir_pages() over there remove the pointless include of lglock.h fs: cleanup slight list_entry abuse xfs: Correctly lock inode when removing suid and file capabilities fs: Call security_ops->inode_killpriv on truncate fs: Provide function telling whether file_remove_privs() will do anything ...
| * | 9p: cope with bogus responses from server in p9_client_{read,write}Al Viro2015-07-041-0/+8
| | | | | | | | | | | | | | | | | | if server claims to have written/read more than we'd told it to, warn and cap the claimed byte count to avoid advancing more than we are ready to.
| * | p9_client_write(): avoid double p9_free_req()Al Viro2015-07-041-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | Braino in "9p: switch p9_client_write() to passing it struct iov_iter *"; if response is impossible to parse and we discard the request, get the out of the loop right there. Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | 9p: forgetting to cancel request on interrupted zero-copy RPCAl Viro2015-07-041-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If we'd already sent a request and decide to abort it, we *must* issue TFLUSH properly and not just blindly reuse the tag, or we'll get seriously screwed when response eventually arrives and we confuse it for response to later request that had reused the same tag. Cc: stable@vger.kernel.org # v3.2 and later Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | dax: bdev_direct_access() may sleepMatthew Wilcox2015-07-041-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The brd driver is the only in-tree driver that may sleep currently. After some discussion on linux-fsdevel, we decided that any driver may choose to sleep in its ->direct_access method. To ensure that all callers of bdev_direct_access() are prepared for this, add a call to might_sleep(). Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | block: Add support for DAX reads/writes to block devicesMatthew Wilcox2015-07-042-2/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a block device supports the ->direct_access methods, bypass the normal DIO path and use DAX to go straight to memcpy() instead of allocating a DIO and a BIO. Includes support for the DIO_SKIP_DIO_COUNT flag in DAX, as is done in do_blockdev_direct_IO(). Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | dax: Use copy_from_iter_nocacheMatthew Wilcox2015-07-041-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | When userspace does a write, there's no need for the written data to pollute the CPU cache. This matches the original XIP code. Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | dax: Add block size note to documentationMatthew Wilcox2015-07-041-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | For block devices which are small enough, mkfs will default to creating a filesystem with block sizes smaller than page size. Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | fs/file.c: __fget() and dup2() atomicity rulesEric Dumazet2015-07-011-2/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | __fget() does lockless fetch of pointer from the descriptor table, attempts to grab a reference and treats "it was already zero" as "it's already gone from the table, we just hadn't seen the store, let's fail". Unfortunately, that breaks the atomicity of dup2() - __fget() might see the old pointer, notice that it's been already dropped and treat that as "it's closed". What we should be getting is either the old file or new one, depending whether we come before or after dup2(). Dmitry had following test failing sometimes : int fd; void *Thread(void *x) { char buf; int n = read(fd, &buf, 1); if (n != 1) exit(printf("read failed: n=%d errno=%d\n", n, errno)); return 0; } int main() { fd = open("/dev/urandom", O_RDONLY); int fd2 = open("/dev/urandom", O_RDONLY); if (fd == -1 || fd2 == -1) exit(printf("open failed\n")); pthread_t th; pthread_create(&th, 0, Thread, 0); if (dup2(fd2, fd) == -1) exit(printf("dup2 failed\n")); pthread_join(th, 0); if (close(fd) == -1) exit(printf("close failed\n")); if (close(fd2) == -1) exit(printf("close failed\n")); printf("DONE\n"); return 0; } Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | fs/file.c: don't acquire files->file_lock in fd_install()Eric Dumazet2015-07-013-19/+55
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Mateusz Guzik reported : Currently obtaining a new file descriptor results in locking fdtable twice - once in order to reserve a slot and second time to fill it. Holding the spinlock in __fd_install() is needed in case a resize is done, or to prevent a resize. Mateusz provided an RFC patch and a micro benchmark : http://people.redhat.com/~mguzik/pipebench.c A resize is an unlikely operation in a process lifetime, as table size is at least doubled at every resize. We can use RCU instead of the spinlock. __fd_install() must wait if a resize is in progress. The resize must block new __fd_install() callers from starting, and wait that ongoing install are finished (synchronize_sched()) resize should be attempted by a single thread to not waste resources. rcu_sched variant is used, as __fd_install() and expand_fdtable() run from process context. It gives us a ~30% speedup using pipebench on a dual Intel(R) Xeon(R) CPU E5-2696 v2 @ 2.50GHz Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Mateusz Guzik <mguzik@redhat.com> Acked-by: Mateusz Guzik <mguzik@redhat.com> Tested-by: Mateusz Guzik <mguzik@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | fs:super:get_anon_bdev: fix race condition could cause dev exceed its upper ↵Wang YanQing2015-07-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | limitation Execution of get_anon_bdev concurrently and preemptive kernel all could bring race condition, it isn't enough to check dev against its upper limitation with equality operator only. This patch fix it. Signed-off-by: Wang YanQing <udknight@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | vfs: avoid creation of inode number 0 in get_next_inoCarlos Maiolino2015-07-011-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | currently, get_next_ino() is able to create inodes with inode number = 0. This have a bad impact in the filesystems relying in this function to generate inode numbers. While there is no problem at all in having inodes with number 0, userspace tools which handle file management tasks can have problems handling these files, like for example, the impossiblity of users to delete these files, since glibc will ignore them. So, I believe the best way is kernel to avoid creating them. This problem has been raised previously, but the old thread didn't have any other update for a year+, and I've seen too many users hitting the same issue regarding the impossibility to delete files while using filesystems relying on this function. So, I'm starting the thread again, with the same patch that I believe is enough to address this problem. Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | namei: make set_root_rcu() return voidAl Viro2015-06-291-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | The only caller that cares about its return value can just as easily pick it from nd->root_seq itself. We used to just calculate it and return to caller, but these days we are storing it in nd->root_seq in all cases. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | make simple_positive() publicAl Viro2015-06-2414-60/+25
| | | | | | | | | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | ufs: use dir_pages instead of ufs_dir_pages()Fabian Frederick2015-06-241-9/+4
| | | | | | | | | | | | | | | | | | | | | | | | dir_pages was declared in a lot of filesystems. Use newly dir_pages() from pagemap.h Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>