linux - linux

	Commit message (Collapse)	Author	Age	Files	Lines
*	dma: tegra: add clk_prepare/clk_unprepare	Prashant Gaikwad	2012-06-27	1	-2/+2
\| \| \| \| \| \| \| \|	Use clk_prepare/clk_unprepare as required by the generic clk framework. Signed-off-by: Prashant Gaikwad <pgaikwad@nvidia.com> Acked-by: Stephen Warren <swarren@wwwdotorg.org> Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
*	dma: tegra: use sg_dma_address() for getting dma buffer address	Laxman Dewangan	2012-06-27	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Use the sg_dma_address() to get the segment buffer address for DMA transfer in place of sg_phys() which returns the physical address of an sg entry. The sg_dma_address() returns the correct buffer memory address for DMA transfer. Signed-off-by: Laxman Dewangan <ldewangan@nvidia.com> Acked-by: Stephen Warren <swarren@wwwdotorg.org> Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
*	dmaengine: mmp_tdma: fix the arch dependency	Vinod Koul	2012-06-23	1	-1/+1
\| \| \| \| \|	Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
*	dw_dmac: introduce dwc_chan_disable	Andy Shevchenko	2012-06-21	1	-18/+14
\| \| \| \| \| \| \| \|	This piece of code is used often. Make it as a separate function. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Viresh Kumar <viresh.linux@gmail.com> Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
*	dw_dmac: move from __init to __devinit	Andy Shevchenko	2012-06-21	1	-3/+3
\| \| \| \| \| \| \| \| \|	We usually have more than one DMA device. Thus, the probe function should serve for all of them in case when the driver is built as a module. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Viresh Kumar <viresh.linux@gmail.com> Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
*	dw_dmac: introduce dwc_fast_fls()	Andy Shevchenko	2012-06-21	1	-28/+18
\| \| \| \| \| \| \| \| \| \|	There were three places where such function is used. We still avoid to use native fls() because in one case it requires to use 64bit version which is suboptimal in our case. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Viresh Kumar <viresh.linux@gmail.com> Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
*	dw_dmac: disable BLOCK interrupts	Andy Shevchenko	2012-06-21	1	-0/+4
\| \| \| \| \| \| \| \|	Just to be sure we are in known state we disable the BLOCK interupts. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Viresh Kumar <viresh.linux@gmail.com> Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
*	dw_dmac: disable dma in optimal way in probe	Andy Shevchenko	2012-06-21	1	-8/+4
\| \| \| \| \| \| \| \| \| \|	The dw_dma_off call needs to have the all_chan_mask calculated. So, done this calculations before the call. Moreover, remove duplicate code that masks the DMA interrupts. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Viresh Kumar <viresh.linux@gmail.com> Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
*	dw_dmac: use __func__ constant in the debug prints	Andy Shevchenko	2012-06-21	1	-14/+13
\| \| \| \| \| \|	Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Viresh Kumar <viresh.linux@gmail.com> Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
*	dw_dmac: print correct number of scanned descriptors	Andy Shevchenko	2012-06-21	1	-1/+1
\| \| \| \| \| \| \| \| \|	In case the first descriptor we found is available, the counter still remains 0 value which is wrong. This patch fixes the counter behaviour. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Viresh Kumar <viresh.linux@gmail.com> Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
*	dw_dmac: introduce dwc_dump_chan_regs to dump registers	Andy Shevchenko	2012-06-21	1	-21/+16
\| \| \| \| \| \| \| \| \|	There is three places where values of the most significant registers were printed. Make such piece of code as separate function. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Viresh Kumar <viresh.linux@gmail.com> Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
*	dw_dmac: use proper casting to print dma_addr_t values	Andy Shevchenko	2012-06-21	1	-8/+13
\| \| \| \| \| \| \| \| \|	dma_addr_t is sometimes 32 bit and sometimes 64. We normally cast them to unsigned long long for printk(). Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Viresh Kumar <viresh.linux@gmail.com> Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
*	dw_dmac: fix constant in the comment	Andy Shevchenko	2012-06-21	1	-1/+1
\| \| \| \| \| \|	Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Viresh Kumar <viresh.linux@gmail.com> Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
*	dmaengine: mmp_tdma: add mmp tdma support	Zhangfei Gao	2012-06-20	3	-0/+621
\| \| \| \| \| \| \| \| \| \|	Add support for two-channel dma under dmaengine support: mmp-adma and pxa910-squ Signed-off-by: Zhangfei Gao <zhangfei.gao@marvell.com> Signed-off-by: Leo Yan <leoy@marvell.com> Signed-off-by: Qiao Zhou <zhouqiao@marvell.com> Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
*	dmaengine: Add wrapper for device_tx_status callback	Lars-Peter Clausen	2012-06-20	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds a small inline wrapper for the devivce_tx_status callback of a dma device. This makes the source code of users of this function a bit more compact and a bit more legible. E.g.: -status = chan->device->device_tx_status(chan, cookie, &state) +status = dmaengine_tx_status(chan, cookie, &state) Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
*	Merge branch 'fixes' into next	Vinod Koul	2012-06-14	1	-16/+10
\|\
\| *	DMA: PL330: Fix racy mutex unlock	Javi Merino	2012-06-14	1	-16/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	pl330_update() stores a pointer to the thrd->req that finished, which contains a pointer to the corresponding pl330_req. This is done with the pl330_lock held. Then, it iterates through the req_done list, calling the callback for each of the requests that are done. The problem is that the driver releases the lock before calling the callback for each of the callbacks. pl330_submit_req() running in another processor can then acquire the lock and insert another request in one of the thrd->req that hasn't been processed yet, replacing the pointer to pl330_req there. When the callback returns in pl330_update() and the next rqdone is popped from the list, it dereferences the pl330_req pointer to the just scheduled pl330_req, instead of the one that has finished, calling pl330 with the wrong r. This patch fixes this by storing the pointer to pl330_req directly in the list. Signed-off-by: Javi Merino <javi.merino@arm.com> Cc: Jassi Brar <jaswinder.singh@linaro.org> Acked-by: Jassi Brar <jaswinder.singh@linaro.org> Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
* \|	dma: coh901318: use devm allocation	Linus Walleij	2012-06-14	1	-48/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Allocate memory, region, remap and irq for device state using devm_* helpers to simplify memory accounting. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
* \|	dmaengine: at_hdmac: trivial: fix comment in header	Nicolas Ferre	2012-06-12	1	-4/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Not all Atmel SoCs were pointed out in header comment which was bringing confusion. Remove the truncated list of supported devices, replace by the only one that is not supported. Reported-by: Elen Song <elen.song@atmel.com> Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com> Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
* \|	dma: tegra: add dmaengine based dma driver	Laxman Dewangan	2012-06-08	3	-0/+1425
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add dmaengine based NVIDIA's Tegra APB DMA driver. This driver support the slave mode of data transfer from peripheral to memory and vice versa. The driver supports for the cyclic and non-cyclic mode of data transfer. Signed-off-by: Laxman Dewangan <ldewangan@nvidia.com> Acked-by: Stephen Warren <swarren@wwwdotorg.org> Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
* \|	dma: dmaengine: add slave req id in slave_config	Laxman Dewangan	2012-06-08	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The DMA controller like Nvidia's Tegra Dma controller supports the different slave requestor id from different slave. This need to be configure in dma controller to handle the request properly. Adding the slave-id in the slave configuration so that information can be passed from client when configuring for slave. Signed-off-by: Laxman Dewangan <ldewangan@nvidia.com> Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
* \|	dma: enable mxs-dma for imx6q	Huang Shijie	2012-06-07	2	-2/+1
\|/ \| \| \| \| \| \| \| \|	enable the mxs-dma for imx6q. Also remove the unused header file. Signed-off-by: Huang Shijie <shijie8@gmail.com> Acked-by: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
*	DMA: PL330: Add missing static storage class specifier	Sachin Kamat	2012-06-07	1	-1/+1
\| \| \| \| \| \| \| \|	Fixes the following sparse warning: drivers/dma/pl330.c:2542:5: warning: symbol 'add_desc' was not declared. Should it be static? Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org> Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
*	dma: imx-sdma: buf_tail should be initialize in prepare function	Richard Zhao	2012-06-07	1	-2/+4
\| \| \| \| \| \| \| \| \| \| \|	This fix audio underrun issue. When SNDRV_PCM_TRIGGER_STOP and SNDRV_PCM_TRIGGER_START, it calls prepare again. buf_tail should be reset to zero. So move buf_tail initialization into prepare function. Signed-off-by: Richard Zhao <richard.zhao@freescale.com> Acked-by: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
*	dmaengine: pl330: dont complete descriptor for cyclic dma	Tushar Behera	2012-06-07	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	Commit eab215855803 ("dmaengine: pl330: dont complete descriptor for cyclic dma") wrongly completes descriptor for cyclic dma, hence following BUG_ON is still hit with cyclic DMA operations. kernel BUG at drivers/dma/dmaengine.h:53! Signed-off-by: Tushar Behera <tushar.behera@linaro.org> Acked-by: Jassi Brar <jaswinder.singh@linaro.org> Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com> Cc: stable <stable@vger.kernel.org>
*	Linux 3.5-rc1v3.5-rc1	Linus Torvalds	2012-06-03	1	-2/+2
\|
*	Merge tag 'dm-3.5-changes-1' of ↵	Linus Torvalds	2012-06-03	6	-90/+322
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm Pull device-mapper updates from Alasdair G Kergon: "Improve multipath's retrying mechanism in some defined circumstances and provide a simple reserve/release mechanism for userspace tools to access thin provisioning metadata while the pool is in use." * tag 'dm-3.5-changes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm: dm thin: provide userspace access to pool metadata dm thin: use slab mempools dm mpath: allow ioctls to trigger pg init dm mpath: delay retry of bypassed pg dm mpath: reduce size of struct multipath
\| *	dm thin: provide userspace access to pool metadata	Joe Thornber	2012-06-03	5	-11/+193
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch implements two new messages that can be sent to the thin pool target allowing it to take a snapshot of the _metadata_. This, read-only snapshot can be accessed by userland, concurrently with the live target. Only one metadata snapshot can be held at a time. The pool's status line will give the block location for the current msnap. Since version 0.1.5 of the userland thin provisioning tools, the thin_dump program displays the msnap as follows: thin_dump -m <msnap root> <metadata dev> Available here: https://github.com/jthornber/thin-provisioning-tools Now that userland can access the metadata we can do various things that have traditionally been kernel side tasks: i) Incremental backups. By using metadata snapshots we can work out what blocks have changed over time. Combined with data snapshots we can ensure the data doesn't change while we back it up. A short proof of concept script can be found here: https://github.com/jthornber/thinp-test-suite/blob/master/incremental_backup_example.rb ii) Migration of thin devices from one pool to another. iii) Merging snapshots back into an external origin. iv) Asyncronous replication. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
\| *	dm thin: use slab mempools	Mike Snitzer	2012-06-03	1	-62/+99
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use dedicated caches prefixed with a "dm_" name rather than relying on kmalloc mempools backed by generic slab caches so the memory usage of thin provisioning (and any leaks) can be accounted for independently. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
\| *	dm mpath: allow ioctls to trigger pg init	Mikulas Patocka	2012-06-03	1	-9/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	After the failure of a group of paths, any alternative paths that need initialising do not become available until further I/O is sent to the device. Until this has happened, ioctls return -EAGAIN. With this patch, new paths are made available in response to an ioctl too. The processing of the ioctl gets delayed until this has happened. Instead of returning an error, we submit a work item to kmultipathd (that will potentially activate the new path) and retry in ten milliseconds. Note that the patch doesn't retry an ioctl if the ioctl itself fails due to a path failure. Such retries should be handled intelligently by the code that generated the ioctl in the first place, noting that some SCSI commands should not be retried because they are not idempotent (XOR write commands). For commands that could be retried, there is a danger that if the device rejected the SCSI command, the path could be errorneously marked as failed, and the request would be retried on another path which might fail too. It can be determined if the failure happens on the device or on the SCSI controller, but there is no guarantee that all SCSI drivers set these flags correctly. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
\| *	dm mpath: delay retry of bypassed pg	Mike Christie	2012-06-03	1	-2/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If I/O needs retrying and only bypassed priority groups are available, set the pg_init_delay_retry flag to wait before retrying. If, for example, the reason for the bypass is that the controller is getting reset or there is a firmware upgrade happening, retrying right away would cause a flood of log messages and retries for what could be a few seconds or even several minutes. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Acked-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
\| *	dm mpath: reduce size of struct multipath	Mike Snitzer	2012-06-03	1	-6/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Move multipath structure's 'lock' and 'queue_size' members to eliminate two 4-byte holes. Also use a bit within a single unsigned int for each existing flag (saves 8-bytes). This allows future flags to be added without each consuming an unsigned int. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Acked-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
* \|	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	Linus Torvalds	2012-06-03	21	-84/+201
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Pull networking updates from David Miller: 1) Make syn floods consume significantly less resources by a) Not pre-COW'ing routing metrics for SYN/ACKs b) Mirroring the device queue mapping of the SYN for the SYN/ACK reply. Both from Eric Dumazet. 2) Fix calculation errors in Byte Queue Limiting, from Hiroaki SHIMODA. 3) Validate the length requested when building a paged SKB for a socket, so we don't overrun the page vector accidently. From Jason Wang. 4) When netlabel is disabled, we abort all IP option processing when we see a CIPSO option. This isn't the right thing to do, we should simply skip over it and continue processing the remaining options (if any). Fix from Paul Moore. 5) SRIOV fixes for the mellanox driver from Jack orgenstein and Marcel Apfelbaum. 6) 8139cp enables the receiver before the ring address is properly programmed, which potentially lets the device crap over random memory. Fix from Jason Wang. 7) e1000/e1000e fixes for i217 RST handling, and an improper buffer address reference in jumbo RX frame processing from Bruce Allan and Sebastian Andrzej Siewior, respectively. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: fec_mpc52xx: fix timestamp filtering mcs7830: Implement link state detection e1000e: fix Rapid Start Technology support for i217 e1000: look into the page instead of skb->data for e1000_tbi_adjust_stats() r8169: call netif_napi_del at errpaths and at driver unload tcp: reflect SYN queue_mapping into SYNACK packets tcp: do not create inetpeer on SYNACK message 8139cp/8139too: terminate the eeprom access with the right opmode 8139cp: set ring address before enabling receiver cipso: handle CIPSO options correctly when NetLabel is disabled net: sock: validate data_len before allocating skb in sock_alloc_send_pskb() bql: Avoid possible inconsistent calculation. bql: Avoid unneeded limit decrement. bql: Fix POSDIFF() to integer overflow aware. net/mlx4_core: Fix obscure mlx4_cmd_box parameter in QUERY_DEV_CAP net/mlx4_core: Check port out-of-range before using in mlx4_slave_cap net/mlx4_core: Fixes for VF / Guest startup flow net/mlx4_en: Fix improper use of "port" parameter in mlx4_en_event net/mlx4_core: Fix number of EQs used in ICM initialisation net/mlx4_core: Fix the slave_id out-of-range test in mlx4_eq_int
\| * \|	fec_mpc52xx: fix timestamp filtering	Stephan Gatzka	2012-06-02	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	skb_defer_rx_timestamp was called with a freshly allocated skb but must be called with rskb instead. Signed-off-by: Stephan Gatzka <stephan@gatzka.org> Cc: stable <stable@vger.kernel.org> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	mcs7830: Implement link state detection	Ondrej Zary	2012-06-02	1	-2/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add .status callback that detects link state changes. Tested with MCS7832CV-AA chip (9710:7830, identified as rev.C by the driver). Fixes https://bugzilla.kernel.org/show_bug.cgi?id=28532 Signed-off-by: Ondrej Zary <linux@rainbow-software.org> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	e1000e: fix Rapid Start Technology support for i217	Bruce Allan	2012-06-02	1	-9/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The definition of I217_PROXY_CTRL must use the BM_PHY_REG() macro instead of the PHY_REG() macro for PHY page 800 register 70 since it is for a PHY register greater than the maximum allowed by the latter macro, and fix a typo setting the I217_MEMPWR register in e1000_suspend_workarounds_ich8lan. Also for clarity, rename a few defines as bit definitions instead of masks. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
\| * \|	e1000: look into the page instead of skb->data for e1000_tbi_adjust_stats()	Sebastian Andrzej Siewior	2012-06-02	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is another fixup where the data is not transfered into buffer addressed by skb->data but into a page. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
\| * \|	r8169: call netif_napi_del at errpaths and at driver unload	Devendra Naga	2012-06-02	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	when register_netdev fails, the init'ed NAPIs by netif_napi_add must be deleted with netif_napi_del, and also when driver unloads, it should delete the NAPI before unregistering netdevice using unregister_netdev. Signed-off-by: Devendra Naga <devendra.aaru@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	tcp: reflect SYN queue_mapping into SYNACK packets	Eric Dumazet	2012-06-01	2	-6/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	While testing how linux behaves on SYNFLOOD attack on multiqueue device (ixgbe), I found that SYNACK messages were dropped at Qdisc level because we send them all on a single queue. Obvious choice is to reflect incoming SYN packet @queue_mapping to SYNACK packet. Under stress, my machine could only send 25.000 SYNACK per second (for 200.000 incoming SYN per second). NIC : ixgbe with 16 rx/tx queues. After patch, not a single SYNACK is dropped. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Hans Schillstrom <hans.schillstrom@ericsson.com> Cc: Jesper Dangaard Brouer <brouer@redhat.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	tcp: do not create inetpeer on SYNACK message	Eric Dumazet	2012-06-01	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Another problem on SYNFLOOD/DDOS attack is the inetpeer cache getting larger and larger, using lots of memory and cpu time. tcp_v4_send_synack() ->inet_csk_route_req() ->ip_route_output_flow() ->rt_set_nexthop() ->rt_init_metrics() ->inet_getpeer( create = true) This is a side effect of commit a4daad6b09230 (net: Pre-COW metrics for TCP) added in 2.6.39 Possible solution : Instruct inet_csk_route_req() to remove FLOWI_FLAG_PRECOW_METRICS Before patch : # grep peer /proc/slabinfo inet_peer_cache 4175430 4175430 192 42 2 : tunables 0 0 0 : slabdata 99415 99415 0 Samples: 41K of event 'cycles', Event count (approx.): 30716565122 + 20,24% ksoftirqd/0 [kernel.kallsyms] [k] inet_getpeer + 8,19% ksoftirqd/0 [kernel.kallsyms] [k] peer_avl_rebalance.isra.1 + 4,81% ksoftirqd/0 [kernel.kallsyms] [k] sha_transform + 3,64% ksoftirqd/0 [kernel.kallsyms] [k] fib_table_lookup + 2,36% ksoftirqd/0 [ixgbe] [k] ixgbe_poll + 2,16% ksoftirqd/0 [kernel.kallsyms] [k] __ip_route_output_key + 2,11% ksoftirqd/0 [kernel.kallsyms] [k] kernel_map_pages + 2,11% ksoftirqd/0 [kernel.kallsyms] [k] ip_route_input_common + 2,01% ksoftirqd/0 [kernel.kallsyms] [k] __inet_lookup_established + 1,83% ksoftirqd/0 [kernel.kallsyms] [k] md5_transform + 1,75% ksoftirqd/0 [kernel.kallsyms] [k] check_leaf.isra.9 + 1,49% ksoftirqd/0 [kernel.kallsyms] [k] ipt_do_table + 1,46% ksoftirqd/0 [kernel.kallsyms] [k] hrtimer_interrupt + 1,45% ksoftirqd/0 [kernel.kallsyms] [k] kmem_cache_alloc + 1,29% ksoftirqd/0 [kernel.kallsyms] [k] inet_csk_search_req + 1,29% ksoftirqd/0 [kernel.kallsyms] [k] __netif_receive_skb + 1,16% ksoftirqd/0 [kernel.kallsyms] [k] copy_user_generic_string + 1,15% ksoftirqd/0 [kernel.kallsyms] [k] kmem_cache_free + 1,02% ksoftirqd/0 [kernel.kallsyms] [k] tcp_make_synack + 0,93% ksoftirqd/0 [kernel.kallsyms] [k] _raw_spin_lock_bh + 0,87% ksoftirqd/0 [kernel.kallsyms] [k] __call_rcu + 0,84% ksoftirqd/0 [kernel.kallsyms] [k] rt_garbage_collect + 0,84% ksoftirqd/0 [kernel.kallsyms] [k] fib_rules_lookup Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Hans Schillstrom <hans.schillstrom@ericsson.com> Cc: Jesper Dangaard Brouer <brouer@redhat.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	8139cp/8139too: terminate the eeprom access with the right opmode	Jason Wang	2012-06-01	2	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, we terminate the eeprom access through clearing the CS by: RTL_W8 (Cfg9346, ~EE_CS); or writeb (~EE_CS, ee_addr); This would left the eeprom into "Config. Register Write Enable:" state which is not expcted as the highest two bits were set to 0x11 ( expected is the "Normal" mode (0x00)). Solving this by write 0x0 instead of ~EE_CS when terminating the eeprom access. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	8139cp: set ring address before enabling receiver	Jason Wang	2012-06-01	1	-11/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, we enable the receiver before setting the ring address which could lead the card DMA into unexpected areas. Solving this by set the ring address before enabling the receiver. btw. I find and test this in qemu as I didn't have a 8139cp card in hand. please review it carefully. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	cipso: handle CIPSO options correctly when NetLabel is disabled	Paul Moore	2012-06-01	1	-1/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When NetLabel is not enabled, e.g. CONFIG_NETLABEL=n, and the system receives a CIPSO tagged packet it is dropped (cipso_v4_validate() returns non-zero). In most cases this is the correct and desired behavior, however, in the case where we are simply forwarding the traffic, e.g. acting as a network bridge, this becomes a problem. This patch fixes the forwarding problem by providing the basic CIPSO validation code directly in ip_options_compile() without the need for the NetLabel or CIPSO code. The new validation code can not perform any of the CIPSO option label/value verification that cipso_v4_validate() does, but it can verify the basic CIPSO option format. The behavior when NetLabel is enabled is unchanged. Signed-off-by: Paul Moore <pmoore@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	net: sock: validate data_len before allocating skb in sock_alloc_send_pskb()	Jason Wang	2012-06-01	1	-2/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We need to validate the number of pages consumed by data_len, otherwise frags array could be overflowed by userspace. So this patch validate data_len and return -EMSGSIZE when data_len may occupies more frags than MAX_SKB_FRAGS. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	bql: Avoid possible inconsistent calculation.	Hiroaki SHIMODA	2012-06-01	1	-5/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	dql->num_queued could change while processing dql_completed(). To provide consistent calculation, added an on stack variable. Signed-off-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com> Cc: Tom Herbert <therbert@google.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Denys Fedoryshchenko <denys@visp.net.lb> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	bql: Avoid unneeded limit decrement.	Hiroaki SHIMODA	2012-06-01	1	-2/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When below pattern is observed, TIME dql_queued() dql_completed() \| a) initial state \| \| b) X bytes queued V c) Y bytes queued d) X bytes completed e) Z bytes queued f) Y bytes completed a) dql->limit has already some value and there is no in-flight packet. b) X bytes queued. c) Y bytes queued and excess limit. d) X bytes completed and dql->prev_ovlimit is set and also dql->prev_num_queued is set Y. e) Z bytes queued. f) Y bytes completed. inprogress and prev_inprogress are true. At f), according to the comment, all_prev_completed becomes true and limit should be increased. But POSDIFF() ignores (completed == dql->prev_num_queued) case, so limit is decreased. Signed-off-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com> Cc: Tom Herbert <therbert@google.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Denys Fedoryshchenko <denys@visp.net.lb> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	bql: Fix POSDIFF() to integer overflow aware.	Hiroaki SHIMODA	2012-06-01	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	POSDIFF() fails to take into account integer overflow case. Signed-off-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com> Cc: Tom Herbert <therbert@google.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Denys Fedoryshchenko <denys@visp.net.lb> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	net/mlx4_core: Fix obscure mlx4_cmd_box parameter in QUERY_DEV_CAP	Jack Morgenstein	2012-06-01	1	-3/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The "!mlx4_is_slave" is totally confusing. Fix with constant MLX4_CMD_NATIVE, which is the intended behavior. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	net/mlx4_core: Check port out-of-range before using in mlx4_slave_cap	Jack Morgenstein	2012-06-01	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The range check was performed after using the port number. Reverse this to prevent a potential array overflow. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	net/mlx4_core: Fixes for VF / Guest startup flow	Jack Morgenstein	2012-06-01	4	-14/+63
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- pass the following parameters: - firmware version (added QUERY_FW paravirtualization for that) - disable Blueflame on slaves. KVM disables write combining on guests, and we get better performance without BF in this case. (This requires QUERY_DEV_CAP paravirtualization, also in this commit) - max qp rdma as destination - get rid of a chunk of "if (0)" dead code Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>