linux - linux

	Commit message (Collapse)	Author	Age	Files	Lines
*	Merge branch 'topic/stratus' into next	Bjorn Helgaas	2012-05-07	1	-0/+3
\|\
\| *	PCI: work around Stratus ftServer broken PCIe hierarchy	Bjorn Helgaas	2012-04-30	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A PCIe downstream port is a P2P bridge. Its secondary interface is a link that should lead only to device 0 (unless ARI is enabled)[1], so we don't probe for non-zero device numbers. Some Stratus ftServer systems have a PCIe downstream port (02:00.0) that leads to both an upstream port (03:00.0) and a downstream port (03:01.0), and 03:01.0 has important devices below it: [0000:02]-+-00.0-[03-3c]--+-00.0-[04-09]--... \-01.0-[0a-0d]--+-[USB] +-[NIC] +-... Previously, we didn't enumerate device 03:01.0, so USB and the network didn't work. This patch adds a DMI quirk to scan all device numbers, not just 0, below a downstream port. Based on a patch by Prarit Bhargava. [1] PCIe spec r3.0, sec 7.3.1 CC: Myron Stowe <mstowe@redhat.com> CC: Don Dutile <ddutile@redhat.com> CC: James Paradis <james.paradis@stratus.com> CC: Matthew Wilcox <matthew.r.wilcox@intel.com> CC: Jesse Barnes <jbarnes@virtuousgeek.org> CC: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
* \|	PCI: move mutex locking out of pci_dev_reset function	Konrad Rzeszutek Wilk	2012-05-01	1	-10/+17
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The intent of git commit 6fbf9e7a90862988c278462d85ce9684605a52b2 "PCI: Introduce __pci_reset_function_locked to be used when holding device_lock." was to have a non-locking function that would call pci_dev_reset function. But it fell short of that by just probing and not actually reseting the device. To make that work we need a way to move the lock around device_lock to not be in pci_dev_reset (as the caller of __pci_reset_function_locked already holds said lock). We do this by renaming pci_dev_reset to __pci_dev_reset and bubbling said mutex out of __pci_dev_reset to pci_dev_reset (a wrapper around __pci_dev_reset). The __pci_reset_function_locked can now call __pci_dev_reset without having to worry about the dead-lock. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
*	PCI: Retry BARs restoration for Type 0 headers only	Rafael J. Wysocki	2012-04-17	1	-9/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some shortcomings introduced into pci_restore_state() by commit 26f41062f28d ("PCI: check for pci bar restore completion and retry") have been fixed by recent commit ebfc5b802fa76 ("PCI: Fix regression in pci_restore_state(), v3"), but that commit treats all PCI devices as those with Type 0 configuration headers. That is not entirely correct, because Type 1 and Type 2 headers have different layouts. In particular, the area occupied by BARs in Type 0 config headers contains the secondary status register in Type 1 ones and it doesn't make sense to retry the restoration of that register even if the value read back from it after a write is not the same as the written one (it very well may be different). For this reason, make pci_restore_state() only retry the restoration of BARs for Type 0 config headers. This effectively makes it behave as before commit 26f41062f28d for all header types except for Type 0. Tested-by: Mikko Vinni <mmvinni@yahoo.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	PCI: Fix regression in pci_restore_state(), v3	Rafael J. Wysocki	2012-04-15	1	-18/+39
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 26f41062f28d ("PCI: check for pci bar restore completion and retry") attempted to address problems with PCI BAR restoration on systems where FLR had not been completed before pci_restore_state() was called, but it did that in an utterly wrong way. First off, instead of retrying the writes for the BAR registers only, it did that for all of the PCI config space of the device, including the status register (whose value after the write quite obviously need not be the same as the written one). Second, it added arbitrary delay to pci_restore_state() even for systems where the PCI config space restoration was successful at first attempt. Finally, the mdelay(10) it added to every iteration of the writing loop was way too much of a delay for any reasonable device. All of this actually caused resume failures for some devices on Mikko's system. To fix the regression, make pci_restore_state() only retry the writes for BAR registers and only wait if the first read from the register doesn't return the written value. Additionaly, make it wait for 1 ms, instead of 10 ms, after every failing attempt to write into config space. Reported-by: Mikko Vinni <mmvinni@yahoo.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	PCI / PCIe: Introduce command line option to disable ARI	Rafael J. Wysocki	2012-03-01	1	-1/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There are PCIe devices on the market that report ARI support but then fail to initialize correctly when ARI is actually used. This leads to situations in which kernels 2.6.34 and newer fail to handle systems where the previous kernels worked without any apparent problems. Unfortunately, it is currently unknown how many such devices are there. For this reason, introduce a new kernel command line option, pci=noari, allowing users to disable PCIe ARI altogether if they see problems with PCIe device initialization. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
*	PCI: Move "pci reassigndev resource alignment" out of quirks.c	Yinghai Lu	2012-02-24	1	-0/+62
\| \| \| \| \| \| \| \|	This isn't really a quirk; calling it directly from pci_add_device makes more sense. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
*	PCI: prepare pci=realloc for multiple options	Yinghai Lu	2012-02-24	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Let the user could enable and disable with pci=realloc=on or pci=realloc=off Also 1. move variable and functions near the place they are used. 2. change macro to function 3. change related functions and variable to static and _init 4. update parameter description accordingly. This will let us add a config option to control default behavior, and still allow the user to turn off automatic reallocation if it fails on their platform until a permanent solution is found. -v2: still honor pci=realloc, and treat it as pci=realloc=on also use enum instead of ... -v3: update kernel-paramenters.txt according to Jesse. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
*	PCI: move pci_find_saved_cap out of linux/pci.h	Yinghai Lu	2012-02-23	1	-0/+19
\| \| \| \| \| \| \| \| \|	Only one user in driver/pci/pci.c, so we don't need to put it in global pci.h Reviewed-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
*	PCI: fix memleak for pci dev removing during hotplug	Yinghai Lu	2012-02-23	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	unreferenced object 0xffff880276d17700 (size 64): comm "swapper/0", pid 1, jiffies 4294897182 (age 3976.028s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 18 f9 de 76 02 88 ff ff ...........v.... 10 00 00 00 0e 00 00 00 0f 28 40 00 00 00 00 00 .........(@..... backtrace: [<ffffffff81c8aede>] kmemleak_alloc+0x26/0x43 [<ffffffff811385f0>] __kmalloc+0x121/0x183 [<ffffffff813cf821>] pci_add_cap_save_buffer+0x35/0x7c [<ffffffff813d12b7>] pci_allocate_cap_save_buffers+0x1d/0x65 [<ffffffff813cdb52>] pci_device_add+0x92/0xf1 [<ffffffff81c8afe6>] pci_scan_single_device+0x9f/0xa1 [<ffffffff813cdbd2>] pci_scan_slot.part.20+0x21/0x106 [<ffffffff813cdce2>] pci_scan_slot+0x2b/0x35 [<ffffffff81c8dae4>] __pci_scan_child_bus+0x51/0x107 [<ffffffff81c8d75b>] pci_scan_bridge+0x376/0x6ae [<ffffffff81c8db60>] __pci_scan_child_bus+0xcd/0x107 [<ffffffff81c8dbab>] pci_scan_child_bus+0x11/0x2a [<ffffffff81cca58c>] pci_acpi_scan_root+0x18b/0x21c [<ffffffff81c916be>] acpi_pci_root_add+0x1e1/0x42a [<ffffffff81406210>] acpi_device_probe+0x50/0x190 [<ffffffff814a0227>] really_probe+0x99/0x126 Need to free saved_buffer for capabilities. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
*	PCI: check for pci bar restore completion and retry	Kay, Allen M	2012-02-14	1	-1/+6
\| \| \| \| \| \| \| \| \| \| \| \|	On some OEM systems, pci_restore_state() is called while FLR has not yet completed. As a result, PCI BAR register restore is not successful. This fix reads back the restored value and compares it with saved value and re-tries 10 times before giving up. Signed-off-by: Jean Guyader <jean.guyader@eu.citrix.com> Signed-off-by: Eric Chanudet <eric.chanudet@citrix.com> Signed-off-by: Allen Kay <allen.m.kay@intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
*	PCI: Introduce __pci_reset_function_locked to be used when holding device_lock.	Konrad Rzeszutek Wilk	2012-02-14	1	-0/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The use case of this is when a driver wants to call FLR when a device is attached to it using the SysFS "bind" or "unbind" functionality. The call chain when a user does "bind" looks as so: echo "0000:01.07.0" > /sys/bus/pci/drivers/XXXX/bind and ends up calling: driver_bind: device_lock(dev); <=== TAKES LOCK XXXX_probe: .. pci_enable_device() ...__pci_reset_function(), which calls pci_dev_reset(dev, 0): if (!0) { device_lock(dev) <==== DEADLOCK The __pci_reset_function_locked function allows the the drivers 'probe' function to call the "pci_reset_function" while still holding the driver mutex lock. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
*	kernel-doc: fix new warnings in pci	Randy Dunlap	2012-01-23	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix new kernel-doc warnings: Warning(drivers/pci/pci.c:2811): No description found for parameter 'dev' Warning(drivers/pci/pci.c:2811): Excess function parameter 'pdev' description in 'pci_intx_mask_supported' Warning(drivers/pci/pci.c:2894): No description found for parameter 'dev' Warning(drivers/pci/pci.c:2894): Excess function parameter 'pdev' description in 'pci_check_and_mask_intx' Warning(drivers/pci/pci.c:2908): No description found for parameter 'dev' Warning(drivers/pci/pci.c:2908): Excess function parameter 'pdev' description in 'pci_check_and_unmask_intx' Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	PCI: Enable ATS at the device state restore	Hao, Xudong	2012-01-06	1	-0/+1
\| \| \| \| \| \| \| \| \| \|	During S3 or S4 resume or PCI reset, ATS regs aren't restored correctly. This patch enables ATS at the device state restore if PCI device has ATS capability. Signed-off-by: Xudong Hao <xudong.hao@intel.com> Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
*	PCI/PM/Runtime: make PCI traces quieter	Vincent Palatin	2012-01-06	1	-3/+2
\| \| \| \| \| \| \| \| \| \| \|	When the runtime PM is activated on PCI, if a device switches state frequently (e.g. an EHCI controller with autosuspending USB devices connected) the PCI configuration traces might be very verbose in the kernel log. Let's guard those traces with DEBUG condition. Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl> Signed-off-by: Vincent Palatin <vpalatin@chromium.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
*	PCI: latency timer doesn't apply to PCIe	Myron Stowe	2012-01-06	1	-0/+4
\| \| \| \| \| \| \| \| \| \|	The latency timer is read-only and hardwired to zero for all PCIe devices, both Type 0 and Type 1, so don't bother trying to update it and cluttering the dmesg log with meaningless "setting latency timer to 64" messages. Signed-off-by: Myron Stowe <myron.stowe@redhat.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
*	PCI: Pull PCI 'latency timer' setup up into the core	Myron Stowe	2012-01-06	1	-0/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The 'latency timer' of PCI devices, both Type 0 and Type 1, is setup in architecture-specific code [see: 'pcibios_set_master()']. There are two approaches being taken by all the architectures - check if the 'latency timer' is currently set between 16 and 255 and if not bring it within bounds, or, do nothing (and then there is the gratuitously different PA-RISC implementation). There is nothing architecture-specific about PCI's 'latency timer' so this patch pulls its setup functionality up into the PCI core by creating a generic 'pcibios_set_master()' function using the '__weak' attribute which can be used by all architectures as a default which, if necessary, can then be over-ridden by architecture-specific code. No functional change. Signed-off-by: Myron Stowe <myron.stowe@redhat.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
*	PCI: Introduce INTx check & mask API	Jan Kiszka	2012-01-06	1	-0/+110
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	These new PCI services allow to probe for 2.3-compliant INTx masking support and then use the feature from PCI interrupt handlers. The services are properly synchronized with concurrent config space access via sysfs or on device reset. This enables generic PCI device drivers like uio_pci_generic or KVM's device assignment to implement the necessary kernel-side IRQ handling without any knowledge about device-specific interrupt status and control registers. Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
*	PCI: Rework config space blocking services	Jan Kiszka	2012-01-06	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	pci_block_user_cfg_access was designed for the use case that a single context, the IPR driver, temporarily delays user space accesses to the config space via sysfs. This assumption became invalid by the time pci_dev_reset was added as locking instance. Today, if you run two loops in parallel that reset the same device via sysfs, you end up with a kernel BUG as pci_block_user_cfg_access detect the broken assumption. This reworks the pci_block_user_cfg_access to a sleeping service pci_cfg_access_lock and an atomic-compatible variant called pci_cfg_access_trylock. The former not only blocks user space access as before but also waits if access was already locked. The latter service just returns false in this case, allowing the caller to resolve the conflict instead of raising a BUG. Adaptions of the ipr driver were originally written by Brian King. Acked-by: Brian King <brking@linux.vnet.ibm.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
*	pci: Fix hotplug of Express Module with pci bridges	Yinghai Lu	2011-12-18	1	-1/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I noticed that hotplug of one setup does not work with recent change in pci tree. After checking the bridge conf setup, I noticed that the bridges get assigned but do not get enabled. The reason is the following commit, while simply ignores bridge resources when enabling a pci device: \| commit bbef98ab0f019f1b0c25c1acdf1683c68933d41b \| Author: Ram Pai <linuxram@us.ibm.com> \| Date: Sun Nov 6 10:33:10 2011 +0800 \| \| PCI: defer enablement of SRIOV BARS \|... \| NOTE: Note, there is subtle change in the pci_enable_device() API. Any \| driver that depends on SRIOV BARS to be enabled in pci_enable_device() \| can fail. Put back bridge resource and ROM resource checking to fix the problem. That should fix regression like BIOS does not assign correct resource to bridge. Discussion can be found at: http://www.spinics.net/lists/linux-pci/msg12874.html Signed-off-by: Yinghai Lu <yinghai@kernel.org> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	PCI: Set device power state to PCI_D0 for device without native PM support	Ajaykumar Hotchandani	2011-12-14	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	During test of one IB card with guest VM, found that, msi is not initialized properly. It turns out __write_msi_msg will do nothing if device current_state is not PCI_D0. And, that pci device does not have pm_cap in guest VM. There is an error in setting of power state to PCI_D0 in pci_enable_device(), but error is not returned for this. Following is code flow: pci_enable_device() --> __pci_enable_device_flags() --> do_pci_enable_device() --> pci_set_power_state() --> __pci_start_power_transition() We have following condition inside __pci_start_power_transition(): if (platform_pci_power_manageable(dev)) { error = platform_pci_set_power_state(dev, state); if (!error) pci_update_current_state(dev, state); } else { error = -ENODEV; /* Fall back to PCI_D0 if native PM is not supported */ if (!dev->pm_cap) dev->current_state = PCI_D0; } Here, from platform_pci_set_power_state(), acpi_pci_set_power_state() is getting called and that is failing with ENODEV because of following condition: if (!handle \|\| ACPI_SUCCESS(acpi_get_handle(handle, "_EJ0",&tmp))) return -ENODEV; Because of that, pci_update_current_state() is not getting called. With this patch, if device power state can not be set via platform_pci_set_power_state and that device does not have native pm support, then PCI device power state will be set to PCI_D0. -v2: This also reverts 47e9037ac16637cd7f12b8790ea7ce6680e42168, as it's not needed after this change. Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl> Signed-off-by: Ajaykumar Hotchandani<ajaykumar.hotchandani@oracle.com> Signed-off-by: Yinghai Lu<yinghai.lu@oracle.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
*	PCI: defer enablement of SRIOV BARS	Ram Pai	2011-12-05	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	All the PCI BARs of a device are enabled when the device is enabled using pci_enable_device(). This unnecessarily enables SRIOV BARs of the device. On some platforms, which do not support SRIOV as yet, the pci_enable_device() fails to enable the device if its SRIOV BARs are not allocated resources correctly. The following patch fixes the above problem. The SRIOV BARs are now enabled when IOV capability of the device is enabled in sriov_enable(). NOTE: Note, there is subtle change in the pci_enable_device() API. Any driver that depends on SRIOV BARS to be enabled in pci_enable_device() can fail. The patch has been touch tested on power and x86 platform. Tested-by: Michael Wang <wangyun@linux.vnet.ibm.com> Signed-off-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
*	pci: Clamp pcie_set_readrq() when using "performance" settings	Benjamin Herrenschmidt	2011-10-27	1	-2/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When configuring the PCIe settings for "performance", we allow parents to have a larger Max Payload Size than children and rely on children Max Read Request Size to not be larger than their own MPS to avoid having the host bridge generate responses they can't cope with. However, various drivers in Linux call pci_set_readrq() with arbitrary values, assuming this to be a simple performance tweak. This breaks under our "performance" configuration. Fix that by making sure the value programmed by pcie_set_readrq() is never larger than the configured MPS for that device. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Jon Mason <mason@myri.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
*	PCI / PM: Extend PME polling to all PCI devices	Rafael J. Wysocki	2011-10-14	1	-21/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The land of PCI power management is a land of sorrow and ugliness, especially in the area of signaling events by devices. There are devices that set their PME Status bits, but don't really bother to send a PME message or assert PME#. There are hardware vendors who don't connect PME# lines to the system core logic (they know who they are). There are PCI Express Root Ports that don't bother to trigger interrupts when they receive PME messages from the devices below. There are ACPI BIOSes that forget to provide _PRW methods for devices capable of signaling wakeup. Finally, there are BIOSes that do provide _PRW methods for such devices, but then don't bother to call Notify() for those devices from the corresponding _Lxx/_Exx GPE-handling methods. In all of these cases the kernel doesn't have a chance to receive a proper notification that it should wake up a device, so devices stay in low-power states forever. Worse yet, in some cases they continuously send PME Messages that are silently ignored, because the kernel simply doesn't know that it should clear the device's PME Status bit. This problem was first observed for "parallel" (non-Express) PCI devices on add-on cards and Matthew Garrett addressed it by adding code that polls PME Status bits of such devices, if they are enabled to signal PME, to the kernel. Recently, however, it has turned out that PCI Express devices are also affected by this issue and that it is not limited to add-on devices, so it seems necessary to extend the PME polling to all PCI devices, including PCI Express and planar ones. Still, it would be wasteful to poll the PME Status bits of devices that are known to receive proper PME notifications, so make the kernel (1) poll the PME Status bits of all PCI and PCIe devices enabled to signal PME and (2) disable the PME Status polling for devices for which correct PME notifications are received. Tested-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
*	PCI: Disable MPS configuration by default	Jon Mason	2011-10-04	1	-1/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add the ability to disable PCI-E MPS turning and using the BIOS configured MPS defaults. Due to the number of issues recently discovered on some x86 chipsets, make this the default behavior. Also, add the option for peer to peer DMA MPS configuration. Peer to peer DMA is outside the scope of this patch, but MPS configuration could prevent it from working by having the MPS on one root port different than the MPS on another. To work around this, simply make the system wide MPS the smallest possible value (128B). Signed-off-by: Jon Mason <mason@myri.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	PCI: Remove MRRS modification from MPS setting code	Jon Mason	2011-09-10	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Modifying the Maximum Read Request Size to 0 (value of 128Bytes) has massive negative ramifications on some devices. Without knowing which devices have this issue, do not modify from the default value when walking the PCI-E bus in pcie_bus_safe mode. Also, make pcie_bus_safe the default procedure. Tested-by: Sven Schnelle <svens@stackframe.org> Tested-by: Simon Kirby <sim@hostway.ca> Tested-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Reported-and-tested-by: Eric Dumazet <eric.dumazet@gmail.com> Reported-and-tested-by: Niels Ole Salscheider <niels_ole@salscheider-online.de> References: https://bugzilla.kernel.org/show_bug.cgi?id=42162 Signed-off-by: Jon Mason <mason@myri.com> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	pci: fix new kernel-doc warning in pci.c	Randy Dunlap	2011-08-21	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	Fix new kernel-doc warning in pci.c: Warning(drivers/pci/pci.c:3259): No description found for parameter 'mps' Warning(drivers/pci/pci.c:3259): Excess function parameter 'rq' description in 'pcie_set_mps' Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	PCI: Set PCI-E Max Payload Size on fabric	Jon Mason	2011-08-01	1	-0/+67
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On a given PCI-E fabric, each device, bridge, and root port can have a different PCI-E maximum payload size. There is a sizable performance boost for having the largest possible maximum payload size on each PCI-E device. However, if improperly configured, fatal bus errors can occur. Thus, it is important to ensure that PCI-E payloads sends by a device are never larger than the MPS setting of all devices on the way to the destination. This can be achieved two ways: - A conservative approach is to use the smallest common denominator of the entire tree below a root complex for every device on that fabric. This means for example that having a 128 bytes MPS USB controller on one leg of a switch will dramatically reduce performances of a video card or 10GE adapter on another leg of that same switch. It also means that any hierarchy supporting hotplug slots (including expresscard or thunderbolt I suppose, dbl check that) will have to be entirely clamped to 128 bytes since we cannot predict what will be plugged into those slots, and we cannot change the MPS on a "live" system. - A more optimal way is possible, if it falls within a couple of constraints: * The top-level host bridge will never generate packets larger than the smallest TLP (or if it can be controlled independently from its MPS at least) * The device will never generate packets larger than MPS (which can be configured via MRRS) * No support of direct PCI-E <-> PCI-E transfers between devices without some additional code to specifically deal with that case Then we can use an approach that basically ignores downstream requests and focuses exclusively on upstream requests. In that case, all we need to care about is that a device MPS is no larger than its parent MPS, which allows us to keep all switches/bridges to the max MPS supported by their parent and eventually the PHB. In this case, your USB controller would no longer "starve" your 10GE Ethernet and your hotplug slots won't affect your global MPS. Additionally, the hotplugged devices themselves can be configured to a larger MPS up to the value configured in the hotplug bridge. To choose between the two available options, two PCI kernel boot args have been added to the PCI calls. "pcie_bus_safe" will provide the former behavior, while "pcie_bus_perf" will perform the latter behavior. By default, the latter behavior is used. NOTE: due to the location of the enablement, each arch will need to add calls to this function. This patch only enables x86. This patch includes a number of changes recommended by Benjamin Herrenschmidt. Tested-by: Jordan_Hargrave@dell.com Signed-off-by: Jon Mason <mason@myri.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
*	PCI: correct pcie_set_readrq write size	Jon Mason	2011-07-22	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \|	When setting the PCI-E MRRS, pcie_set_readrq queries the current settings via a pci_read_config_word call but writes the modified result via a pci_write_config_dword. This results in writing 16 more bits than were queried. Also, the function description comment is slightly incorrect. Signed-off-by: Jon Mason <jdmason@kudzu.us> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
*	PCI: ARI is a PCIe v2 feature	Chris Wright	2011-07-22	1	-1/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The function pci_enable_ari() may mistakenly set the downstream port of a v1 PCIe switch in ARI Forwarding mode. This is a PCIe v2 feature, and with an SR-IOV device on that switch port believing the switch above is ARI capable it may attempt to use functions 8-255, translating into invalid (non-zero) device numbers for that bus. This has been seen to cause Completion Timeouts and general misbehaviour including hangs and panics. Cc: stable@kernel.org Acked-by: Don Dutile <ddutile@redhat.com> Tested-by: Don Dutile <ddutile@redhat.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
*	PCI: conditional resource-reallocation through kernel parameter pci=realloc	Ram Pai	2011-07-09	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Multiple attempts to dynamically reallocate pci resources have unfortunately lead to regressions. Though we continue to fix the regressions and fine tune the dynamic-reallocation behavior, we have not reached a acceptable state yet. This patch provides a interim solution. It disables dynamic reallocation by default, but adds the ability to enable it through pci=realloc kernel command line parameter. Tested-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
*	Merge branch 'for-linus' of ↵	Linus Torvalds	2011-06-24	1	-1/+1
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: x86/PCI/ACPI: fix type mismatch PCI: fix new kernel-doc warning PCI: Fix warning in drivers/pci/probe.c on sparc64
\| *	PCI: fix new kernel-doc warning	Randy Dunlap	2011-06-01	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix pci.c kernel-doc warnings: Warning(drivers/pci/pci.c:3292): No description found for parameter 'flags' Warning(drivers/pci/pci.c:3292): Excess function parameter 'change_bridge_flags' description in 'pci_set_vga_state' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* \|	x86/uv/x2apic: update for change in pci bridge handling.	Dave Airlie	2011-06-14	1	-2/+2
\|/ \| \| \| \| \| \| \| \| \|	When I added 3448a19da479b6bd1e28e2a2be9fa16c6a6feb39 I forgot about the special uv handling code for this, so this patch fixes it up. Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Acked-by: Ingo Molnar Signed-off-by: Dave Airlie <airlied@redhat.com>
*	Merge branch 'drm-core-next' of ↵	Linus Torvalds	2011-05-24	1	-11/+14
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 * 'drm-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (169 commits) drivers/gpu/drm/radeon/atom.c: fix warning drm/radeon/kms: bump kms version number drm/radeon/kms: properly set num banks for fusion asics drm/radeon/kms/atom: move dig phy init out of modesetting drm/radeon/kms/cayman: fix typo in register mask drm/radeon/kms: fix typo in spread spectrum code drm/radeon/kms: fix tile_config value reported to userspace on cayman. drm/radeon/kms: fix incorrect comparison in cayman setup code. drm/radeon/kms: add wait idle ioctl for eg->cayman drm/radeon/cayman: setup hdp to invalidate and flush when asked drm/radeon/evergreen/btc/fusion: setup hdp to invalidate and flush when asked agp/uninorth: Fix lockups with radeon KMS and >1x. drm/radeon/kms: the SS_Id field in the LCD table if for LVDS only drm/radeon/kms: properly set the CLK_REF bit for DCE3 devices drm/radeon/kms: fixup eDP connector handling drm/radeon/kms: bail early for eDP in hotplug callback drm/radeon/kms: simplify hotplug handler logic drm/radeon/kms: rewrite DP handling drm/radeon/kms/atom: add support for setting DP panel mode drm/radeon/kms: atombios.h updates for DP panel mode ...
\| *	vgaarb: use bridges to control VGA routing where possible.	Dave Airlie	2011-05-04	1	-11/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	So in a lot of modern systems, a GPU will always be below a parent bridge that won't share with any other GPUs. This means VGA arbitration on those GPUs can be controlled by using the bridge routing instead of io/mem decodes. The problem is locating which GPUs share which upstream bridges. This patch attempts to identify all the GPUs which can be controlled via bridges, and ones that can't. This patch endeavours to work out the bridge sharing semantics. When disabling GPUs via a bridge, it doesn't do irq callbacks or touch the io/mem decodes for the gpu. Signed-off-by: Dave Airlie <airlied@redhat.com>
* \|	PCI: Add interfaces to store and load the device saved state	Alex Williamson	2011-05-21	1	-0/+98
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For KVM device assignment, we'd like to save off the state of a device prior to passing it to the guest and restore it later. We also want to allow pci_reset_funciton() to be called while the device is owned by the guest. This however overwrites and invalidates the struct pci_dev buffers, so we can't just manually call save and restore. Add generic interfaces for the saved state to be stored and reloaded back into struct pci_dev at a later time. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* \|	PCI: Track the size of each saved capability data area	Alex Williamson	2011-05-21	1	-5/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This will allow us to store and load it later. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* \|	PCI: add latency tolerance reporting enable/disable support	Jesse Barnes	2011-05-12	1	-0/+149
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Latency tolerance reporting allows devices to send messages to the root complex indicating their latency tolerance for snooped & unsnooped memory transactions. Add support for enabling & disabling this feature, along with a routine to set the max latencies a device should send upstream. Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* \|	PCI: add OBFF enable/disable support	Jesse Barnes	2011-05-12	1	-0/+92
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	OBFF (optimized buffer flush/fill), where supported, can help improve energy efficiency by giving devices information about when interrupts and other activity will have a reduced power impact. It requires support from both the device and system (i.e. not only does the device need to respond to OBFF messages, but the platform must be capable of generating and routing them to the end point). Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* \|	PCI: add ID-based ordering enable/disable support	Jesse Barnes	2011-05-12	1	-0/+53
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add support to allow drivers to enable/disable ID-based ordering. Where supported, ID-based ordering can significantly improve the latency of individual requests by preventing them from queueing up behind unrelated traffic. Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* \|	PCI/PM: Add kerneldoc description of pci_pm_reset()	Rafael J. Wysocki	2011-05-11	1	-0/+15
\|/ \| \| \| \| \| \| \| \| \| \| \|	The pci_pm_reset() function is not a very nice interface due to its limitations and conditional behavior (e.g. it doesn't affect devices in low-power states), but it cannot be simply dropped, because existing device drivers may depend on it. However, its behavior and limitations should be well documented, so add an appropriate kerneldoc comment to it. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
*	PCI: PCIe links may not get configured for ASPM under POWERSAVE mode	Naga Chumbalkar	2011-03-21	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	v3 -> v2: Moved ASPM enabling logic to pci_set_power_state() v2 -> v1: Preserved the logic in pci_raw_set_power_state() : Added ASPM enabling logic after scanning Root Bridge : http://marc.info/?l=linux-pci&m=130046996216391&w=2 v1 : http://marc.info/?l=linux-pci&m=130013164703283&w=2 The assumption made in commit 41cd766b065970ff6f6c89dd1cf55fa706c84a3d (PCI: Don't enable aspm before drivers have had a chance to veto it) that pci_enable_device() will result in re-configuring ASPM when aspm_policy is POWERSAVE is no longer valid. This is due to commit 97c145f7c87453cec90e91238fba5fe2c1561b32 (PCI: read current power state at enable time) which resets dev->current_state to D0. Due to this the call to pcie_aspm_pm_state_change() is never made. Note the equality check (below) that returns early: ./drivers/pci/pci.c: pci_raw_set_pci_power_state() 546 /* Check if we're already there */ 547 if (dev->current_state == state) 548 return 0; Therefore OSPM never configures the PCIe links for ASPM to turn them "on". Fix it by configuring ASPM from the pci_enable_device() code path. This also allows a driver such as the e1000e networking driver a chance to disable ASPM (L0s, L1), if need be, prior to enabling the device. A driver may perform this action if the device is known to mis-behave wrt ASPM. Signed-off-by: Naga Chumbalkar <nagananda.chumbalkar@hp.com> Acked-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: Matthew Garrett <mjg59@srcf.ucam.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
*	PCI/PM: Report wakeup events before resuming devices	Rafael J. Wysocki	2011-01-14	1	-1/+1
\| \| \| \| \| \| \| \| \|	Make wakeup events be reported by the PCI subsystem before attempting to resume devices or queuing up runtime resume requests for them, because wakeup events should be reported as soon as they have been detected. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
*	PCI/PM: Use pm_wakeup_event() directly for reporting wakeup events	Rafael J. Wysocki	2011-01-14	1	-16/+0
\| \| \| \| \| \| \| \| \| \|	After recent changes related to wakeup events pm_wakeup_event() automatically checks if the given device is configured to signal wakeup, so pci_wakeup_event() may be a static inline function calling pm_wakeup_event() directly. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
*	PCI: make pci_restore_state return void	Jon Mason	2010-12-23	1	-5/+2
\| \| \| \| \| \| \| \| \| \| \|	pci_restore_state only ever returns 0, thus there is no benefit in having it return any value. Also, a large majority of the callers do not check the return code of pci_restore_state. Make the pci_restore_state a void return and avoid the overhead. Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Jon Mason <jon.mason@exar.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
*	PCI: read current power state at enable time	Jesse Barnes	2010-11-11	1	-0/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	When we enable a PCI device, we avoid doing a lot of the initial setup work if the device's enable count is non-zero. If we don't fetch the power state though, we may later fail to set up MSI due to the unknown status. So pick it up before we short circuit the rest due to a pre-existing enable or mismatched enable/disable pair (as happens with VGA devices, which are special in a special way). Tested-by: Jesse Brandeburg <jesse.brandeburg@gmail.com> Reported-by: Dave Airlie <airlied@linux.ie> Tested-by: Dave Airlie <airlied@linux.ie> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
*	PCI: Add support for polling PME state on suspended legacy PCI devices	Matthew Garrett	2010-10-18	1	-0/+77
\| \| \| \| \| \| \| \| \| \| \|	Not all hardware vendors hook up the PME line for legacy PCI devices, meaning that wakeup events get lost. The only way around this is to poll the devices to see if their state has changed, so add support for doing that on legacy PCI devices that aren't part of the core chipset. Acked-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Matthew Garrett <mjg@redhat.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
*	PCI: Adjust confusing if indentation in pcie_get_readrq	Julia Lawall	2010-10-15	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Indent the branch of an if. The semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @r disable braces4@ position p1,p2; statement S1,S2; @@ ( if (...) { ... } \| if (...) S1@p1 S2@p2 ) @script:python@ p1 << r.p1; p2 << r.p2; @@ if (p1[0].column == p2[0].column): cocci.print_main("branch",p1) cocci.print_secs("after",p2) // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
*	Merge branch 'linux-next' of ↵	Linus Torvalds	2010-08-06	1	-4/+0
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (30 commits) PCI: update for owner removal from struct device_attribute PCI: Fix warnings when CONFIG_DMI unset PCI: Do not run NVidia quirks related to MSI with MSI disabled x86/PCI: use for_each_pci_dev() PCI: use for_each_pci_dev() PCI: MSI: Restore read_msi_msg_desc(); add get_cached_msi_msg_desc() PCI: export SMBIOS provided firmware instance and label to sysfs PCI: Allow read/write access to sysfs I/O port resources x86/PCI: use host bridge _CRS info on ASRock ALiveSATA2-GLAN PCI: remove unused HAVE_ARCH_PCI_SET_DMA_MAX_SEGMENT_{SIZE\|BOUNDARY} PCI: disable mmio during bar sizing PCI: MSI: Remove unsafe and unnecessary hardware access PCI: Default PCIe ASPM control to on and require !EMBEDDED to disable PCI: kernel oops on access to pci proc file while hot-removal PCI: pci-sysfs: remove casts from void* ACPI: Disable ASPM if the platform won't provide _OSC control for PCIe PCI hotplug: make sure child bridges are enabled at hotplug time PCI hotplug: shpchp: Removed check for hotplug of display devices PCI hotplug: pciehp: Fixed return value sign for pciehp_unconfigure_device PCI: Don't enable aspm before drivers have had a chance to veto it ...