path: root/drivers/edac
Commit message | Author | Age | Files | Lines
* edac: fix enabling of polling cell module | Benjamin Herrenschmidt | 2008-10-30 | 1 | -0/+3
    The edac driver on cell turned out to be not enabled because of a missing op_state. This patch introduces it. Verified to work on top of Ben's next branch.

    Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    Signed-off-by: Jens Osterkamp <jens@linux.vnet.ibm.com>
    Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    Signed-off-by: Doug Thompson <dougthompson@xmission.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* edac x38: new MC driver module | Hitoshi Mitake | 2008-10-30 | 3 | -0/+532
    I wrote a new module for the Intel X38 chipset. This chipset is very similar to the Intel 3200 chipset, but there are some differences, so I copied i3200_edac.c and modified it.

    This is Intel's web page describing this chipset: http://www.intel.com/Products/Desktop/Chipsets/X38/X38-overview.htm

    I've tested this new module with broken memory, and it seems to be working well.

    Signed-off-by: Hitoshi Mitake <mitake@clustcom.com>
    Signed-off-by: Doug Thompson <dougthompson@xmission.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* edac cell: fix incorrect edac_mode | Benjamin Herrenschmidt | 2008-10-20 | 1 | -1/+1
    The cell_edac driver is setting the edac_mode field of the csrow's to an incorrect value, causing the sysfs show routine for that field to go out of an array bound and Oopsing the kernel when used.

    Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    Signed-off-by: Doug Thompson <dougthompson@xmission.com>
    Cc: <stable@kernel.org> [2.6.27.x, 2.6.26.x. 2.6.25.x]
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* edac i5000: fix thermal issues | Aristeu Rozanski | 2008-10-16 | 1 | -1/+16
    Make the Thermal messages (temperature got past Tmid) be displayed only once because:
    1) it's the BIOS job to configure and handle the memory throttling
    2) if the BIOS is broken or is aware about the condition, flooding the system logs won't help anything.
    3) According to the specification update for Intel 5000 MCHs, all the revisions of this MCH have problems on the thermal sensors, making not automatic (a.k.a. intelligent thermal throttling) impossible.

    Signed-off-by: Aristeu Rozanski <aris@redhat.com>
    Signed-off-by: Doug Thompson <dougthompson@xmission.com>
    Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
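    Illustration only, not taken from the patch: the "display only once" behaviour described above can be sketched in plain user-space C with a latch flag. The names first_time(), handle_thermal_event(), thermal_events and thermal_logged are hypothetical and do not come from i5000_edac.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: returns true only on the first call for a given
 * flag, so the caller can emit a message exactly once. */
static bool first_time(bool *already_reported)
{
    assert(already_reported != NULL);
    if (*already_reported)
        return false;
    *already_reported = true;
    return true;
}

static int thermal_events;   /* every event is still counted */
static bool thermal_logged;  /* latch: set after the first log line */

/* Hypothetical handler: count each thermal event, log only the first. */
static void handle_thermal_event(void)
{
    thermal_events++;
    if (first_time(&thermal_logged))
        fprintf(stderr, "thermal: temperature went past Tmid (reported once)\n");
}
```

    Counting every event while logging only the first keeps the information available without flooding the system log.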
* edac i5000: fix error messages | Aristeu Rozanski | 2008-10-16 | 1 | -62/+119
    Update the i5000_edac messages, making everything pass through the EDAC (so the log controls will work) and being more specific about the errors. Also, it makes the miscellaneous errors optional and disabled by default.

    As I didn't find information anywhere about M23ERR-M26ERR (FERR_NF_THERMAL) on FERR_NF_FBD, I'm removing them.

    Signed-off-by: Aristeu Rozanski <aris@redhat.com>
    Signed-off-by: Doug Thompson <dougthompson@xmission.com>
    Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* edac mpc85xx: add support for mpc8572 | Andrew Kilkenny | 2008-10-16 | 1 | -7/+26
    This adds support for the dual-core MPC8572 processor. We have to support making SPR changes on each core. Also, since we can have multiple memory controllers sharing an interrupt, flag the interrupts with IRQF_SHARED.

    Signed-off-by: Andrew Kilkenny <akilkenny@xes-inc.com>
    Signed-off-by: Nate Case <ncase@xes-inc.com>
    Acked-by: Dave Jiang <djiang@mvista.com>
    Signed-off-by: Doug Thompson <dougthompson@xmission.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* edac: make i82443bxgx_edac coexist with intel_agp | Vladislav Bogdanov | 2008-10-16 | 1 | -2/+61
    Fix 443BX/GX MCH support in EDAC. It makes i82443bxgx_edac coexist with intel_agp using the same approach as several other EDAC drivers.

    Tested on Intel's L443GX with redhat's 2.6.18 with the whole EDAC subsystem backported a while ago.

    [root@host ~]# dmesg|grep -iE '(AGP|EDAC)'
    Linux agpgart interface v0.101 (c) Dave Jones
    agpgart: Detected an Intel 440GX Chipset.
    agpgart: AGP aperture is 64M @ 0xf8000000
    EDAC MC: Ver: 2.1.0 Jun 27 2008
    EDAC MC0: Giving out device to 'i82443bxgx_edac' 'I82443BXGX': DEV 0000:00:00.0
    EDAC PCI0: Giving out device to module 'i82443bxgx_edac' controller 'EDAC PCI controller': DEV '0000:00:00.0' (POLLED)

    Signed-off-by: Vladislav Bogdanov <slava@nsys.by>
    Cc: Doug Thompson <norsk5@yahoo.com>
    Cc: Dave Airlie <airlied@linux.ie>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* removed unused #include <linux/version.h>'s | Adrian Bunk | 2008-08-23 | 1 | -1/+0
    This patch lets the files using linux/version.h match the files that #include it.

    Signed-off-by: Adrian Bunk <bunk@kernel.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* edac: mpc85xx fix pci ofdev 2nd pass | Dave Jiang | 2008-07-25 | 1 | -24/+43
    Convert PCI err device from platform to open firmware of_dev to comply with powerpc schemes.

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Dave Jiang <djiang@mvista.com>
    Signed-off-by: Doug Thompson <dougthompson@xmission.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* edac: mv64x60 add pci fixup | Dave Jiang | 2008-07-25 | 1 | -0/+35
    Fixup of missing bit 0 on 64360 PCIx_ERR_MASK and errata FEr-#11 and FEr-#16 for the 64460. Bit 0 must remain 0.

    Signed-off-by: Dave Jiang <djiang@mvista.com>
    Signed-off-by: Doug Thompson <dougthompson.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* edac: mv64x60 fix get_property | Dave Jiang | 2008-07-25 | 1 | -1/+1
    Update the get_property() call to use of_get_property() in order to fix a compile failure.

    Signed-off-by: Dave Jiang <djiang@mvista.com>
    Signed-off-by: Doug Thompson <dougthompson.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* edac: e752x fix too loud on nonmemory errors | Doug Thompson | 2008-07-25 | 1 | -17/+42
    This module harvests more than just memory errors; it also harvests various bus and DMA errors that the chipset detects. Previously, it would report all such errors, which would cause output to be TOO loud.

    This patch therefore adds a parameter which is used to turn off NON-MEMORY error reports by default. The reporting can be re-enabled via the parameter.

    Also did code style cleanup: less than 80 characters per line rule.

    Signed-off-by: Doug Thompson <dougthompson@xmission.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* edac: core fix added newline to sysfs dimm labels | Arthur Jones | 2008-07-25 | 1 | -1/+5
    The channel DIMM label does not seem to be used much in the edac code. However, where it is used (in the core code), it is assumed to not have a newline embedded. This leaves the sysfs file newline free, which looks funny when cat'ing it. Here we just add the trailing newline to the sysfs chX_dimm_label output...

    [Doug Thompson note: the DIMM label is one of the primary uses of EDAC. User space daemon scripts, edac-utils@sourceforge, populate the DIMM label fields, via /sys/devices/system/edac attributes, with the silk screen labels of the motherboard in use. dmidecode accesses BIOS tables, but BIOS tables are well known to be incorrect and useless in these respects. edac-utils will strip off any newlines before its use of the output, when displaying DIMM slot silk screen labels.]

    Signed-off-by: Arthur Jones <ajones@riverbed.com>
    Signed-off-by: Doug Thompson <dougthompson@xmission.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
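    Illustration only, not the kernel code: a user-space sketch of a "show" routine that stores the label without a newline and appends one only when formatting the output. show_dimm_label() and its signature are hypothetical stand-ins for the sysfs show callback.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a sysfs "show" routine: format the stored
 * label (which contains no newline) into buf, followed by a newline.
 * Returns the number of bytes placed in buf, excluding the NUL. */
static int show_dimm_label(const char *label, char *buf, size_t size)
{
    int n;

    assert(label != NULL && buf != NULL && size > 0);
    n = snprintf(buf, size, "%s\n", label);
    if (n < 0)
        return 0;
    if ((size_t)n >= size)      /* output was truncated to fit buf */
        n = (int)size - 1;
    return n;
}
```

    Keeping the newline out of the stored label means code that embeds the label in log messages is unaffected; only the text shown to the reader of the file changes.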
* edac: core fix static to dynamic kset | Arthur Jones | 2008-07-25 | 1 | -9/+6
    Static kobjects and ksets are not supported in Linux kernel. Convert the mc_kset from static to dynamic. This patch depends on my previous patch to remove the module parameter attributes from mc...

    Signed-off-by: Arthur Jones <ajones@riverbed.com>
    Signed-off-by: Doug Thompson <dougthompson@xmission.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* edac: core fix redundant sysfs controls to parameters | Arthur Jones | 2008-07-25 | 1 | -116/+1
    /sys/devices/system/edac/mc has a few files which are duplicated in /sys/module/edac_core/parameters. Now that all the functionality is duplicated between these two locations, we remove the former kobject attributes and update the documentation.

    Signed-off-by: Arthur Jones <ajones@riverbed.com>
    Signed-off-by: Doug Thompson <dougthompson@xmission.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* edac: core fix workq timer | Arthur Jones | 2008-07-25 | 1 | -1/+21
    When updating the edac_mc_poll_msec module parameter from the sysfs /sys/module/edac_core/parameters/edac_mc_poll_msec file, we don't update the workq timers. As a result, if we move from a big poll time to a small one, the small one won't take effect until the big one has timed out. Here we provide a new module parameter set method to call out to the update routine.

    This brings the /sys/module/edac_core/parameters functionality up to that provided by the /sys/drivers/system/edac/mc sysfs module parameter files so that we can remove them or at least link to the /sys/module files...

    Signed-off-by: Arthur Jones <ajones@riverbed.com>
    Signed-off-by: Doug Thompson <dougthompson@xmission.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* edac: core fix to use dynamic kobject | Arthur Jones | 2008-07-25 | 1 | -9/+21
    Static kobjects are not supported in linux kernel. Convert the edac_pci_top_main_kobj from static to dynamic. This avoids the double free of the edac_pci_top_main_kobj.name that we see on module reload of the e752x edac driver (and probably others as well).

    In addition Greg KH <greg@kroah.com> has pointed out that this code may be cleaned up significantly. I will look at that as a follow-on patch, for now, I just want the minimum fix to get this double-free oops bug squashed...

    Many thanks to Greg KH for his patience in showing me what the Documentation/kobject.txt already said (oops)...

    Signed-off-by: Arthur Jones <ajones@riverbed.com>
    Signed-off-by: Doug Thompson <dougthompson@xmission.com>
    Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* edac: i5100: cleanup | Arthur Jones | 2008-07-25 | 1 | -135/+261
    Some code cleanliness issues found by Andrew Morton (thanks!) which should not affect functionality, but which should help make the code more maintainable. In particular, we now:
    * convert all #define's w/ a parameter to static inlines
    * use 1UL rather than 1ULL when calculating an unsigned long
    * use pci_disable_device

    The resulting code is tested and seems to work fine...

    Signed-off-by: Arthur Jones <ajones@riverbed.com>
    Cc: Doug Thompson <dougthompson@xmission.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
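    Illustration only, not code from i5100_edac.c: a small user-space sketch of the first two cleanup items. All names here (MAX_MACRO, max_ul, next_value, calls, low_mask) are hypothetical. The function-like macro evaluates its argument twice, while the static inline evaluates it once and is type-checked; low_mask() uses 1UL so the whole computation stays in unsigned long.

```c
#include <assert.h>
#include <limits.h>

/* Function-like macro: arguments are substituted textually, so an
 * argument with side effects can be evaluated more than once. */
#define MAX_MACRO(a, b) ((a) > (b) ? (a) : (b))

/* static inline replacement: arguments are evaluated exactly once
 * and checked against the declared parameter types. */
static inline unsigned long max_ul(unsigned long a, unsigned long b)
{
    return a > b ? a : b;
}

/* Side-effecting helper used to make the double evaluation visible. */
static unsigned long calls;
static unsigned long next_value(void)
{
    return ++calls;
}

/* Mask of the low nbits bits, computed as unsigned long. Writing 1UL
 * (not 1ULL) keeps the intermediate type equal to the result type. */
static inline unsigned long low_mask(unsigned int nbits)
{
    assert(nbits < sizeof(unsigned long) * CHAR_BIT);
    return (1UL << nbits) - 1;
}
```

    With MAX_MACRO(next_value(), 0UL) the helper runs twice (once for the comparison, once for the result); with max_ul(next_value(), 0UL) it runs once.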
* edac: i5100 fix unmask ecc bits | Arthur Jones | 2008-07-25 | 1 | -0/+6
    Explicitly unmask ECC errors we are interested in reporting.

    Signed-off-by: Arthur Jones <ajones@riverbed.com>
    Signed-off-by: Doug Thompson <dougthompson@xmission.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* edac: i5100 fix enable ecc hardware | Arthur Jones | 2008-07-25 | 1 | -0/+10
    It is possible that the BIOS did not enable ECC at boot time. We check for that case and fail to load if it is true.

    Signed-off-by: Arthur Jones <ajones@riverbed.com>
    Signed-off-by: Doug Thompson <dougthompson@xmission.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* edac: i5100 fix missing bits | Arthur Jones | 2008-07-25 | 1 | -3/+15
    The error mask we use to trigger ECC notifications is missing many bits of interest. We add these bits here so that all possible ECC errors can be reported.

    Signed-off-by: Arthur Jones <ajones@riverbed.com>
    Signed-off-by: Doug Thompson <dougthompson@xmission.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* edac: i5100 new intel chipset driver | Arthur Jones | 2008-07-25 | 3 | -0/+835
    Preliminary support for the Intel 5100 MCH. CE and UE errors are reported along with the current DIMM label information and other memory parameters.

    Reasons why this is preliminary:

    1) This chip has 2 independent memory controllers which, for best performance, use interleaved accesses to the DDR2 memory. This architecture does not map very well to the current edac data structures which depend on symmetric channel access to the interleaved data. Without core changes, the best I could do for now is to map both memory controllers to different csrows (first all ranks of controller 0, then all ranks of controller 1). Someone much more familiar with the edac core than I will probably need to come up with a more general data structure to handle the interleaving and de-interleaving of the two memory controllers.

    2) I have not yet tackled the de-interleaving of the rank/controller address space into the physical address space of the CPU. There is nothing fundamentally missing, it is just ending up to be a lot of code, and I'd rather keep it separate for now, esp since it doesn't work yet...

    3) The code depends on a particular i5100 chip select to DIMM mainboard chip select mapping. This mapping seems obvious to me in order to support dual and single ranked memory, but it is not unique and DIMM labels could be wrong on other mainboards. There is no way to query this mapping that I know of.

    4) The code requires that the i5100 is in 32GB mode. Only 4 ranks per controller, 2 ranks per DIMM are supported. I do not have hardware (nor do I expect to have hardware anytime soon) for the 48GB (6 ranks per controller) mode.

    5) The serial presence detect code should be broken out into a "real" i2c driver so that decode-dimms.pl can work.

    Signed-off-by: Arthur Jones <ajones@riverbed.com>
    Signed-off-by: Doug Thompson <dougthompson@xmission.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* powerpc/cell/edac: Log a syndrome code in case of correctable error | Maxim Shchetynin | 2008-07-22 | 1 | -2/+3
    If a correctable error occurred, the syndrome code was logged as 0. This patch lets EDAC log a correct syndrome code to make problem investigation easier.

    Signed-off-by: Maxim Shchetynin <maxim@de.ibm.com>
    Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    Acked-by: Doug Thompson <dougthompson@xmission.com>
    Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* edac: mpc85xx: fix building as a module | Kumar Gala | 2008-05-24 | 1 | -3/+0
    Including <asm/mpc85xx.h> causes build problems since it doesn't exist.

    Also removed warning:

    drivers/edac/mpc85xx_edac.c:45: warning: 'mpc85xx_ctl_name' defined but not used

    Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
    Acked-by: Doug Thompson <dougthompson@xmission.com>
    Acked-by: Dave Jiang <djiang@mvista.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* dev_name introduction fall out fix | Stephen Rothwell | 2008-05-06 | 4 | -10/+10
    Commit 06916639e2fed9ee475efef2747a1b7429f8fe76 ("driver-core: add dev_name() to help transition away from using bus_id") added a static inline dev_name() and used it in dev_printk. Unfortunately, drivers/edac/edac_core.h defines a macro called dev_name(). Rename the latter.

    Diagnosis by Tony Breeds and Michael Ellerman.

    Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
    Acked-by: Doug Thompson <dougthompson@xmission.com>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* pasemi_edac needs to include linux/edac.h | Stephen Rothwell | 2008-04-30 | 1 | -0/+1
    Commit c3c52bce6993c6d37af2c2de9b482a7013d646a7 ("edac: fix module initialization on several modules 2nd time") added a call to opstate_init but did not include linux/edac.h that declares it.

    Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
    Acked-by: Olof Johansson <olof@lixom.net>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* edac: fix module initialization on several modules 2nd time | Hitoshi Mitake | 2008-04-29 | 11 | -40/+66
    I implemented opstate_init() as an inline function in linux/edac.h and added calls to opstate_init() to:
    i82443bxgx_edac.c
    i82860_edac.c
    i82875p_edac.c
    i82975x_edac.c

    I wrote a fixed version of edac-fix-module-initialization-on-several-modules.patch, and tested building 2.6.25-rc7 with it applied. It succeeded. I think the patch is now correct.

    Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
    Signed-off-by: Hitoshi Mitake <h.mitake@gmail.com>
    Signed-off-by: Doug Thompson <dougthompson@xmission.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* edac: remove unneeded functions and add static accessor | Adrian Bunk | 2008-04-29 | 5 | -59/+11
    Collection of patches, merged into one, from Adrian that do the following:

    1) This patch makes the following needlessly global functions static:
       - edac_pci_get_log_pe()
       - edac_pci_get_log_npe()
       - edac_pci_get_panic_on_pe()
       - edac_pci_unregister_sysfs_instance_kobj()
       - edac_pci_main_kobj_setup()
    2) Remove unneeded function edac_device_find()
    3) Added #if 0 around function edac_pci_find()
    4) make the needlessly global edac_pci_generic_check() static
    5) Removed function edac_check_mc_devices()

    Doug Thompson modified Adrian's patches, to better represent the direction of EDAC, and made them one patch.

    Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
    Signed-off-by: Adrian Bunk <bunk@kernel.org>
    Signed-off-by: Doug Thompson <dougthompson@xmission.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* edac: use the shorter LIST_HEAD for brevity | Robert P. J. Day | 2008-04-29 | 3 | -3/+3
    Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
    Acked-by: Doug Thompson <norsk5@yahoo.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* edac: add e752x parameter for sysbus_parity selection | Peter Tyser | 2008-04-29 | 1 | -1/+39
    Add a module parameter "sysbus_parity" to allow forcing system bus parity error checking on or off. Also add support to automatically disable system bus parity errors for processors which do not support it.

    If the sysbus_parity parameter is specified, sysbus parity detection will be forced on or off. If it is not specified, the driver will attempt to look at the CPU identifier string and determine if the CPU supports system bus parity.

    A blacklist was used instead of a whitelist so that system bus parity would be enabled by default and to minimize the chances of breaking things for those people already using the driver which for some reason have a processor that does not have a valid CPU identifier string.

    [akpm@linux-foundation.org: coding-style fixes]
    Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
    Signed-off-by: Peter Tyser <ptyser@xes-inc.com>
    Signed-off-by: Doug Thompson <dougthompson@xmission.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
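    Illustration only, not code from e752x_edac.c: the decision described above (forced on, forced off, or auto-detect against a blacklist) can be sketched in user-space C. The encoding -1/0/1 for the parameter, the function name sysbus_parity_enabled(), and the blacklist entries are all assumptions made for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical blacklist of CPU identifier substrings for processors
 * assumed not to support system bus parity. Placeholder strings only. */
static const char *const no_parity_cpus[] = {
    "Example CPU Model A",
    "Example CPU Model B",
};

/* param: -1 = not specified (auto-detect), 0 = force off, 1 = force on.
 * This encoding is an assumption of the sketch. */
static bool sysbus_parity_enabled(int param, const char *cpu_id)
{
    size_t i;

    assert(param >= -1 && param <= 1);
    if (param == 0)
        return false;
    if (param == 1)
        return true;

    /* Auto-detect: parity stays on unless the CPU is blacklisted.
     * A missing or unrecognised identifier therefore keeps it on. */
    if (cpu_id == NULL)
        return true;
    for (i = 0; i < sizeof(no_parity_cpus) / sizeof(no_parity_cpus[0]); i++)
        if (strstr(cpu_id, no_parity_cpus[i]) != NULL)
            return false;
    return true;
}
```

    Using a blacklist makes "enabled" the fall-through result, which matches the stated goal of not changing behaviour for systems whose CPU identifier string is invalid or unknown.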
* edac: new support for Intel 3100 chipset | Andrei Konovalov | 2008-04-29 | 2 | -13/+154
    Add Intel 3100 chipset support to e752x EDAC driver.

    Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
    Signed-off-by: Andrei Konovalov <akonovalov@ru.mvista.com>
    Signed-off-by: Doug Thompson <dougthompson@xmission.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* drivers/edac/i3000: document type promotion | Jason Uhlenkott | 2008-02-07 | 1 | -0/+7
    By popular request, add a comment documenting the implicit type promotion here.

    Signed-off-by: Jason Uhlenkott <juhlenko@akamai.com>
    Signed-off-by: Doug Thompson <dougthompson@xmission.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* drivers/edac: i3000: missing init code | Hitoshi Mitake | 2008-02-07 | 1 | -0/+13
    There is a missing sequence of initialization code during startup.

    Signed-off-by: Hitoshi Mitake <h.mitake@gmail.com>
    Signed-off-by: Jason Uhlenkott <juhlenko@akamai.com>
    Signed-off-by: Doug Thompson <dougthompson@xmisson.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* drivers/edac: mpc85xx: add static scope | Doug Thompson | 2008-02-07 | 1 | -1/+1
    Made a previously global variable static in scope.

    Signed-off-by: Doug Thompson <dougthompson@xmission.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* drivers/edac: i3000: 64bit build | Jason Uhlenkott | 2008-02-07 | 1 | -1/+1
    Modified to run on x86_64 as well as x86.

    i3000_edac builds (and runs) fine on x86_64.

    Signed-off-by: Jason Uhlenkott <juhlenko@akamai.com>
    Signed-off-by: Doug Thompson <dougthompson@xmission.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* drivers/edac: pci: broken parity regression | Bryan Boatright | 2008-02-07 | 1 | -4/+8
    Using the EDAC code in kernel.org kernel version 2.6.23.8 I am seeing the following problem:

    In the kernel there is a pci device attribute located in sysfs that is checked by the EDAC PCI scanning code. If that attribute is set, PCI parity/error scanning is skipped for that device. The attribute is:

    broken_parity_status

    and is located in the /sys/devices/pci<XXX>/0000:XX:YY.Z directories for PCI devices.

    I don't think this check was actually implemented. I have a misbehaved card that reports a parity error every 1000 ms:

    Nov 25 07:28:43 beta kernel: EDAC PCI: Master Data Parity Error on 0000:05:01.0
    Nov 25 07:28:44 beta kernel: EDAC PCI: Master Data Parity Error on 0000:05:01.0
    Nov 25 07:28:45 beta kernel: EDAC PCI: Master Data Parity Error on 0000:05:01.0

    Setting that card's broken_parity_status bit did not mask the error:

    echo "1" > /sys/bus/pci/devices/0000:05:01.0/broken_parity_status

    I looked through the EDAC code and did not readily see any reference to broken_parity_status at all (which makes sense based on the behavior I am seeing). I applied the following patch as a proof-of-concept and now EDAC's PCI parity error reporting behaves as documented:

    bryan

    Good regression find, bryan. It used to work. sigh. I added more logic to your patch, for more coverage of the error.

    Doug T

    Signed-off-by: Bryan Boatright <b1@omega71.com>
    Signed-off-by: Doug Thompson <dougthompson@xmisson.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
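    Illustration only, not the patch itself: the documented behaviour (skip parity reporting for devices whose broken_parity_status attribute is set) can be sketched in user-space C. struct fake_pci_dev and report_parity_errors() are hypothetical; the real kernel code operates on struct pci_dev.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified model of a PCI device for this sketch. */
struct fake_pci_dev {
    const char *name;            /* e.g. "0000:05:01.0" */
    bool broken_parity_status;   /* set by the administrator via sysfs */
    bool parity_error_latched;   /* device currently signals a parity error */
};

/* Scan the devices and report parity errors, skipping any device that
 * is flagged as having broken parity. Returns the number of reports. */
static int report_parity_errors(const struct fake_pci_dev *devs, size_t n)
{
    size_t i;
    int reported = 0;

    assert(devs != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (devs[i].broken_parity_status)
            continue;            /* known-bad card: stay quiet */
        if (devs[i].parity_error_latched) {
            printf("EDAC PCI: Master Data Parity Error on %s\n", devs[i].name);
            reported++;
        }
    }
    return reported;
}
```

    Checking the flag before looking at the error state means a misbehaving card that has been flagged produces no log lines at all.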
* drivers-edac: add marvell mv64x60 driver | Dave Jiang | 2008-02-07 | 4 | -0/+977
    Marvell mv64x60 SoC support for EDAC. Used on PPC and MIPS platforms. Development and testing done on PPC Motorola prpmc2800 ATCA board.

    [akpm@linux-foundation.org: make mv64x60_ctl_name static]
    Signed-off-by: Dave Jiang <djiang@mvista.com>
    Cc: Alan Cox <alan@lxorguk.ukuu.org.uk
    Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* drivers-edac: add freescale mpc85xx driver | Dave Jiang | 2008-02-07 | 4 | -0/+1213
    EDAC chip driver support for Freescale MPC85xx platforms. PPC based.

    Signed-off-by: Dave Jiang <djiang@mvista.com>
    Cc: Alan Cox <alan@lxorguk.ukuu.org.uk
    Signed-off-by: Doug Thompson <dougthompson@xmission.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* drivers-edac: i3000 replace macros with functions | Jason Uhlenkott | 2008-02-07 | 1 | -15/+35
    Replace function-like macros with functions.

    Signed-off-by: Jason Uhlenkott <juhlenko@akamai.com>
    Cc: Alan Cox <alan@lxorguk.ukuu.org.uk
    Signed-off-by: Doug Thompson <dougthompson@xmission.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* drivers-edac: i3000 code tidying | Jason Uhlenkott | 2008-02-07 | 1 | -97/+110
    Style cleanup, mostly just 80-column fixes.

    Signed-off-by: Jason Uhlenkott <juhlenko@akamai.com>
    Cc: Alan Cox <alan@lxorguk.ukuu.org.uk
    Signed-off-by: Doug Thompson <dougthompson@xmission.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* drivers-edac: add Cell MC driver | Benjamin Herrenschmidt | 2008-02-07 | 3 | -0/+266
    Adds driver for the Cell memory controller when used without a Hypervisor such as on the IBM Cell blades. There might still be some improvements to do to this such as finding if it's possible to properly obtain more details about the address of the error but it's good enough already to report CE counts which is our main priority at the moment.

    Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    Cc: Alan Cox <alan@lxorguk.ukuu.org.uk
    Signed-off-by: Doug Thompson <dougthompson@xmission.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* drivers-edac: add Cell XDR memory types | Benjamin Herrenschmidt | 2008-02-07 | 2 | -1/+4
    Add the definitions for the Rambus XDR memory type used by the Cell processor. It's a pre-requisite for the followup Cell EDAC patch.

    Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    Cc: Alan Cox <alan@lxorguk.ukuu.org.uk
    Signed-off-by: Doug Thompson <dougthompson@xmission.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* drivers-edac: use round_jiffies_relative | Anton Blanchard | 2008-02-07 | 2 | -3/+3
    When rounding a relative timeout we need to use round_jiffies_relative().

    Signed-off-by: Anton Blanchard <anton@samba.org>
    Acked-by: Arjan van de Ven <arjan@linux.intel.com>
    Cc: Alan Cox <alan@lxorguk.ukuu.org.uk
    Signed-off-by: Doug Thompson <dougthompson@xmission.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* drivers-edac: turn on edac device error logging | Doug Thompson | 2008-02-07 | 1 | -0/+4
    ENABLE the 'logging' of CE and UE events for the EDAC_DEVICE class of error harvester in EDAC

    Cc: Alan Cox <alan@lxorguk.ukuu.org.uk
    Signed-off-by: Doug Thompson <dougthompson@xmission.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* drivers/edac/: Spelling fixes | Joe Perches | 2008-02-03 | 2 | -2/+2
    Signed-off-by: Joe Perches <joe@perches.com>
    Signed-off-by: Adrian Bunk <bunk@kernel.org>
* Merge branch 'linux-2.6' | Paul Mackerras | 2008-01-31 | 4 | -82/+44
|\
| * Driver core: change sysdev classes to use dynamic kobject names | Kay Sievers | 2008-01-25 | 1 | -1/+1
| |   All kobjects require a dynamically allocated name now. We no longer need to keep track if the name is statically assigned, we can just unconditionally free() all kobject names on cleanup.
| |
| |   Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
| |   Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
| * Kobject: convert drivers/* from kobject_unregister() to kobject_put() | Greg Kroah-Hartman | 2008-01-25 | 3 | -14/+14
| |   There is no need for kobject_unregister() anymore, thanks to Kay's kobject cleanup changes, so replace all instances of it with kobject_put().
| |
| |   Cc: Kay Sievers <kay.sievers@vrfy.org>
| |   Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
| * Kobject: change drivers/edac to use kobject_init_and_add | Greg Kroah-Hartman | 2008-01-25 | 3 | -67/+29
| |   Stop using kobject_register, as this way we can control the sending of the uevent properly, after everything is properly initialized.
| |
| |   Acked-by: Doug Thompson <dougthompson@xmission.com>
| |   Cc: Kay Sievers <kay.sievers@vrfy.org>
| |   Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
| * kobject: remove struct kobj_type from struct kset | Greg Kroah-Hartman | 2008-01-25 | 1 | -1/+1
| |   We don't need a "default" ktype for a kset. We should set this explicitly every time for each kset. This change is needed so that we can make ksets dynamic, and cleans up one of the odd, undocumented assumptions that the kset/kobject/ktype model has.
| |
| |   This patch is based on a lot of help from Kay Sievers.
| |
| |   Nasty bug in the block code was found by Dave Young <hidave.darkstar@gmail.com>
| |
| |   Cc: Kay Sievers <kay.sievers@vrfy.org>
| |   Cc: Dave Young <hidave.darkstar@gmail.com>
| |   Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>