summaryrefslogtreecommitdiff
path: root/drivers/edac
AgeCommit message (Collapse)Author
2014-12-08Merge tag 'edac_for_3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bpLinus Torvalds
Pull EDAC updates from Borislav Petkov: "EDAC updates all over the place: - Enablement for AMD F15h models 0x60 CPUs. Most notably DDR4 RAM support. Out of tree stuff is adding the required PCI IDs. From Aravind Gopalakrishnan. - Enable amd64_edac for 32-bit due to popular demand. From Tomasz Pala. - Convert the AMD MCE injection module to debugfs, where it belongs. - Misc EDAC cleanups" * tag 'edac_for_3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: EDAC, MCE, AMD: Correct formatting of decoded text EDAC, mce_amd_inj: Add an injector function EDAC, mce_amd_inj: Add hw-injection attributes EDAC, mce_amd_inj: Enable direct writes to MCE MSRs EDAC, mce_amd_inj: Convert mce_amd_inj module to debugfs EDAC: Delete unnecessary check before calling pci_dev_put() EDAC, pci_sysfs: remove unneccessary ifdef around entire file ghes_edac: Use snprintf() to silence a static checker warning amd64_edac: Build module on x86-32 EDAC, MCE, AMD: Add decoding table for MC6 xec amd64_edac: Add F15h M60h support {mv64x60,ppc4xx}_edac,: Remove deprecated IRQF_DISABLED EDAC: Sync memory types and names EDAC: Add DDR3 LRDIMM entries to edac_mem_types x86, amd_nb: Add device IDs to NB tables for F15h M60h pci_ids: Add PCI device IDs for F15h M60h
2014-11-25EDAC, MCE, AMD: Correct formatting of decoded textBorislav Petkov
Write out MCx_ADDR into the more humanly readable "MCx Error Address" and remove double colon in the output. Cc: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de>
2014-11-25EDAC, mce_amd_inj: Add an injector functionBorislav Petkov
Selectively inject either a real MCE or a sw-only version which exercises the decoding path only. The hardware-injected MCE triggers a machine check exception (#MC) so that the MCE handler can be bothered to do something too. Signed-off-by: Borislav Petkov <bp@suse.de>
2014-11-25EDAC, mce_amd_inj: Add hw-injection attributesBorislav Petkov
Expose struct mce->inject_flags. Signed-off-by: Borislav Petkov <bp@suse.de>
2014-11-25EDAC, mce_amd_inj: Enable direct writes to MCE MSRsBorislav Petkov
Normally, writing those causes a #GP but HWCR[McStatusWrEn] controls that. Provide a knob. Signed-off-by: Borislav Petkov <bp@suse.de>
2014-11-25EDAC, mce_amd_inj: Convert mce_amd_inj module to debugfsBorislav Petkov
This module's interface belongs in debugfs, not in sysfs. Signed-off-by: Borislav Petkov <bp@suse.de>
2014-11-19EDAC: Delete unnecessary check before calling pci_dev_put()Markus Elfring
The pci_dev_put() function tests whether its argument is NULL and then returns immediately. Thus the test before the call is not needed. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Link: http://lkml.kernel.org/r/546CB20D.4070808@users.sourceforge.net [ Boris: commit message. ] Signed-off-by: Borislav Petkov <bp@suse.de>
2014-11-11EDAC, pci_sysfs: remove unneccessary ifdef around entire fileAndreas Ruprecht
The file edac_pci_sysfs.c is dependent on CONFIG_PCI. This is already modelled in the Makefile, but edac_pci_sysfs.o is still contained in the list of files compiled even without CONFIG_PCI. This change removes edac_pci_sysfs.o from the list of built objects when not having CONFIG_PCI enabled and removes the then-unnecessary ifdef from the source file. Signed-off-by: Andreas Ruprecht <rupran@einserver.de> Link: http://lkml.kernel.org/r/1407697803-3837-1-git-send-email-rupran@einserver.de Signed-off-by: Borislav Petkov <bp@suse.de>
2014-11-11ghes_edac: Use snprintf() to silence a static checker warningDan Carpenter
My static checker complains because the "e->location" has up to 256 characters but we are copying it into the "pvt->detail_location" which only has space for 240 characters. That's not counting the surrounding text and the "e->other_detail" string which can be over 80 characters long. I am not familiar with this code but presumably it normally works. Let's add a limit though for safety. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Link: http://lkml.kernel.org/r/20140801082514.GD28869@mwanda Signed-off-by: Borislav Petkov <bp@suse.de>
2014-11-05amd64_edac: Build module on x86-32Tomasz Pala
By popular demand, enable amd64_edac on 32-bit too. Boris: - update Kconfig text. - add a warning on load which states that 32-bit configurations are unsupported. Signed-off-by: Tomasz Pala <gotar@polanet.pl> Link: http://lkml.kernel.org/r/20141102102212.GA7034@polanet.pl Signed-off-by: Borislav Petkov <bp@suse.de>
2014-11-04EDAC, MCE, AMD: Add decoding table for MC6 xecAravind Gopalakrishnan
Extended error code meanings are tabulated for other banks. Extend that tradition for MC6 too. Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Link: http://lkml.kernel.org/r/1415122868-10969-1-git-send-email-aravind.gopalakrishnan@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>
2014-10-30amd64_edac: Add F15h M60h supportAravind Gopalakrishnan
This patch adds support for ECC error decoding for F15h M60h processor. Aside from the usual changes, the patch adds support for some new features in the processor: - DDR4(unbuffered, registered); LRDIMM DDR3 support - relevant debug messages have been modified/added to report these memory types - new dbam_to_cs mappers - if (F15h M60h && LRDIMM); we need a 'multiplier' value to find cs_size. This multiplier value is obtained from the per-dimm DCSM register. So, change the interface to accept a 'cs_mask_nr' value to facilitate this calculation - switch-casing determine_memory_type() - done to cleanse the function of too many if-else statements and improve readability - This is now called early in read_mc_regs() to cache dram_type Misc cleanup: - amd64_pci_table[] is condensed by using PCI_VDEVICE macro. Testing details: Tested the patch by injecting 'ECC' type errors using mce_amd_inj and error decoding works fine. Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Link: http://lkml.kernel.org/r/1414617483-4941-1-git-send-email-Aravind.Gopalakrishnan@amd.com [ Boris: determine_memory_type() cleanups ] Signed-off-by: Borislav Petkov <bp@suse.de>
2014-10-22e7xxx_edac: Report CE events properlyJason Baron
Fix CE event being reported as HW_EVENT_ERR_UNCORRECTED. Signed-off-by: Jason Baron <jbaron@akamai.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/e6dd616f2cd51583a7e77af6f639b86313c74144.1413405053.git.jbaron@akamai.com Signed-off-by: Borislav Petkov <bp@suse.de>
2014-10-22cpc925_edac: Report UE events properlyJason Baron
Fix UE event being reported as HW_EVENT_ERR_CORRECTED. Signed-off-by: Jason Baron <jbaron@akamai.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/8beb13803500076fef827eab33d523e355d83759.1413405053.git.jbaron@akamai.com Signed-off-by: Borislav Petkov <bp@suse.de>
2014-10-22i82860_edac: Report CE events properlyJason Baron
Fix CE event being reported as HW_EVENT_ERR_UNCORRECTED. Signed-off-by: Jason Baron <jbaron@akamai.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/7aee8e244a32ff86b399a8f966c4aae70296aae0.1413405053.git.jbaron@akamai.com Signed-off-by: Borislav Petkov <bp@suse.de>
2014-10-22i3200_edac: Report CE events properlyJason Baron
Fix CE event being reported as HW_EVENT_ERR_UNCORRECTED. Signed-off-by: Jason Baron <jbaron@akamai.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/d02465b4f30314b390c12c061502eda5e9d29c52.1413405053.git.jbaron@akamai.com Signed-off-by: Borislav Petkov <bp@suse.de>
2014-10-20{mv64x60,ppc4xx}_edac,: Remove deprecated IRQF_DISABLEDMichael Opdenacker
It's a NOOP since 2.6.35. Signed-off-by: Michael Opdenacker <michael.opdenacker@free-electrons.com> Link: http://lkml.kernel.org/r/1412159043-7348-1-git-send-email-michael.opdenacker@free-electrons.com Signed-off-by: Borislav Petkov <bp@suse.de>
2014-10-20EDAC: Sync memory types and namesBorislav Petkov
Make keeping the sync between the mem_types enum and the actual string names simpler by using designated initializers. Signed-off-by: Borislav Petkov <bp@suse.de>
2014-10-20EDAC: Add DDR3 LRDIMM entries to edac_mem_typesAravind Gopalakrishnan
F15hM60h adds support for DDR4 and DDR3 LRDIMMs. Add them here. Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Link: http://lkml.kernel.org/r/1411070218-10258-1-git-send-email-Aravind.Gopalakrishnan@amd.com [ Boris: improve comments. ] Signed-off-by: Borislav Petkov <bp@suse.de>
2014-10-10Merge tag 'edac/v3.18-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac Pull edac updates from Mauro Carvalho Chehab: "Nothing really exiting here: just one bug fix at sb_edac, and some changes to allow other drivers to use some shared PCI addresses" * tag 'edac/v3.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac: sb_edac: Claim a different PCI device Move Intel SNB device ids from sb_edac to pci_ids.h sb_edac: avoid INTERNAL ERROR message in EDAC with unspecified channel
2014-10-08Merge tag 'drivers-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC driver updates from Arnd Bergmann: "These are changes for drivers that are intimately tied to some SoC and for some reason could not get merged through the respective subsystem maintainer tree. Most of the new code is for the Keystone Navigator driver, which is new base support that is going to be needed for their hardware accelerated network driver and other units. Most of the commits are for moving old code around from at91 and omap for things that are done in device drivers nowadays. - at91: move reset, poweroff, memory and clocksource code into drivers directories - socfpga: add edac driver (through arm-soc, as requested by Boris) - omap: move omap-intc code to drivers/irqchip - sunxi: added an RTC driver for sun6i - omap: mailbox driver related changes - keystone: support for the "Navigator" component - versatile: new reboot, led and soc drivers" * tag 'drivers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (92 commits) bus: arm-ccn: Fix spurious warning message leds: add device tree bindings for register bit LEDs soc: add driver for the ARM RealView power: reset: driver for the Versatile syscon reboot leds: add a driver for syscon-based LEDs drivers/soc: ti: fix build break with modules MAINTAINERS: Add Keystone Multicore Navigator drivers entry soc: ti: add Keystone Navigator DMA support Documentation: dt: soc: add Keystone Navigator DMA bindings soc: ti: add Keystone Navigator QMSS driver Documentation: dt: soc: add Keystone Navigator QMSS bindings rtc: sunxi: Depend on platforms sun4i/sun7i that actually have the rtc rtc: sun6i: Add sun6i RTC driver irqchip: omap-intc: remove unnecessary comments irqchip: omap-intc: correct maximum number or MIR registers irqchip: omap-intc: enable TURBO idle mode irqchip: omap-intc: enable IP protection irqchip: omap-intc: remove unnecesary of_address_to_resource() call irqchip: omap-intc: comment style cleanup irqchip: omap-intc: minor improvement to omap_irq_pending() ...
2014-10-08sb_edac: Claim a different PCI deviceAndy Lutomirski
sb_edac controls a large number of different PCI functions. Rather than registering as a normal PCI driver for all of them, it registers for just one so that it gets probed and, at probe time, it looks for all the others. Coincidentally, the device it registers for also contains the SMBUS registers, so the PCI core will refuse to probe both sb_edac and a future iMC SMBUS driver. The drivers don't actually conflict, so just change sb_edac's device table to probe a different device. An alternative fix would be to merge the two drivers, but sb_edac will also refuse to load on non-ECC systems, whereas i2c_imc would still be useful without ECC. The only user-visible change should be that sb_edac appears to bind a different device. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Cc: Rui Wang <ruiv.wang@gmail.com> Acked-by: Aristeu Rozanski <aris@redhat.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
2014-10-08Move Intel SNB device ids from sb_edac to pci_ids.hAndy Lutomirski
The i2c_imc driver will use two of them, and moving only part of the list seems messier. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Aristeu Rozanski <aris@redhat.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
2014-10-08sb_edac: avoid INTERNAL ERROR message in EDAC with unspecified channelSeth Jennings
Intel IA32 SDM Table 15-14 defines channel 0xf as 'not specified', but EDAC doesn't know about this and returns and INTERNAL ERROR when the channel is greater than NUM_CHANNELS: kernel: [ 1538.886456] CPU 0: Machine Check Exception: 0 Bank 1: 940000000000009f kernel: [ 1538.886669] TSC 2bc68b22e7e812 ADDR 46dae7000 MISC 0 PROCESSOR 0:306e4 TIME 1390414572 SOCKET 0 APIC 0 kernel: [ 1538.971948] EDAC MC1: INTERNAL ERROR: channel value is out of range (15 >= 4) kernel: [ 1538.972203] EDAC MC1: 0 CE memory read error on unknown memory (slot:0 page:0x46dae7 offset:0x0 grain:0 syndrome:0x0 - area:DRAM err_code:0000:009f socket:1 channel_mask:1 rank:0) This commit changes sb_edac to forward a channel of -1 to EDAC if the channel is not specified. edac_mc_handle_error() sets the channel to -1 internally after the error message anyway, so this commit should have no effect other than avoiding the INTERNAL ERROR message when the channel is not specified. Signed-off-by: Seth Jennings <sjenning@redhat.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
2014-09-30mpc85xx_edac: Make L2 interrupt shared tooBorislav Petkov
The other two interrupt handlers in this driver are shared, except this one. When loading the driver, it fails like this. So make the IRQ line shared. Freescale(R) MPC85xx EDAC driver, (C) 2006 Montavista Software mpc85xx_mc_err_probe: No ECC DIMMs discovered EDAC DEVICE0: Giving out device to module MPC85xx_edac controller mpc85xx_l2_err: DEV mpc85xx_l2_err (INTERRUPT) genirq: Flags mismatch irq 16. 00000000 ([EDAC] L2 err) vs. 00000080 ([EDAC] PCI err) mpc85xx_l2_err_probe: Unable to request irq 16 for MPC85xx L2 err remove_proc_entry: removing non-empty directory 'irq/16', leaking at least 'aerdrv' ------------[ cut here ]------------ WARNING: at fs/proc/generic.c:521 Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.17.0-rc5-dirty #1 task: ee058000 ti: ee046000 task.ti: ee046000 NIP: c016c0c4 LR: c016c0c4 CTR: c037b51c REGS: ee047c10 TRAP: 0700 Not tainted (3.17.0-rc5-dirty) MSR: 00029000 <CE,EE,ME> CR: 22008022 XER: 20000000 GPR00: c016c0c4 ee047cc0 ee058000 00000053 00029000 00000000 c037c744 00000003 GPR08: c09aab28 c09aab24 c09aab28 00000156 20008028 00000000 c0002ac8 00000000 GPR16: 00000000 00000000 00000000 00000000 00000000 00000000 00000139 c0950394 GPR24: c09f0000 ee5585b0 ee047d08 c0a10000 ee047d08 ee15f808 00000002 ee03f660 NIP [c016c0c4] remove_proc_entry LR [c016c0c4] remove_proc_entry Call Trace: remove_proc_entry (unreliable) unregister_irq_proc free_desc irq_free_descs mpc85xx_l2_err_probe platform_drv_probe really_probe __driver_attach bus_for_each_dev bus_add_driver driver_register mpc85xx_mc_init do_one_initcall kernel_init_freeable kernel_init ret_from_kernel_thread Instruction dump: ... Reported-and-tested-by: <lpb_098@163.com> Acked-by: Johannes Thumshirn <johannes.thumshirn@men.de> Cc: stable@vger.kernel.org Signed-off-by: Borislav Petkov <bp@suse.de>
2014-09-23amd64_edac: Modify usage of amd64_read_dct_pci_cfg()Aravind Gopalakrishnan
Rationale behind this change: - F2x1xx addresses were stopped from being mapped explicitly to DCT1 from F15h (OR) onwards. They use _dct[0:1] mechanism to access the registers. So we should move away from using address ranges to select DCT for these families. - On newer processors, the address ranges used to indicate DCT1 (0x140, 0x1a0) have different meanings than what is assumed currently. Changes introduced: - amd64_read_dct_pci_cfg() now takes in dct value and uses it for 'selecting the dct' - Update usage of the function. Keep in mind that different families have specific handling requirements - Remove [k8|f10]_read_dct_pci_cfg() as they don't do much different from amd64_read_pci_cfg() - Move the k8 specific check to amd64_read_pci_cfg - Remove f15_read_dct_pci_cfg() and move logic to amd64_read_dct_pci_cfg() - Remove now needless .read_dct_pci_cfg Testing: - Tested on Fam 10h; Fam15h Models: 00h, 30h; Fam16h using 'EDAC_DEBUG' and mce_amd_inj - driver obtains info from F2x registers and caches it in pvt structures correctly - ECC decoding works fine Signed-off-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com> Link: http://lkml.kernel.org/r/1410799058-3149-1-git-send-email-aravind.gopalakrishnan@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>
2014-09-15ppc4xx_edac: Fix build error caused by wrong member accessPranith Kumar
Fix the following error drivers/edac/ppc4xx_edac.c:977:45: error: request for member 'dimm' in something not a structure or union by changing member access to pointer dereference. Signed-off-by: Pranith Kumar <bobby.prani@gmail.com> Link: http://lkml.kernel.org/r/1408482646-22541-1-git-send-email-bobby.prani@gmail.com CC: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Borislav Petkov <bp@suse.de>
2014-09-04edac: altera: Add Altera SDRAM EDAC supportThor Thayer
This patch adds support for the CycloneV and ArriaV SDRAM controllers. Correction and reporting of SBEs, Panic on DBEs. There was a discussion thread on whether this driver should be an mfd driver or just make use of syscon, which is already a mfd. Ultimately, the decision to use a simple syscon interface was reached.[1] [1] https://lkml.org/lkml/2014/7/30/514 [dinguyen] Fixed Kconfig to have EDAC_ALTERA_MC as a tristate to prevent a build failure for allmodconfig. Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Acked-by: Borislav Petkov <bp@suse.de> [dinguyen] cleaned up commit message Signed-off-by: Dinh Nguyen <dinguyen@opensource.altera.com>
2014-09-02EDAC: Fix mem_types strings typeBorislav Petkov
This one got forgotten during an earlier cleanup. Signed-off-by: Borislav Petkov <bp@suse.de>
2014-08-15Merge branch 'linux_next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac Pull EDAC updates from Mauro Carvalho Chehab. * 'linux_next' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac: sb_edac: add support for Haswell based systems sb_edac: Fix mix tab/spaces alignments edac: add DDR4 and RDDR4 sb_edac: remove bogus assumption on mc ordering sb_edac: make minimal use of channel_mask sb_edac: fix socket detection on Ivy Bridge controllers sb_edac: update Kconfig description sb_edac: search devices using product id sb_edac: make RIR limit retrieval per model sb_edac: make node id retrieval per model sb_edac: make memory type detection per memory controller
2014-08-14Merge tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linuxLinus Torvalds
Pull device tree updates from Grant Likely: "The branch contains the following device tree changes the v3.17 merge window: Group changes to the device tree. In preparation for adding device tree overlay support, OF_DYNAMIC is reworked so that a set of device tree changes can be prepared and applied to the tree all at once. OF_RECONFIG notifiers see the most significant change here so that users always get a consistent view of the tree. Notifiers generation is moved from before a change to after it, and notifiers for a group of changes are emitted after the entire block of changes have been applied Automatic console selection from DT. Console drivers can now use of_console_check() to see if the device node is specified as a console device. If so then it gets added as a preferred console. UART devices get this support automatically when uart_add_one_port() is called. DT unit tests no longer depend on pre-loaded data in the device tree. Data is loaded dynamically at the start of unit tests, and then unloaded again when the tests have completed. Also contains a few bugfixes for reserved regions and early memory setup" * tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux: (21 commits) of: Fixing OF Selftest build error drivers: of: add automated assignment of reserved regions to client devices of: Use proper types for checking memory overflow of: typo fix in __of_prop_dup() Adding selftest testdata dynamically into live tree of: Add todo tasklist for Devicetree of: Transactional DT support. of: Reorder device tree changes and notifiers of: Move dynamic node fixups out of powerpc and into common code of: Make sure attached nodes don't carry along extra children of: Make devicetree sysfs update functions consistent. of: Create unlocked versions of node and property add/remove functions OF: Utility helper functions for dynamic nodes of: Move CONFIG_OF_DYNAMIC code into a separate file of: rename of_aliases_mutex to just of_mutex of/platform: Fix of_platform_device_destroy iteration of devices of: Migrate of_find_node_by_name() users to for_each_node_by_name() tty: Update hypervisor tty drivers to use core stdout parsing code. arm/versatile: Add the uart as the stdout device. of: Enable console on serial ports specified by /chosen/stdout-path ...
2014-08-04Merge branch 'x86-ras-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RAS updates from Ingo Molnar: "The main changes in this cycle are: - RAS tracing/events infrastructure, by Gong Chen. - Various generalizations of the APEI code to make it available to non-x86 architectures, by Tomasz Nowicki" * 'x86-ras-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/ras: Fix build warnings in <linux/aer.h> acpi, apei, ghes: Factor out ioremap virtual memory for IRQ and NMI context. acpi, apei, ghes: Make NMI error notification to be GHES architecture extension. apei, mce: Factor out APEI architecture specific MCE calls. RAS, extlog: Adjust init flow trace, eMCA: Add a knob to adjust where to save event log trace, RAS: Add eMCA trace event interface RAS, debugfs: Add debugfs interface for RAS subsystem CPER: Adjust code flow of some functions x86, MCE: Robustify mcheck_init_device trace, AER: Move trace into unified interface trace, RAS: Add basic RAS trace event x86, MCE: Kill CPU_POST_DEAD
2014-07-14EDAC, MCE, AMD: Add MCE decoding for F15h M60hAravind Gopalakrishnan
Add decoding logic for new Fam15h model 60h. Tested using mce_amd_inj module and works fine. Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Link: http://lkml.kernel.org/r/1405098795-4678-1-git-send-email-Aravind.Gopalakrishnan@amd.com [ Boris: simplify a bit. ] Signed-off-by: Borislav Petkov <bp@suse.de>
2014-07-10ie31200_edac: Allocate mci and map mchbar firstJason Baron
Check for memory allocation and mchbar mapping failures before initializing the dimm info tables needlessly. Signed-off-by: Jason Baron <jbaron@akamai.com> Suggested-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/ead8f53e699f1ce21c2e17f3cffb4685d4faf72a.1404939455.git.jbaron@akamai.com Signed-off-by: Borislav Petkov <bp@suse.de>
2014-07-04ie31200_edac: Introduce the driverJason Baron
Add a driver for the E3-1200 series of Intel DRAM controllers, based on the following E3-1200 specs: http://www.intel.com/content/www/us/en/processors/xeon/xeon-e3-1200-family-vol-2-datasheet.html http://www.intel.com/content/www/us/en/processors/xeon/xeon-e3-1200v3-vol-2-datasheet.html I've tested this on bad memory hardware, and observed correlating bad reads and uncorrected memory errors as reported by the driver. Tested against: CPU E3-1270 v3 @ 3.50GHz : 8086:0c08 (haswell) CPU E3-1270 V2 @ 3.50GHz : 8086:0158 (ivy bridge) CPU E31270 @ 3.40GHz : 8086:0108 (sandy bridge) Signed-off-by: Jason Baron <jbaron@akamai.com> Link: http://lkml.kernel.org/r/95c83e80dd40b5377e8bb206285c5d95ac623872.1403818526.git.jbaron@akamai.com [ Boris: realign defines ] Signed-off-by: Borislav Petkov <bp@suse.de>
2014-07-04x38_edac: make use of lo_hi_readq()Jason Baron
Convert to the generic API. Signed-off-by: Jason Baron <jbaron@akamai.com> Link: http://lkml.kernel.org/r/bb9a4cbb980cc7b51be75cbfcf644553bf6a04cd.1403818526.git.jbaron@akamai.com Signed-off-by: Borislav Petkov <bp@suse.de>
2014-06-26sb_edac: add support for Haswell based systemsAristeu Rozanski
Haswell memory controllers are very similar to Ivy Bridge and Sandy Bridge ones. This patch adds support to Haswell based systems. [m.chehab@samsung.com: Fix CodingStyle issues] Cc: Tony Luck <tony.luck@intel.com> Signed-off-by: Aristeu Rozanski <aris@redhat.com> Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
2014-06-26sb_edac: Fix mix tab/spaces alignmentsMauro Carvalho Chehab
We should not have spaces before ^I on alignments. Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
2014-06-26edac: add DDR4 and RDDR4Aristeu Rozanski
Haswell memory controller can make use of DDR4 and Registered DDR4 Cc: tony.luck@intel.com Signed-off-by: Aristeu Rozanski <aris@redhat.com> Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
2014-06-26sb_edac: remove bogus assumption on mc orderingAristeu Rozanski
When a MC is handled, the correct sbridge_dev is searched based on the node, checking again later with the assumption the first memory controller found is the first socket's memory controller is a bogus assumption. Get rid of it. Cc: Tony Luck <tony.luck@intel.com> Signed-off-by: Aristeu Rozanski <aris@redhat.com> Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
2014-06-26sb_edac: make minimal use of channel_maskAristeu Rozanski
channel_mask will be used in the future to determine which group of memory modules is causing the errors since when mirroring, lockstep and close page are enabled you can't. While that doesn't happen, use the channel_mask to determine the channel instead of relying on the MC event/exception. Cc: Tony Luck <tony.luck@intel.com> Signed-off-by: Aristeu Rozanski <aris@redhat.com> Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
2014-06-26sb_edac: fix socket detection on Ivy Bridge controllersAristeu Rozanski
This patch fixes the obvious bug while handling the socket/HA bitmask used in Ivy Bridge memory controllers. Cc: Tony Luck <tony.luck@intel.com> Signed-off-by: Aristeu Rozanski <aris@redhat.com> Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
2014-06-26sb_edac: update Kconfig descriptionAristeu Rozanski
Kconfig wasn't updated when Ivy Bridge support was added. Cc: Tony Luck <tony.luck@intel.com> Signed-off-by: Aristeu Rozanski <aris@redhat.com> Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
2014-06-26sb_edac: search devices using product idAristeu Rozanski
This patch changes the way devices are searched by using product id instead of device/function numbers. Tested in a Sandy Bridge and a Ivy Bridge machine to make sure everything works properly. Cc: Tony Luck <tony.luck@intel.com> Signed-off-by: Aristeu Rozanski <aris@redhat.com> Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
2014-06-26sb_edac: make RIR limit retrieval per modelAristeu Rozanski
Haswell has a different way to retrieve RIR limits, make this procedure per model. Cc: Tony Luck <tony.luck@intel.com> Signed-off-by: Aristeu Rozanski <aris@redhat.com> Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
2014-06-26sb_edac: make node id retrieval per modelAristeu Rozanski
Haswell has a different way to retrieve the node id, make so this procedure can be reimplemented. Cc: Tony Luck <tony.luck@intel.com> Signed-off-by: Aristeu Rozanski <aris@redhat.com> Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
2014-06-26sb_edac: make memory type detection per memory controllerAristeu Rozanski
Haswell has different register, offset to determine memory type and supports DDR4 in some models. This patch makes it easier to have a different method depending on the memory controller type. Cc: Tony Luck <tony.luck@intel.com> Signed-off-by: Aristeu Rozanski <aris@redhat.com> Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
2014-06-26of: Migrate of_find_node_by_name() users to for_each_node_by_name()Grant Likely
There are a bunch of users open coding the for_each_node_by_name() by calling of_find_node_by_name() directly instead of using the macro. This is getting in the way of some cleanups, and the possibility of removing of_find_node_by_name() entirely. Clean it up so that all the users are consistent. Signed-off-by: Grant Likely <grant.likely@linaro.org> Cc: Rob Herring <robh+dt@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Takashi Iwai <tiwai@suse.de>
2014-06-24EDAC, edac_module.c: Remove unnecessary test on unsigned valueFabian Frederick
unsigned long value is never < 0. Cc: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Fabian Frederick <fabf@skynet.be> Link: http://lkml.kernel.org/r/1402341618-10674-1-git-send-email-fabf@skynet.be Signed-off-by: Borislav Petkov <bp@suse.de>
2014-06-23trace, RAS: Add basic RAS trace eventChen, Gong
To avoid confuision and conflict of usage for RAS related trace event, add an unified RAS trace event stub. Start a RAS subsystem menu which will be fleshed out in time, when more features get added to it. Signed-off-by: Chen, Gong <gong.chen@linux.intel.com> Link: http://lkml.kernel.org/r/1402475691-30045-2-git-send-email-gong.chen@linux.intel.com Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Tony Luck <tony.luck@intel.com>