Age | Commit message (Collapse) | Author |
|
One of the arguments during the suspend blockers discussion was that
the mainline kernel didn't contain any mechanisms making it possible
to avoid races between wakeup and system suspend.
Generally, there are two problems in that area. First, if a wakeup
event occurs exactly when /sys/power/state is being written to, it
may be delivered to user space right before the freezer kicks in, so
the user space consumer of the event may not be able to process it
before the system is suspended. Second, if a wakeup event occurs
after user space has been frozen, it is not generally guaranteed that
the ongoing transition of the system into a sleep state will be
aborted.
To address these issues introduce a new global sysfs attribute,
/sys/power/wakeup_count, associated with a running counter of wakeup
events and three helper functions, pm_stay_awake(), pm_relax(), and
pm_wakeup_event(), that may be used by kernel subsystems to control
the behavior of this attribute and to request the PM core to abort
system transitions into a sleep state already in progress.
The /sys/power/wakeup_count file may be read from or written to by
user space. Reads will always succeed (unless interrupted by a
signal) and return the current value of the wakeup events counter.
Writes, however, will only succeed if the written number is equal to
the current value of the wakeup events counter. If a write is
successful, it will cause the kernel to save the current value of the
wakeup events counter and to abort the subsequent system transition
into a sleep state if any wakeup events are reported after the write
has returned.
[The assumption is that before writing to /sys/power/state user space
will first read from /sys/power/wakeup_count. Next, user space
consumers of wakeup events will have a chance to acknowledge or
veto the upcoming system transition to a sleep state. Finally, if
the transition is allowed to proceed, /sys/power/wakeup_count will
be written to and if that succeeds, /sys/power/state will be written
to as well. Still, if any wakeup events are reported to the PM core
by kernel subsystems after that point, the transition will be
aborted.]
Additionally, put a wakeup events counter into struct dev_pm_info and
make these per-device wakeup event counters available via sysfs,
so that it's possible to check the activity of various wakeup event
sources within the kernel.
To illustrate how subsystems can use pm_wakeup_event(), make the
low-level PCI runtime PM wakeup-handling code use it.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Acked-by: markgross <markgross@thegnar.org>
Reviewed-by: Alan Stern <stern@rowland.harvard.edu>
|
|
If we fail to assign resources to a PCI BAR, this patch makes us try the
original address from BIOS rather than leaving it disabled.
Linux tries to make sure all PCI device BARs are inside the upstream
PCI host bridge or P2P bridge apertures, reassigning BARs if necessary.
Windows does similar reassignment.
Before this patch, if we could not move a BAR into an aperture, we left
the resource unassigned, i.e., at address zero. Windows leaves such BARs
at the original BIOS addresses, and this patch makes Linux do the same.
This is a bit ugly because we disable the resource long before we try to
reassign it, so we have to keep track of the BIOS BAR address somewhere.
For lack of a better place, I put it in the struct pci_dev.
I think it would be cleaner to attempt the assignment immediately when the
claim fails, so we could easily remember the original address. But we
currently claim motherboard resources in the middle, after attempting to
claim PCI resources and before assigning new PCI resources, and changing
that is a fairly big job.
Addresses https://bugzilla.kernel.org/show_bug.cgi?id=16263
Reported-by: Andrew <nitr0@seti.kr.ua>
Tested-by: Andrew <nitr0@seti.kr.ua>
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6
* 'virtio' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6:
virtio-pci: disable msi at startup
virtio: return ENOMEM on out of memory
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI/PM: Do not use native PCIe PME by default
|
|
virtio-pci resets the device at startup by writing to the status
register, but this does not clear the pci config space,
specifically msi enable status which affects register
layout.
This breaks things like kdump when they try to use e.g. virtio-blk.
Fix by forcing msi off at startup. Since pci.c already has
a routine to do this, we export and use it instead of duplicating code.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: linux-pci@vger.kernel.org
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: stable@kernel.org
|
|
Commit c7f486567c1d0acd2e4166c47069835b9f75e77b
(PCI PM: PCIe PME root port service driver) causes the native PCIe
PME signaling to be used by default, if the BIOS allows the kernel to
control the standard configuration registers of PCIe root ports.
However, the native PCIe PME is coupled to the native PCIe hotplug
and calling pcie_pme_acpi_setup() makes some BIOSes expect that
the native PCIe hotplug will be used as well. That, in turn, causes
problems to appear on systems where the PCIe hotplug driver is not
loaded. The usual symptom, as reported by Jaroslav Kameník and
others, is that the ACPI GPE associated with PCIe hotplug keeps
firing continuously causing kacpid to take substantial percentage
of CPU time.
To work around this issue, change the default so that the native
PCIe PME signaling is only used if directly requested with the help
of the pcie_pme= command line switch.
Fixes https://bugzilla.kernel.org/show_bug.cgi?id=15924 , which is
a listed regression from 2.6.33.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reported-by: Jaroslav Kameník <jaroslav@kamenik.cz>
Tested-by: Antoni Grzymala <antekgrzymala@gmail.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
|
|
Certain revisions of this chipset appear to be broken. There is a shadow
GTT which mirrors the real GTT but contains pre-translated physical
addresses, for performance reasons. When a GTT update happens, the
translations are done once and the resulting physical addresses written
back to the shadow GTT.
Except sometimes, the physical address is actually written back to the
_real_ GTT, not the shadow GTT. Thus we start to see faults when that
physical address is fed through translation again.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
|
stanse found the following double lock.
In get_domain_for_dev:
spin_lock_irqsave(&device_domain_lock, flags);
domain_exit(domain);
domain_remove_dev_info(domain);
spin_lock_irqsave(&device_domain_lock, flags);
spin_unlock_irqrestore(&device_domain_lock, flags);
spin_unlock_irqrestore(&device_domain_lock, flags);
This happens when the domain is created by another CPU at the same time
as this function is creating one, and the other CPU wins the race to
attach it to the device in question, so we have to destroy our own
newly-created one.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
|
Commit a99c47a2 "intel-iommu: errors with smaller iommu widths" replace the
dmar_domain->pgd with the first entry of page table when iommu's supported
width is smaller than dmar_domain's. But it use physical address directly
for new dmar_domain->pgd...
This result in KVM oops with VT-d on some machines.
Reported-by: Allen Kay <allen.m.kay@intel.com>
Cc: Tom Lyon <pugs@cisco.com>
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI: clear bridge resource range if BIOS assigned bad one
PCI: hotplug/cpqphp, fix NULL dereference
Revert "PCI: create function symlinks in /sys/bus/pci/slots/N/"
PCI: change resource collision messages from KERN_ERR to KERN_INFO
|
|
There are devices out there which are PCI Hot-plug controllers with
compaq PCI IDs, but are not bridges, hence have pdev->subordinate
NULL. But cpqphp expects the pointer to be non-NULL.
Add a check to the probe function to avoid oopses like:
BUG: unable to handle kernel NULL pointer dereference at 00000050
IP: [<f82e3c41>] cpqhpc_probe+0x951/0x1120 [cpqphp]
*pdpt = 0000000033779001 *pde = 0000000000000000
...
The device here was:
00:0b.0 PCI Hot-plug controller [0804]: Compaq Computer Corporation PCI Hotplug Controller [0e11:a0f7] (rev 11)
Subsystem: Compaq Computer Corporation Device [0e11:a2f8]
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
|
|
This reverts commit 75568f8094eb0333e9c2109b23cbc8b82d318a3c.
Since they're just a convenience anyway, remove these symlinks since
they're causing duplicate filename errors in the wild.
Acked-by: Alex Chiang <achiang@canonical.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
|
|
We can often deal with PCI resource issues by moving devices around. In
that case, there's no point in alarming the user with messages like these.
There are many bug reports where the message itself is the only problem,
e.g., https://bugs.launchpad.net/ubuntu/+source/linux/+bug/413419 .
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
|
|
JMB362 is a new variant of jmicron controller which is similar to
JMB360 but has two SATA ports instead of one. As there is no PATA
port, single function AHCI mode can be used as in JMB360. Add pci
quirk for JMB362.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Aries Lee <arieslee@jmicron.com>
Cc: stable@kernel.org
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (27 commits)
ACPI: Don't let acpi_pad needlessly mark TSC unstable
drivers/acpi/sleep.h: Checkpatch cleanup
ACPI: Minor cleanup eliminating redundant PMTIMER_TICKS to NS conversion
ACPI: delete unused c-state promotion/demotion data strucutures
ACPI: video: fix acpi_backlight=video
ACPI: EC: Use kmemdup
drivers/acpi: use kasprintf
ACPI, APEI, EINJ injection parameters support
Add x64 support to debugfs
ACPI, APEI, Use ERST for persistent storage of MCE
ACPI, APEI, Error Record Serialization Table (ERST) support
ACPI, APEI, Generic Hardware Error Source memory error support
ACPI, APEI, UEFI Common Platform Error Record (CPER) header
Unified UUID/GUID definition
ACPI Hardware Error Device (PNP0C33) support
ACPI, APEI, PCIE AER, use general HEST table parsing in AER firmware_first setup
ACPI, APEI, Document for APEI
ACPI, APEI, EINJ support
ACPI, APEI, HEST table parsing
ACPI, APEI, APEI supporting infrastructure
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (36 commits)
PCI: hotplug: pciehp: Removed check for hotplug of display devices
PCI: read memory ranges out of Broadcom CNB20LE host bridge
PCI: Allow manual resource allocation for PCI hotplug bridges
x86/PCI: make ACPI MCFG reserved error messages ACPI specific
PCI hotplug: Use kmemdup
PM/PCI: Update PCI power management documentation
PCI: output FW warning in pci_read/write_vpd
PCI: fix typos pci_device_dis/enable to pci_dis/enable_device in comments
PCI quirks: disable msi on AMD rs4xx internal gfx bridges
PCI: Disable MSI for MCP55 on P5N32-E SLI
x86/PCI: irq and pci_ids patch for additional Intel Cougar Point DeviceIDs
PCI: aerdrv: trivial cleanup for aerdrv_core.c
PCI: aerdrv: trivial cleanup for aerdrv.c
PCI: aerdrv: introduce default_downstream_reset_link
PCI: aerdrv: rework find_aer_service
PCI: aerdrv: remove is_downstream
PCI: aerdrv: remove magical ROOT_ERR_STATUS_MASKS
PCI: aerdrv: redefine PCI_ERR_ROOT_*_SRC
PCI: aerdrv: rework do_recovery
PCI: aerdrv: rework get_e_source()
...
|
|
* git://git.infradead.org/iommu-2.6:
intel-iommu: Set a more specific taint flag for invalid BIOS DMAR tables
intel-iommu: Combine the BIOS DMAR table warning messages
panic: Add taint flag TAINT_FIRMWARE_WORKAROUND ('I')
panic: Allow warnings to set different taint flags
intel-iommu: intel_iommu_map_range failed at very end of address space
intel-iommu: errors with smaller iommu widths
intel-iommu: Fix boot inside 64bit virtualbox with io-apic disabled
intel-iommu: use physfn to search drhd for VF
intel-iommu: Print out iommu seq_id
intel-iommu: Don't complain that ACPI_DMAR_SCOPE_TYPE_IOAPIC is not supported
intel-iommu: Avoid global flushes with caching mode.
intel-iommu: Use correct domain ID when caching mode is enabled
intel-iommu mistakenly uses offset_pfn when caching mode is enabled
intel-iommu: use for_each_set_bit()
intel-iommu: Fix section mismatch dmar_ir_support() uses dmar_tbl.
|
|
Removed check to prevent hotplug of display devices within pciehp.
Originally this was thought to have been required within the PCI
Hotplug specification for some legacy devices. However there is
no such requirement in the most recent revision. The check prevents
hotplug of not only display devices but also computational GPUs
which require serviceability.
Signed-off-by: Praveen Kalamegham <praveen@nextio.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
|
|
At the moment only PCI-E briges can be flagged as hotplug, thus
allowing manual resource preallocation via pci=hpmemsize=nnM and
pci=hpiosize=nnM kernel parameters. Some PCI hotplug bridges, e.g.
PLX 6254 can also benefit from this functionalily, as kernel fails
to properly allocate their resources when hotplug device is added
and PCI bus is rescanned.
This patch adds header quirk for PLX 6254 that marks this bridge
as hotplug. Other PCI bridges with similar problems can use it
as well.
Signed-off-by: Felix Radensky <felix@embedded-sol.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
|
|
The PCI config space bin_attr read handler has a hardcoded CAP_SYS_ADMIN
check to verify privileges before allowing a user to read device
dependent config space. This is meant to protect from an unprivileged
user potentially locking up the box.
When assigning a PCI device directly to a guest with libvirt and KVM,
the sysfs config space file is chown'd to the unprivileged user that
the KVM guest will run as. The guest needs to have full access to the
device's config space since it's responsible for driving the device.
However, despite being the owner of the sysfs file, the CAP_SYS_ADMIN
check will not allow read access beyond the config header.
With this patch we check privileges against the capabilities used when
openining the sysfs file. The allows a privileged process to open the
file and hand it to an unprivileged process, and the unprivileged process
can still read all of the config space.
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
This allows bin_attr->read,write,mmap callbacks to check file specific data
(such as inode owner) as part of any privilege validation.
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (44 commits)
vlynq: make whole Kconfig-menu dependant on architecture
add descriptive comment for TIF_MEMDIE task flag declaration.
EEPROM: max6875: Header file cleanup
EEPROM: 93cx6: Header file cleanup
EEPROM: Header file cleanup
agp: use NULL instead of 0 when pointer is needed
rtc-v3020: make bitfield unsigned
PCI: make bitfield unsigned
jbd2: use NULL instead of 0 when pointer is needed
cciss: fix shadows sparse warning
doc: inode uses a mutex instead of a semaphore.
uml: i386: Avoid redefinition of NR_syscalls
fix "seperate" typos in comments
cocbalt_lcdfb: correct sections
doc: Change urls for sparse
Powerpc: wii: Fix typo in comment
i2o: cleanup some exit paths
Documentation/: it's -> its where appropriate
UML: Fix compiler warning due to missing task_struct declaration
UML: add kernel.h include to signal.c
...
|
|
Now, a dedicated HEST tabling parsing code is used for PCIE AER
firmware_first setup. It is rebased on general HEST tabling parsing
code of APEI. The firmware_first setup code is moved from PCI core to
AER driver too, because it is only AER related.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
We now know how to deal with these tables so that they are harmless.
Set TAINT_FIRMWARE_WORKAROUND instead of the default TAINT_WARN.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
|
We have nearly the same code for warnings repeated four times. Move
it into a separate function.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
|
Use kmemdup when some other buffer is immediately copied into the
allocated region.
A simplified version of the semantic patch that makes this change is as
follows: (http://coccinelle.lip6.fr/)
// <smpl>
@@
expression from,to,size,flag;
statement S;
@@
- to = \(kmalloc\|kzalloc\)(size,flag);
+ to = kmemdup(from,size,flag);
if (to==NULL || ...) S
- memcpy(to, from, size);
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
|
|
pci_read/write_vpd() can fail due to a timeout. Usually the command
times out because of firmware issues (incorrect vpd length, etc.) on the
PCI card. Currently, the timeout occurs silently.
Output a message to the user indicating that they should check with
their vendor for new firmware.
Reviewed-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
|
|
This fixes all occurrences of pci_enable_device and pci_disable_device
in all comments. There are no code changes involved.
Signed-off-by: Roman Fietze <roman.fietze@telemotive.de>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
|
|
Doesn't work reliably for internal gfx. Fixes kernel bug
https://bugzilla.kernel.org/show_bug.cgi?id=15626.
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Cc: Stable <stable@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
|
|
As reported in <http://bugs.debian.org/552299>, MSI appears to be
broken for this on-board device. We already have a quirk for the
P5N32-SLI Premium; extend it to cover both variants of the board.
Reported-by: Romain DEGEZ <romain.degez@smartjog.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: stable@kernel.org
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86/amd-iommu: Add amd_iommu=off command line option
iommu-api: Remove iommu_{un}map_range functions
x86/amd-iommu: Implement ->{un}map callbacks for iommu-api
x86/amd-iommu: Make amd_iommu_iova_to_phys aware of multiple page sizes
x86/amd-iommu: Make iommu_unmap_page and fetch_pte aware of page sizes
x86/amd-iommu: Make iommu_map_page and alloc_pte aware of page sizes
kvm: Change kvm_iommu_map_pages to map large pages
VT-d: Change {un}map_range functions to implement {un}map interface
iommu-api: Add ->{un}map callbacks to iommu_ops
iommu-api: Add iommu_map and iommu_unmap functions
iommu-api: Rename ->{un}map function pointers to ->{un}map_range
|
|
intel_iommu_map_range() doesn't allow allocation at the very end of the
address space; that code has been simplified and corrected.
Signed-off-by: Tom Lyon <pugs@cisco.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
|
When using iommu_domain_alloc with the Intel iommu, the domain address
width is always initialized to 48 bits (agaw 2). This domain->agaw value
is then used by pfn_to_dma_pte to (always) build a 4 level page table.
However, not all systems support iommu width of 48 or 4 level page tables.
In particular, the Core i5-660 and i5-670 support an address width of 36
bits (not 39!), an agaw of only 1, and only 3 level page tables.
This version of the patch simply lops off extra levels of the page tables
if the agaw value of the iommu is less than what is currently allocated
for the domain (in intel_iommu_attach_device). If there were already
allocated addresses above what the new iommu can handle, EFAULT is
returned.
Signed-off-by: Tom Lyon <pugs@cisco.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
|
unssigned"
This reverts commit 977d17bb1749517b353874ccdc9b85abc7a58c2a, because it
can cause problems with some devices not getting any resources at all
when the resource tree is re-allocated.
For an example of this, see
https://bugzilla.kernel.org/show_bug.cgi?id=15960
(originally https://bugtrack.alsa-project.org/alsa-bug/view.php?id=4982)
(lkml thread: http://lkml.org/lkml/2010/4/19/20)
where Peter Henriksson reported his Xonar DX sound card gone, because
the IO port region was no longer allocated.
Reported-bisected-and-tested-by: Peter Henriksson <peter.henriksson@gmail.com>
Requested-by: Andrew Morton <akpm@linux-foundation.org>
Requested-by: Clemens Ladisch <clemens@ladisch.de>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Style cleanup for pci_{en,dis}able_pcie_error_reporting().
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
|
|
Skip zero-ing in aer_alloc_rpc() since it is allocated by kzalloc().
The closing comment marker "*/" is recommended for kernel-doc comments.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
|
|
I noticed that when I inject a fatal error to an endpoint via
aer-inject, aer_root_reset() is called as reset_link for a
downstream port at upstream of the endpoint:
pcieport 0000:00:06.0: AER: Uncorrected (Fatal) error received: id=5401
:
pcieport 0000:52:02.0: Root Port link has been reset
It externally appears to be working, but internally issues some
accesses to PCI_ERR_ROOT_COMMAND/STATUS registers that is for
root port so not available on downstream port.
This patch introduces default_downstream_reset_link that is
a version of aer_root_reset() with no accesses to root port's
register. It is used for downstream ports that has no reset_link
function its specific.
This patch also updates related description in pcieaer-howto.txt.
Some minor fixes are included.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
|
|
The structure find_aer_service_data is no longer useful.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Reviewed-by: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
|
|
The pcie->port of port service device points the port associated
the service with. The find_aer_service iterates over children of
given port udev.
So it is clear that the pcie->port of port service of given port
udev must always point the udev.
Therefore we can know the type of udev without checking its children.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
|
|
Make it clear that we only interest in 2 *_RCV bits.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
|
|
The Error Source Identification Register (Offset 34h) is 4 byte
which contains a couple of 2 byte field, "[15:0] ERR_COR Source
Identification" and "[31:16] ERR_FATAL/NONFATAL Source Identification."
This patch defines PCI_ERR_ROOT_ERR_SRC to make dword access sensible.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
|
|
Move dev_printks for debug into do_recovery().
This allows do_recovery() to return void.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
|
|
Current get_e_source() returns pointer to an element of array.
However since it also progress consume counter, it is possible
that the element is overwritten by newly produced data before
the element is really consumed.
This patch changes get_e_source() to copy contents of the element
to address pointed by its caller. Once copied the element in
array can be consumed.
And relocate this function to more innocuous place.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
|
|
Divide tricky for-loop into readable if-blocks.
The logic to set multi_error_valid (to force walking pci bus
hierarchy to find 2nd~ error devices) is changed too, to check
MULTI_{,_UN}COR_RCV bit individually and to force walk only when
it is required.
And rework setting e_info->severity for uncorrectable, not to use
magic numbers.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
|
|
Stop iteration if we cannot register any more.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
|
|
Inline too-simple subroutine only used here.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
|
|
Take core part of find_device_iter() to make a new function
is_error_source() that checks given device has report an error
or not.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
|
|
Return bool to indicate that the source device is found or not.
This allows us to skip calling aer_process_err_devices() if we can.
And move dev_printk for debug into this function.
v2: return bool instead of int
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
|
|
These functions are only called from init/remove path of aerdrv,
so move them from aerdrv_core.c to aerdrv.c, to make them static.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
|
|
This cleanup solves some minor naming issues by removing unuseful
function aer_delete_rootport() and by renaming disable_root_aer()
to aer_disable_rootport().
- Inconsistent location of alloc & free:
The struct rpc is allocated in aer_alloc_rpc() at aerdrv.c
while it is implicitly freed in aer_delete_rootport() at
aerdrv_core.c.
- Inconsistent function name:
It makes a bit confusion that aer_delete_rootport() is seemed
to be paired with aer_enable_rootport(), i.e. there is neither
"add" against "delete" nor "disable" against "enable".
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
|