Age | Commit message (Collapse) | Author |
|
commit a66307d473077b7aeba74e9b09c841ab3d399c2d upstream.
The ASMedia 1092 has a configuration mode which will present a
dummy device; sadly the implementation falsely claims to provide
a device with 100M which doesn't actually exist.
So disable this device to avoid errors during boot.
Cc: stable@vger.kernel.org
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 6f48394cf1f3e8486591ad98c11cdadb8f1ef2ad upstream.
Trying to remove the fsl-sata module in the PPC64 GNU/Linux
leads to the following warning:
------------[ cut here ]------------
remove_proc_entry: removing non-empty directory 'irq/69',
leaking at least 'fsl-sata[ff0221000.sata]'
WARNING: CPU: 3 PID: 1048 at fs/proc/generic.c:722
.remove_proc_entry+0x20c/0x220
IRQMASK: 0
NIP [c00000000033826c] .remove_proc_entry+0x20c/0x220
LR [c000000000338268] .remove_proc_entry+0x208/0x220
Call Trace:
.remove_proc_entry+0x208/0x220 (unreliable)
.unregister_irq_proc+0x104/0x140
.free_desc+0x44/0xb0
.irq_free_descs+0x9c/0xf0
.irq_dispose_mapping+0x64/0xa0
.sata_fsl_remove+0x58/0xa0 [sata_fsl]
.platform_drv_remove+0x40/0x90
.device_release_driver_internal+0x160/0x2c0
.driver_detach+0x64/0xd0
.bus_remove_driver+0x70/0xf0
.driver_unregister+0x38/0x80
.platform_driver_unregister+0x14/0x30
.fsl_sata_driver_exit+0x18/0xa20 [sata_fsl]
---[ end trace 0ea876d4076908f5 ]---
The driver creates the mapping by calling irq_of_parse_and_map(),
so it also has to dispose the mapping. But the easy way out is to
simply use platform_get_irq() instead of irq_of_parse_map(). Also
we should adapt return value checking and propagate error values.
In this case the mapping is not managed by the device but by
the of core, so the device has not to dispose the mapping.
Fixes: faf0b2e5afe7 ("drivers/ata: add support to Freescale 3.0Gbps SATA Controller")
Cc: stable@vger.kernel.org
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@gmail.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 6c8ad7e8cf29eb55836e7a0215f967746ab2b504 upstream.
When the `rmmod sata_fsl.ko` command is executed in the PPC64 GNU/Linux,
a bug is reported:
==================================================================
BUG: Unable to handle kernel data access on read at 0x80000800805b502c
Oops: Kernel access of bad area, sig: 11 [#1]
NIP [c0000000000388a4] .ioread32+0x4/0x20
LR [80000000000c6034] .sata_fsl_port_stop+0x44/0xe0 [sata_fsl]
Call Trace:
.free_irq+0x1c/0x4e0 (unreliable)
.ata_host_stop+0x74/0xd0 [libata]
.release_nodes+0x330/0x3f0
.device_release_driver_internal+0x178/0x2c0
.driver_detach+0x64/0xd0
.bus_remove_driver+0x70/0xf0
.driver_unregister+0x38/0x80
.platform_driver_unregister+0x14/0x30
.fsl_sata_driver_exit+0x18/0xa20 [sata_fsl]
.__se_sys_delete_module+0x1ec/0x2d0
.system_call_exception+0xfc/0x1f0
system_call_common+0xf8/0x200
==================================================================
The triggering of the BUG is shown in the following stack:
driver_detach
device_release_driver_internal
__device_release_driver
drv->remove(dev) --> platform_drv_remove/platform_remove
drv->remove(dev) --> sata_fsl_remove
iounmap(host_priv->hcr_base); <---- unmap
kfree(host_priv); <---- free
devres_release_all
release_nodes
dr->node.release(dev, dr->data) --> ata_host_stop
ap->ops->port_stop(ap) --> sata_fsl_port_stop
ioread32(hcr_base + HCONTROL) <---- UAF
host->ops->host_stop(host)
The iounmap(host_priv->hcr_base) and kfree(host_priv) functions should
not be executed in drv->remove. These functions should be executed in
host_stop after port_stop. Therefore, we move these functions to the
new function sata_fsl_host_stop and bind the new function to host_stop.
Fixes: faf0b2e5afe7 ("drivers/ata: add support to Freescale 3.0Gbps SATA Controller")
Cc: stable@vger.kernel.org
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@gmail.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 68dbbe7d5b4fde736d104cbbc9a2fce875562012 upstream.
Some ATA drives are very slow to respond to READ_LOG_EXT and
READ_LOG_DMA_EXT commands issued from ata_dev_configure() when the
device is revalidated right after resuming a system or inserting the
ATA adapter driver (e.g. ahci). The default 5s timeout
(ATA_EH_CMD_DFL_TIMEOUT) used for these commands is too short, causing
errors during the device configuration. Ex:
...
ata9: SATA max UDMA/133 abar m524288@0x9d200000 port 0x9d200400 irq 209
ata9: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
ata9.00: ATA-9: XXX XXXXXXXXXXXXXXX, XXXXXXXX, max UDMA/133
ata9.00: qc timeout (cmd 0x2f)
ata9.00: Read log page 0x00 failed, Emask 0x4
ata9.00: Read log page 0x00 failed, Emask 0x40
ata9.00: NCQ Send/Recv Log not supported
ata9.00: Read log page 0x08 failed, Emask 0x40
ata9.00: 27344764928 sectors, multi 16: LBA48 NCQ (depth 32), AA
ata9.00: Read log page 0x00 failed, Emask 0x40
ata9.00: ATA Identify Device Log not supported
ata9.00: failed to set xfermode (err_mask=0x40)
ata9: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
ata9.00: configured for UDMA/133
...
The timeout error causes a soft reset of the drive link, followed in
most cases by a successful revalidation as that give enough time to the
drive to become fully ready to quickly process the read log commands.
However, in some cases, this also fails resulting in the device being
dropped.
Fix this by using adding the ata_eh_revalidate_timeouts entries for the
READ_LOG_EXT and READ_LOG_DMA_EXT commands. This defines a timeout
increased to 15s, retriable one time.
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit a0023bb9dd9bc439d44604eeec62426a990054cd upstream.
mv_init_host() propagates the value returned by mv_chip_id() which in turn
gets propagated by mv_pci_init_one() and hits local_pci_probe().
During the process of driver probing, the probe function should return < 0
for failure, otherwise, the kernel will treat value > 0 as success.
Since this is a bug rather than a recoverable runtime error we should
use dev_alert() instead of dev_err().
Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 013923477cb311293df9079332cf8b806ed0e6f2 upstream.
The last byte of "pad" is used without being initialized.
Fixes: 55dba3120fbc ("libata: update ->data_xfer hook for ATAPI")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 7a8526a5cd51cf5f070310c6c37dd7293334ac49 upstream.
Many users are reporting that the Samsung 860 and 870 SSD are having
various issues when combined with AMD/ATI (vendor ID 0x1002) SATA
controllers and only completely disabling NCQ helps to avoid these
issues.
Always disabling NCQ for Samsung 860/870 SSDs regardless of the host
SATA adapter vendor will cause I/O performance degradation with well
behaved adapters. To limit the performance impact to ATI adapters,
introduce the ATA_HORKAGE_NO_NCQ_ON_ATI flag to force disable NCQ
only for these adapters.
Also, two libata.force parameters (noncqati and ncqati) are introduced
to disable and enable the NCQ for the system which equipped with ATI
SATA adapter and Samsung 860 and 870 SSDs. The user can determine NCQ
function to be enabled or disabled according to the demand.
After verifying the chipset from the user reports, the issue appears
on AMD/ATI SB7x0/SB8x0/SB9x0 SATA Controllers and does not appear on
recent AMD SATA adapters. The vendor ID of ATI should be 0x1002.
Therefore, ATA_HORKAGE_NO_NCQ_ON_AMD was modified to
ATA_HORKAGE_NO_NCQ_ON_ATI.
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=201693
Signed-off-by: Kate Hsuan <hpa@redhat.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20210903094411.58749-1-hpa@redhat.com
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Cc: Krzysztof Olędzki <ole@ans.pl>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 8a6430ab9c9c87cb64c512e505e8690bbaee190b upstream.
Commit ca6bfcb2f6d9 ("libata: Enable queued TRIM for Samsung SSD 860")
limited the existing ATA_HORKAGE_NO_NCQ_TRIM quirk from "Samsung SSD 8*",
covering all Samsung 800 series SSDs, to only apply to "Samsung SSD 840*"
and "Samsung SSD 850*" series based on information from Samsung.
But there is a large number of users which is still reporting issues
with the Samsung 860 and 870 SSDs combined with Intel, ASmedia or
Marvell SATA controllers and all reporters also report these problems
going away when disabling queued trims.
Note that with AMD SATA controllers users are reporting even worse
issues and only completely disabling NCQ helps there, this will be
addressed in a separate patch.
Fixes: ca6bfcb2f6d9 ("libata: Enable queued TRIM for Samsung SSD 860")
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=203475
Cc: stable@vger.kernel.org
Cc: Kate Hsuan <hpa@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20210823095220.30157-1-hdegoede@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 355a8031dc174450ccad2a61c513ad7222d87a97 ]
The loop on entry of ata_host_start() may not initialize host->ops to a
non NULL value. The test on the host_stop field of host->ops must then
be preceded by a check that host->ops is not NULL.
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20210816014456.2191776-3-damien.lemoal@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit f6bca4d91b2ea052e917cca3f9d866b5cc1d500a upstream.
DIPM is unsupported or broken on sunxi. Trying to enable the power
management policy med_power_with_dipm on an Allwinner A20 SoC based board
leads to immediate I/O errors and the attached SATA disk disappears from
the /dev filesystem. A reset (power cycle) is required to make the SATA
controller or disk work again. The A10 and A20 SoC data sheets and manuals
don't mention DIPM at all [1], so it's fair to assume that it's simply not
supported. But even if it was, it should be considered broken and best be
disabled in the ahci_sunxi driver.
[1] https://github.com/allwinner-zh/documents/tree/master/
Fixes: c5754b5220f0 ("ARM: sunxi: Add support for Allwinner SUNXi SoCs sata to ahci_platform")
Cc: stable@vger.kernel.org
Signed-off-by: Timo Sigurdsson <public_timo.s@silentcreek.de>
Tested-by: Timo Sigurdsson <public_timo.s@silentcreek.de>
Link: https://lore.kernel.org/r/20210614072539.3307-1-public_timo.s@silentcreek.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 5c8121262484d99bffb598f39a0df445cecd8efb ]
The driver overrides the error codes returned by platform_get_irq() to
-ENXIO, so if it returns -EPROBE_DEFER, the driver would fail the probe
permanently instead of the deferred probing. Propagate the error code
upstream, as it should have been done from the start...
Fixes: 2fff27512600 ("PATA host controller driver for ep93xx")
Signed-off-by: Sergey Shtylyov <s.shtylyov@omprussia.ru>
Link: https://lore.kernel.org/r/509fda88-2e0d-2cc7-f411-695d7e94b136@omprussia.ru
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit bfc1f378c8953e68ccdbfe0a8c20748427488b80 ]
Iff platform_get_irq() fails (or returns IRQ0) and thus the polling mode
has to be used, ata_host_activate() hits the WARN_ON() due to 'irq_handler'
parameter being non-NULL if the polling mode is selected. Let's only set
the pointer to the driver's IRQ handler if platform_get_irq() returns a
valid IRQ # -- this should avoid the unnecessary WARN_ON()...
Fixes: 43f01da0f279 ("MIPS/OCTEON/ata: Convert pata_octeon_cf.c to use device tree.")
Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Link: https://lore.kernel.org/r/3a241167-f84d-1d25-5b9b-be910afbe666@omp.ru
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 2d3a62fbae8e5badc2342388f65ab2191c209cc0 ]
The driver overrides the error codes returned by platform_get_irq() to
-ENOENT, so if it returns -EPROBE_DEFER, the driver would fail the probe
permanently instead of the deferred probing. Switch to propagating the
error code upstream, still checking/overriding IRQ0 as libata regards it
as "no IRQ" (thus polling) anyway...
Fixes: 9ec36cafe43b ("of/irq: do irq resolution in platform_get_irq")
Signed-off-by: Sergey Shtylyov <s.shtylyov@omprussia.ru>
Link: https://lore.kernel.org/r/771ced55-3efb-21f5-f21c-b99920aae611@omprussia.ru
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 4a24efa16e7db02306fb5db84518bb0a7ada5a46 ]
The driver overrides the error codes returned by platform_get_irq() to
-EINVAL, so if it returns -EPROBE_DEFER, the driver would fail the probe
permanently instead of the deferred probing. Switch to propagating the
error code upstream, still checking/overriding IRQ0 as libata regards it
as "no IRQ" (thus polling) anyway...
Fixes: 9ec36cafe43b ("of/irq: do irq resolution in platform_get_irq")
Signed-off-by: Sergey Shtylyov <s.shtylyov@omprussia.ru>
Link: https://lore.kernel.org/r/105b456d-1199-f6e9-ceb7-ffc5ba551d1a@omprussia.ru
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit b30d0040f06159de97ad9c0b1536f47250719d7d ]
Iff platform_get_irq() returns 0, ahci_platform_init_host() would return 0
early (as if the call was successful). Override IRQ0 with -EINVAL instead
as the 'libata' regards 0 as "no IRQ" (thus polling) anyway...
Fixes: c034640a32f8 ("ata: libahci: properly propagate return value of platform_get_irq()")
Signed-off-by: Sergey Shtylyov <s.shtylyov@omprussia.ru>
Link: https://lore.kernel.org/r/4448c8cc-331f-2915-0e17-38ea34e251c8@omprussia.ru
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit e6471a65fdd5efbb8dd2732dd0f063f960685ceb ]
The function mv_platform_probe() neglects to check the results of the
calls to platform_get_irq() and irq_of_parse_and_map() and blithely
passes them to ata_host_activate() -- while the latter only checks
for IRQ0 (treating it as a polling mode indicattion) and passes the
negative values to devm_request_irq() causing it to fail as it takes
unsigned values for the IRQ #...
Add to mv_platform_probe() the proper IRQ checks to pass the positive IRQ
#s to ata_host_activate(), propagate upstream the negative error codes,
and override the IRQ0 with -EINVAL (as we don't want the polling mode).
Fixes: f351b2d638c3 ("sata_mv: Support SoC controllers")
Signed-off-by: Sergey Shtylyov <s.shtylyov@omprussia.ru>
Link: https://lore.kernel.org/r/51436f00-27a1-e20b-c21b-0e817e0a7c86@omprussia.ru
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit e379b40cc0f179403ce0b82b7e539f635a568da5 ]
The driver's probe() method is written as if platform_get_irq() returns 0
on error, while actually it returns a negative error code (with all the
other values considered valid IRQs). Rewrite the driver's IRQ checking
code to pass the positive IRQ #s to ata_host_activate(), propagate errors
upstream, and treat IRQ0 as error, returning -EINVAL, as the libata code
treats 0 as an indication that polling should be used anyway...
Fixes: 0df0d0a0ea9f ("[libata] ARM: add ixp4xx PATA driver")
Signed-off-by: Sergey Shtylyov <s.shtylyov@omprussia.ru>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit c7e8f404d56b99c80990b19a402c3f640d74be05 ]
The driver's probe() method is written as if platform_get_irq() returns 0
on error, while actually it returns a negative error code (with all the
other values considered valid IRQs). Rewrite the driver's IRQ checking code
to pass the positive IRQ #s to ata_host_activate(), propagate upstream
-EPROBE_DEFER, and set up the driver to polling mode on (negative) errors
and IRQ0 (libata treats IRQ #0 as a polling mode anyway)...
Fixes: a480167b23ef ("pata_arasan_cf: Adding support for arasan compact flash host controller")
Signed-off-by: Sergey Shtylyov <s.shtylyov@omprussia.ru>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit df9c590986fdb6db9d5636d6cd93bc919c01b451 upstream.
Before commit 9495b7e92f716ab2 ("driver core: platform: Initialize
dma_parms for platform devices"), the R-Car SATA device didn't have DMA
parameters. Hence the DMA boundary mask supplied by its driver was
silently ignored, as __scsi_init_queue() doesn't check the return value
of dma_set_seg_boundary(), and the default value of 0xffffffff was used.
Now the device has gained DMA parameters, the driver-supplied value is
used, and the following warning is printed on Salvator-XS:
DMA-API: sata_rcar ee300000.sata: mapping sg segment across boundary [start=0x00000000ffffe000] [end=0x00000000ffffefff] [boundary=0x000000001ffffffe]
WARNING: CPU: 5 PID: 38 at kernel/dma/debug.c:1233 debug_dma_map_sg+0x298/0x300
(the range of start/end values depend on whether IOMMU support is
enabled or not)
The issue here is that SATA_RCAR_DMA_BOUNDARY doesn't have bit 0 set, so
any typical end value, which is odd, will trigger the check.
Fix this by increasing the DMA boundary value by 1.
This also fixes the following WRITE DMA EXT timeout issue:
# dd if=/dev/urandom of=/mnt/de1/file1-1024M bs=1M count=1024
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
ata1.00: failed command: WRITE DMA EXT
ata1.00: cmd 35/00:00:00:e6:0c/00:0a:00:00:00/e0 tag 0 dma 1310720 out
res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata1.00: status: { DRDY }
as seen by Shimoda-san since commit 429120f3df2dba2b ("block: fix
splitting segments on boundary masks").
Fixes: 8bfbeed58665dbbf ("sata_rcar: correct 'sata_rcar_sht'")
Fixes: 9495b7e92f716ab2 ("driver core: platform: Initialize dma_parms for platform devices")
Fixes: 429120f3df2dba2b ("block: fix splitting segments on boundary masks")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Tested-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit e9f691d899188679746eeb96e6cb520459eda9b4 upstream.
There are several reports that the BUG_ON on unsupported command in
mv_qc_prep can be triggered under some circumstances:
https://bugzilla.suse.com/show_bug.cgi?id=1110252
https://serverfault.com/questions/888897/raid-problems-after-power-outage
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1652185
https://bugs.centos.org/view.php?id=14998
Let sata_mv handle the failure gracefully: warn about that incl. the
failed command number and return an AC_ERR_INVALID error. We can do that
now thanks to the previous patch.
Remove also the long-standing FIXME.
[v2] use %.2x as commands are defined as hexa.
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: linux-ide@vger.kernel.org
Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 95364f36701e62dd50eee91e1303187fd1a9f567 upstream.
In case a driver wants to return an error from qc_prep, return enum
ata_completion_errors. sata_mv is one of those drivers -- see the next
patch. Other drivers return the newly defined AC_ERR_OK.
[v2] use enum ata_completion_errors and AC_ERR_OK.
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: linux-ide@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit b5292111de9bb70cba3489075970889765302136 ]
Commit 130f4caf145c ("libata: Ensure ata_port probe has completed before
detach") may cause system freeze during suspend.
Using async_synchronize_full() in PM callbacks is wrong, since async
callbacks that are already scheduled may wait for not-yet-scheduled
callbacks, causes a circular dependency.
Instead of using big hammer like async_synchronize_full(), use async
cookie to make sure port probe are synced, without affecting other
scheduled PM callbacks.
Fixes: 130f4caf145c ("libata: Ensure ata_port probe has completed before detach")
Suggested-by: John Garry <john.garry@huawei.com>
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: John Garry <john.garry@huawei.com>
BugLink: https://bugs.launchpad.net/bugs/1867983
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit 55e610cdd28c0ad3dce0652030c0296d549673f3 upstream.
This lock is already taken in ata_scsi_queuecmd() a few levels up the
call stack so attempting to take it here is an error. Moreover, it is
pointless in the first place since it only protects a single, atomic
assignment.
Enabling lock debugging gives the following output:
=============================================
[ INFO: possible recursive locking detected ]
4.4.0-rc5+ #189 Not tainted
---------------------------------------------
kworker/u2:3/37 is trying to acquire lock:
(&(&host->lock)->rlock){-.-...}, at: [<90283294>] sata_dwc_exec_command_by_tag.constprop.14+0x44/0x8c
but task is already holding lock:
(&(&host->lock)->rlock){-.-...}, at: [<902761ac>] ata_scsi_queuecmd+0x2c/0x330
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&(&host->lock)->rlock);
lock(&(&host->lock)->rlock);
*** DEADLOCK ***
May be due to missing lock nesting notation
4 locks held by kworker/u2:3/37:
#0: ("events_unbound"){.+.+.+}, at: [<9003a0a4>] process_one_work+0x12c/0x430
#1: ((&entry->work)){+.+.+.}, at: [<9003a0a4>] process_one_work+0x12c/0x430
#2: (&bdev->bd_mutex){+.+.+.}, at: [<9011fd54>] __blkdev_get+0x50/0x380
#3: (&(&host->lock)->rlock){-.-...}, at: [<902761ac>] ata_scsi_queuecmd+0x2c/0x330
stack backtrace:
CPU: 0 PID: 37 Comm: kworker/u2:3 Not tainted 4.4.0-rc5+ #189
Workqueue: events_unbound async_run_entry_fn
Stack : 90b38e30 00000021 00000003 9b2a6040 00000000 9005f3f0 904fc8dc 00000025
906b96e4 00000000 90528648 9b3336c4 904fc8dc 9009bf18 00000002 00000004
00000000 00000000 9b3336c4 9b3336e4 904fc8dc 9003d074 00000000 90500000
9005e738 00000000 00000000 00000000 00000000 00000000 00000000 00000000
6e657665 755f7374 756f626e 0000646e 00000000 00000000 9b00ca00 9b025000
...
Call Trace:
[<90009d6c>] show_stack+0x88/0xa4
[<90057744>] __lock_acquire+0x1ce8/0x2154
[<900583e4>] lock_acquire+0x64/0x8c
[<9045ff10>] _raw_spin_lock_irqsave+0x54/0x78
[<90283294>] sata_dwc_exec_command_by_tag.constprop.14+0x44/0x8c
[<90283484>] sata_dwc_qc_issue+0x1a8/0x24c
[<9026b39c>] ata_qc_issue+0x1f0/0x410
[<90273c6c>] ata_scsi_translate+0xb4/0x200
[<90276234>] ata_scsi_queuecmd+0xb4/0x330
[<9025800c>] scsi_dispatch_cmd+0xd0/0x128
[<90259934>] scsi_request_fn+0x58c/0x638
[<901a3e50>] __blk_run_queue+0x40/0x5c
[<901a83d4>] blk_queue_bio+0x27c/0x28c
[<901a5914>] generic_make_request+0xf0/0x188
[<901a5a54>] submit_bio+0xa8/0x194
[<9011adcc>] submit_bh_wbc.isra.23+0x15c/0x17c
[<9011c908>] block_read_full_page+0x3e4/0x428
[<9009e2e0>] do_read_cache_page+0xac/0x210
[<9009fd90>] read_cache_page+0x18/0x24
[<901bbd18>] read_dev_sector+0x38/0xb0
[<901bd174>] msdos_partition+0xb4/0x5c0
[<901bcb8c>] check_partition+0x140/0x274
[<901bba60>] rescan_partitions+0xa0/0x2b0
[<9011ff68>] __blkdev_get+0x264/0x380
[<901201ac>] blkdev_get+0x128/0x36c
[<901b9378>] add_disk+0x3c0/0x4bc
[<90268268>] sd_probe_async+0x100/0x224
[<90043a44>] async_run_entry_fn+0x50/0x124
[<9003a11c>] process_one_work+0x1a4/0x430
[<9003a4f4>] worker_thread+0x14c/0x4fc
[<900408f4>] kthread+0xd0/0xe8
[<90004338>] ret_from_kernel_thread+0x14/0x1c
Fixes: 62936009f35a ("[libata] Add 460EX on-chip SATA driver, sata_dwc_460ex")
Tested-by: Christian Lamparter <chunkeey@googlemail.com>
Signed-off-by: Mans Rullgard <mans@mansr.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
ATA_DFLAG_DETACH is set
commit 8305f72f952cff21ce8109dc1ea4b321c8efc5af upstream.
During system resume from suspend, this can be observed on ASM1062 PMP
controller:
ata10.01: SATA link down (SStatus 0 SControl 330)
ata10.02: hard resetting link
ata10.02: SATA link down (SStatus 0 SControl 330)
ata10.00: configured for UDMA/133
Kernel panic - not syncing: stack-protector: Kernel
in: sata_pmp_eh_recover+0xa2b/0xa40
CPU: 2 PID: 230 Comm: scsi_eh_9 Tainted: P OE
#49-Ubuntu
Hardware name: System manufacturer System Product
1001 12/10/2017
Call Trace:
dump_stack+0x63/0x8b
panic+0xe4/0x244
? sata_pmp_eh_recover+0xa2b/0xa40
__stack_chk_fail+0x19/0x20
sata_pmp_eh_recover+0xa2b/0xa40
? ahci_do_softreset+0x260/0x260 [libahci]
? ahci_do_hardreset+0x140/0x140 [libahci]
? ata_phys_link_offline+0x60/0x60
? ahci_stop_engine+0xc0/0xc0 [libahci]
sata_pmp_error_handler+0x22/0x30
ahci_error_handler+0x45/0x80 [libahci]
ata_scsi_port_error_handler+0x29b/0x770
? ata_scsi_cmd_error_handler+0x101/0x140
ata_scsi_error+0x95/0xd0
? scsi_try_target_reset+0x90/0x90
scsi_error_handler+0xd0/0x5b0
kthread+0x121/0x140
? scsi_eh_get_sense+0x200/0x200
? kthread_create_worker_on_cpu+0x70/0x70
ret_from_fork+0x22/0x40
Kernel Offset: 0xcc00000 from 0xffffffff81000000
(relocation range: 0xffffffff80000000-0xffffffffbfffffff)
Since sata_pmp_eh_recover_pmp() doens't set rc when ATA_DFLAG_DETACH is
set, sata_pmp_eh_recover() continues to run. During retry it triggers
the stack protector.
Set correct rc in sata_pmp_eh_recover_pmp() to let sata_pmp_eh_recover()
jump to pmp_fail directly.
BugLink: https://bugs.launchpad.net/bugs/1821434
Cc: stable@vger.kernel.org
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 1d72f7aec3595249dbb83291ccac041a2d676c57 ]
If the call to scsi_add_host_with_dma() in ata_scsi_add_hosts() fails,
then we may get use-after-free KASAN warns:
==================================================================
BUG: KASAN: use-after-free in kobject_put+0x24/0x180
Read of size 1 at addr ffff0026b8c80364 by task swapper/0/1
CPU: 1 PID: 1 Comm: swapper/0 Tainted: G W 5.6.0-rc3-00004-g5a71b206ea82-dirty #1765
Hardware name: Huawei TaiShan 200 (Model 2280)/BC82AMDD, BIOS 2280-V2 CS V3.B160.01 02/24/2020
Call trace:
dump_backtrace+0x0/0x298
show_stack+0x14/0x20
dump_stack+0x118/0x190
print_address_description.isra.9+0x6c/0x3b8
__kasan_report+0x134/0x23c
kasan_report+0xc/0x18
__asan_load1+0x5c/0x68
kobject_put+0x24/0x180
put_device+0x10/0x20
scsi_host_put+0x10/0x18
ata_devres_release+0x74/0xb0
release_nodes+0x2d0/0x470
devres_release_all+0x50/0x78
really_probe+0x2d4/0x560
driver_probe_device+0x7c/0x148
device_driver_attach+0x94/0xa0
__driver_attach+0xa8/0x110
bus_for_each_dev+0xe8/0x158
driver_attach+0x30/0x40
bus_add_driver+0x220/0x2e0
driver_register+0xbc/0x1d0
__pci_register_driver+0xbc/0xd0
ahci_pci_driver_init+0x20/0x28
do_one_initcall+0xf0/0x608
kernel_init_freeable+0x31c/0x384
kernel_init+0x10/0x118
ret_from_fork+0x10/0x18
Allocated by task 5:
save_stack+0x28/0xc8
__kasan_kmalloc.isra.8+0xbc/0xd8
kasan_kmalloc+0xc/0x18
__kmalloc+0x1a8/0x280
scsi_host_alloc+0x44/0x678
ata_scsi_add_hosts+0x74/0x268
ata_host_register+0x228/0x488
ahci_host_activate+0x1c4/0x2a8
ahci_init_one+0xd18/0x1298
local_pci_probe+0x74/0xf0
work_for_cpu_fn+0x2c/0x48
process_one_work+0x488/0xc08
worker_thread+0x330/0x5d0
kthread+0x1c8/0x1d0
ret_from_fork+0x10/0x18
Freed by task 5:
save_stack+0x28/0xc8
__kasan_slab_free+0x118/0x180
kasan_slab_free+0x10/0x18
slab_free_freelist_hook+0xa4/0x1a0
kfree+0xd4/0x3a0
scsi_host_dev_release+0x100/0x148
device_release+0x7c/0xe0
kobject_put+0xb0/0x180
put_device+0x10/0x20
scsi_host_put+0x10/0x18
ata_scsi_add_hosts+0x210/0x268
ata_host_register+0x228/0x488
ahci_host_activate+0x1c4/0x2a8
ahci_init_one+0xd18/0x1298
local_pci_probe+0x74/0xf0
work_for_cpu_fn+0x2c/0x48
process_one_work+0x488/0xc08
worker_thread+0x330/0x5d0
kthread+0x1c8/0x1d0
ret_from_fork+0x10/0x18
There is also refcount issue, as well:
WARNING: CPU: 1 PID: 1 at lib/refcount.c:28 refcount_warn_saturate+0xf8/0x170
The issue is that we make an erroneous extra call to scsi_host_put()
for that host:
So in ahci_init_one()->ata_host_alloc_pinfo()->ata_host_alloc(), we setup
a device release method - ata_devres_release() - which intends to release
the SCSI hosts:
static void ata_devres_release(struct device *gendev, void *res)
{
...
for (i = 0; i < host->n_ports; i++) {
struct ata_port *ap = host->ports[i];
if (!ap)
continue;
if (ap->scsi_host)
scsi_host_put(ap->scsi_host);
}
...
}
However in the ata_scsi_add_hosts() error path, we also call
scsi_host_put() for the SCSI hosts.
Fix by removing the the scsi_host_put() calls in ata_scsi_add_hosts() and
leave this to ata_devres_release().
Fixes: f31871951b38 ("libata: separate out ata_host_alloc() and ata_host_register()")
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 60fc35f327e0a9e60b955c0f3c3ed623608d1baa ]
The commit ed08d40cdec4
("ahci: Changing two module params with static and __read_mostly")
moved ahci_em_messages to be static while missing the fact of exporting it.
WARNING: "ahci_em_messages" [vmlinux] is a static EXPORT_SYMBOL_GPL
Drop export for the local variable ahci_em_messages.
Fixes: ed08d40cdec4 ("ahci: Changing two module params with static and __read_mostly")
Cc: Chuansheng Liu <chuansheng.liu@intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 130f4caf145c3562108b245a576db30b916199d2 ]
With CONFIG_DEBUG_TEST_DRIVER_REMOVE set, we may find the following WARN:
[ 23.452574] ------------[ cut here ]------------
[ 23.457190] WARNING: CPU: 59 PID: 1 at drivers/ata/libata-core.c:6676 ata_host_detach+0x15c/0x168
[ 23.466047] Modules linked in:
[ 23.469092] CPU: 59 PID: 1 Comm: swapper/0 Not tainted 5.4.0-rc1-00010-g5b83fd27752b-dirty #296
[ 23.477776] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI RC0 - V1.16.01 03/15/2019
[ 23.486286] pstate: a0c00009 (NzCv daif +PAN +UAO)
[ 23.491065] pc : ata_host_detach+0x15c/0x168
[ 23.495322] lr : ata_host_detach+0x88/0x168
[ 23.499491] sp : ffff800011cabb50
[ 23.502792] x29: ffff800011cabb50 x28: 0000000000000007
[ 23.508091] x27: ffff80001137f068 x26: ffff8000112c0c28
[ 23.513390] x25: 0000000000003848 x24: ffff0023ea185300
[ 23.518689] x23: 0000000000000001 x22: 00000000000014c0
[ 23.523987] x21: 0000000000013740 x20: ffff0023bdc20000
[ 23.529286] x19: 0000000000000000 x18: 0000000000000004
[ 23.534584] x17: 0000000000000001 x16: 00000000000000f0
[ 23.539883] x15: ffff0023eac13790 x14: ffff0023eb76c408
[ 23.545181] x13: 0000000000000000 x12: ffff0023eac13790
[ 23.550480] x11: ffff0023eb76c228 x10: 0000000000000000
[ 23.555779] x9 : ffff0023eac13798 x8 : 0000000040000000
[ 23.561077] x7 : 0000000000000002 x6 : 0000000000000001
[ 23.566376] x5 : 0000000000000002 x4 : 0000000000000000
[ 23.571674] x3 : ffff0023bf08a0bc x2 : 0000000000000000
[ 23.576972] x1 : 3099674201f72700 x0 : 0000000000400284
[ 23.582272] Call trace:
[ 23.584706] ata_host_detach+0x15c/0x168
[ 23.588616] ata_pci_remove_one+0x10/0x18
[ 23.592615] ahci_remove_one+0x20/0x40
[ 23.596356] pci_device_remove+0x3c/0xe0
[ 23.600267] really_probe+0xdc/0x3e0
[ 23.603830] driver_probe_device+0x58/0x100
[ 23.608000] device_driver_attach+0x6c/0x90
[ 23.612169] __driver_attach+0x84/0xc8
[ 23.615908] bus_for_each_dev+0x74/0xc8
[ 23.619730] driver_attach+0x20/0x28
[ 23.623292] bus_add_driver+0x148/0x1f0
[ 23.627115] driver_register+0x60/0x110
[ 23.630938] __pci_register_driver+0x40/0x48
[ 23.635199] ahci_pci_driver_init+0x20/0x28
[ 23.639372] do_one_initcall+0x5c/0x1b0
[ 23.643199] kernel_init_freeable+0x1a4/0x24c
[ 23.647546] kernel_init+0x10/0x108
[ 23.651023] ret_from_fork+0x10/0x18
[ 23.654590] ---[ end trace 634a14b675b71c13 ]---
With KASAN also enabled, we may also get many use-after-free reports.
The issue is that when CONFIG_DEBUG_TEST_DRIVER_REMOVE is set, we may
attempt to detach the ata_port before it has been probed.
This is because the ata_ports are async probed, meaning that there is no
guarantee that the ata_port has probed prior to detach. When the ata_port
does probe in this scenario, we get all sorts of issues as the detach may
have already happened.
Fix by ensuring synchronisation with async_synchronize_full(). We could
alternatively use the cookie returned from the ata_port probe
async_schedule() call, but that means managing the cookie, so more
complicated.
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 6adde4a36f1b6a562a1057fbb1065007851050e7 ]
Clang warns when one enumerated type is implicitly converted to another.
drivers/ata/pata_ep93xx.c:662:36: warning: implicit conversion from
enumeration type 'enum dma_data_direction' to different enumeration type
'enum dma_transfer_direction' [-Wenum-conversion]
drv_data->dma_rx_data.direction = DMA_FROM_DEVICE;
~ ^~~~~~~~~~~~~~~
drivers/ata/pata_ep93xx.c:670:36: warning: implicit conversion from
enumeration type 'enum dma_data_direction' to different enumeration type
'enum dma_transfer_direction' [-Wenum-conversion]
drv_data->dma_tx_data.direction = DMA_TO_DEVICE;
~ ^~~~~~~~~~~~~
drivers/ata/pata_ep93xx.c:681:19: warning: implicit conversion from
enumeration type 'enum dma_data_direction' to different enumeration type
'enum dma_transfer_direction' [-Wenum-conversion]
conf.direction = DMA_FROM_DEVICE;
~ ^~~~~~~~~~~~~~~
drivers/ata/pata_ep93xx.c:692:19: warning: implicit conversion from
enumeration type 'enum dma_data_direction' to different enumeration type
'enum dma_transfer_direction' [-Wenum-conversion]
conf.direction = DMA_TO_DEVICE;
~ ^~~~~~~~~~~~~
Use the equivalent valued enums from the expected type so that Clang no
longer warns about a conversion.
DMA_TO_DEVICE = DMA_MEM_TO_DEV = 1
DMA_FROM_DEVICE = DMA_DEV_TO_MEM = 2
Acked-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit 2d7271501720038381d45fb3dcbe4831228fc8cc upstream.
For passthrough requests, libata-scsi takes what the user passes in
as gospel. This can be problematic if the user fills in the CDB
incorrectly. One example of that is in request sizes. For read/write
commands, the CDB contains fields describing the transfer length of
the request. These should match with the SG_IO header fields, but
libata-scsi currently does no validation of that.
Check that the number of blocks in the CDB for passthrough requests
matches what was mapped into the request. If the CDB asks for more
data then the validated SG_IO header fields, error it.
Reported-by: Krishna Ram Prakash R <krp@gtux.in>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 752ead44491e8c91e14d7079625c5916b30921c5 ]
Abort processing of a command if we run out of mapped data in the
SG list. This should never happen, but a previous bug caused it to
be possible. Play it safe and attempt to abort nicely if we don't
have more SG segments left.
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 090bb803708198e5ab6b0046398c7ed9f4d12d6b ]
Retrieving PHYs can defer the probe, do not spawn an error when
-EPROBE_DEFER is returned, it is normal behavior.
Fixes: b1a9edbda040 ("ata: libahci: allow to use multiple PHYs")
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 71d6c505b4d9e6f76586350450e785e3d452b346 ]
Jeffrin reported a KASAN issue:
BUG: KASAN: global-out-of-bounds in ata_exec_internal_sg+0x50f/0xc70
Read of size 16 at addr ffffffff91f41f80 by task scsi_eh_1/149
...
The buggy address belongs to the variable:
cdb.48319+0x0/0x40
Much like commit 18c9a99bce2a ("libata: zpodd: small read overflow in
eject_tray()"), this fixes a cdb[] buffer length, this time in
zpodd_get_mech_type():
We read from the cdb[] buffer in ata_exec_internal_sg(). It has to be
ATAPI_CDB_LEN (16) bytes long, but this buffer is only 12 bytes.
Reported-by: Jeffrin Jose T <jeffrin@rajagiritech.edu.in>
Fixes: afe759511808c ("libata: identify and init ZPODD devices")
Link: https://lore.kernel.org/lkml/201907181423.E808958@keescook/
Tested-by: Jeffrin Jose T <jeffrin@rajagiritech.edu.in>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit 31f6264e225fb92cf6f4b63031424f20797c297d upstream.
We've received a bugreport that using LPM with ST1000LM024 drives leads
to system lockups. So it seems that these models are buggy in more then
1 way. Add NOLPM quirk to the existing quirks entry for BROKEN_FPDMA_AA.
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1571330
Cc: stable@vger.kernel.org
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit dd08a8d9a66de4b54575c294a92630299f7e0fe7 ]
When CONFIG_VMAP_STACK=y, __pa() returns incorrect physical address for
a stack virtual address. Stack DMA buffers must be avoided.
Signed-off-by: raymond pang <raymondpangxd@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin (Microsoft) <sashal@kernel.org>
|
|
[ Upstream commit 9f83cfdb1ace3ef268ecc6fda50058d2ec37d603 ]
The driver overrides the error codes returned by platform_get_irq() to
-EINVAL, so if it returns -EPROBE_DEFER, the driver would fail the probe
permanently instead of the deferred probing. Switch to propagating the
error code upstream, still checking/overriding IRQ0 as libata regards it
as "no IRQ" (thus polling) anyway...
Fixes: 9ec36cafe43b ("of/irq: do irq resolution in platform_get_irq")
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit fd6f32f78645db32b6b95a42e45da2ddd6de0e67 ]
These devices support read zero after trim (RZAT), as they advertise to
the OS. However, the OS doesn't believe the SSDs unless they are
explicitly whitelisted.
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Juha-Matti Tilli <juha-matti.tilli@iki.fi>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit d312fefea8387503375f728855c9a62de20c9665 ]
ahci_pci_reset_controller() calls ahci_reset_controller(), which may
fail, but ignores the result code and always returns success. This
may result in failures like below
ahci 0000:02:00.0: version 3.0
ahci 0000:02:00.0: enabling device (0000 -> 0003)
ahci 0000:02:00.0: SSS flag set, parallel bus scan disabled
ahci 0000:02:00.0: controller reset failed (0xffffffff)
ahci 0000:02:00.0: failed to stop engine (-5)
... repeated many times ...
ahci 0000:02:00.0: failed to stop engine (-5)
Unable to handle kernel paging request at virtual address ffff0000093f9018
...
PC is at ahci_stop_engine+0x5c/0xd8 [libahci]
LR is at ahci_deinit_port.constprop.12+0x1c/0xc0 [libahci]
...
[<ffff000000a17014>] ahci_stop_engine+0x5c/0xd8 [libahci]
[<ffff000000a196b4>] ahci_deinit_port.constprop.12+0x1c/0xc0 [libahci]
[<ffff000000a197d8>] ahci_init_controller+0x80/0x168 [libahci]
[<ffff000000a260f8>] ahci_pci_init_controller+0x60/0x68 [ahci]
[<ffff000000a26f94>] ahci_init_one+0x75c/0xd88 [ahci]
[<ffff000008430324>] local_pci_probe+0x3c/0xb8
[<ffff000008431728>] pci_device_probe+0x138/0x170
[<ffff000008585e54>] driver_probe_device+0x2dc/0x458
[<ffff0000085860e4>] __driver_attach+0x114/0x118
[<ffff000008583ca8>] bus_for_each_dev+0x60/0xa0
[<ffff000008585638>] driver_attach+0x20/0x28
[<ffff0000085850b0>] bus_add_driver+0x1f0/0x2a8
[<ffff000008586ae0>] driver_register+0x60/0xf8
[<ffff00000842f9b4>] __pci_register_driver+0x3c/0x48
[<ffff000000a3001c>] ahci_pci_driver_init+0x1c/0x1000 [ahci]
[<ffff000008083918>] do_one_initcall+0x38/0x120
where an obvious hardware level failure results in an unnecessary 15 second
delay and a subsequent crash.
So record the result code of ahci_reset_controller() and relay it, rather
than ignoring it.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 2dbb3ec29a6c069035857a2fc4c24e80e5dfe3cc ]
We have seen that on some platforms, SATA device never show any DEVSLP
residency. This prevent power gating of SATA IP, which prevent system
to transition to low power mode in systems with SLP_S0 aka modern
standby systems. The PHY logic is off only in DEVSLP not in slumber.
Reference:
https://www.intel.com/content/dam/www/public/us/en/documents/datasheets
/332995-skylake-i-o-platform-datasheet-volume-1.pdf
Section 28.7.6.1
Here driver is trying to do read-modify-write the devslp register. But
not resetting the bits for which this driver will modify values (DITO,
MDAT and DETO). So simply reset those bits before updating to new values.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 804689ad2d9b66d0d3920b48cf05881049d44589 ]
For failed commands with valid sense data (e.g. NCQ commands),
scsi_check_sense() is used in ata_analyze_tf() to determine if the
command can be retried. In such case, rely on this decision and ignore
the command error mask based decision done in ata_worth_retry().
This fixes useless retries of commands such as unaligned writes on zoned
disks (TYPE_ZAC).
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 240630e61870e62e39a97225048f9945848fa5f5 upstream.
There have been several reports of LPM related hard freezes about once
a day on multiple Lenovo 50 series models. Strange enough these reports
where not disk model specific as LPM issues usually are and some users
with the exact same disk + laptop where seeing them while other users
where not seeing these issues.
It turns out that enabling LPM triggers a firmware bug somewhere, which
has been fixed in later BIOS versions.
This commit adds a new ahci_broken_lpm() function and a new ATA_FLAG_NO_LPM
for dealing with this.
The ahci_broken_lpm() function contains DMI match info for the 4 models
which are known to be affected by this and the DMI BIOS date field for
known good BIOS versions. If the BIOS date is older then the one in the
table LPM will be disabled and a warning will be printed.
Note the BIOS dates are for known good versions, some older versions may
work too, but we don't know for sure, the table is using dates from BIOS
versions for which users have confirmed that upgrading to that version
makes the problem go away.
Unfortunately I've been unable to get hold of the reporter who reported
that BIOS version 2.35 fixed the problems on the W541 for him. I've been
able to verify the DMI_SYS_VENDOR and DMI_PRODUCT_VERSION from an older
dmidecode, but I don't know the exact BIOS date as reported in the DMI.
Lenovo keeps a changelog with dates in their release notes, but the
dates there are the release dates not the build dates which are in DMI.
So I've chosen to set the date to which we compare to one day past the
release date of the 2.34 BIOS. I plan to fix this with a follow up
commit once I've the necessary info.
Cc: stable@vger.kernel.org
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 2cfce3a86b64b53f0a70e92a6a659c720c319b45 upstream.
Commit 184add2ca23c ("libata: Apply NOLPM quirk for SanDisk
SD7UB3Q*G1001 SSDs") disabled LPM for SanDisk SD7UB3Q*G1001 SSDs.
This has lead to several reports of users of that SSD where LPM
was working fine and who know have a significantly increased idle
power consumption on their laptops.
Likely there is another problem on the T450s from the original
reporter which gets exposed by the uncore reaching deeper sleep
states (higher PC-states) due to LPM being enabled. The problem as
reported, a hardfreeze about once a day, already did not sound like
it would be caused by LPM and the reports of the SSD working fine
confirm this. The original reporter is ok with dropping the quirk.
A X250 user has reported the same hard freeze problem and for him
the problem went away after unrelated updates, I suspect some GPU
driver stack changes fixed things.
TL;DR: The original reporters problem were triggered by LPM but not
an LPM issue, so drop the quirk for the SSD in question.
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1583207
Cc: stable@vger.kernel.org
Cc: Richard W.M. Jones <rjones@redhat.com>
Cc: Lorenzo Dalrio <lorenzo.dalrio@gmail.com>
Reported-by: Lorenzo Dalrio <lorenzo.dalrio@gmail.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: "Richard W.M. Jones" <rjones@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 18c9a99bce2a57dfd7e881658703b5d7469cc7b9 upstream.
We read from the cdb[] buffer in ata_exec_internal_sg(). It has to be
ATAPI_CDB_LEN (16) bytes long, but this buffer is only 12 bytes.
Fixes: 213342053db5 ("libata: handle power transition of ODD")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 795ef788145ed2fa023efdf11e8d5d7bedc21462 upstream.
Don't populate the arrays cdb on the stack, instead make them static.
Makes the object code smaller by 230 bytes:
Before:
text data bss dec hex filename
3797 240 0 4037 fc5 drivers/ata/libata-zpodd.o
After:
text data bss dec hex filename
3407 400 0 3807 edf drivers/ata/libata-zpodd.o
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 136d769e0b3475d71350aa3648a116a6ee7a8f6c upstream.
While whitelisting Micron M500DC drives, the tweaked blacklist entry
enabled queued TRIM from M500IT variants also. But these do not support
queued TRIM. And while using those SSDs with the latest kernel we have
seen errors and even the partition table getting corrupted.
Some part from the dmesg:
[ 6.727384] ata1.00: ATA-9: Micron_M500IT_MTFDDAK060MBD, MU01, max UDMA/133
[ 6.727390] ata1.00: 117231408 sectors, multi 16: LBA48 NCQ (depth 31/32), AA
[ 6.741026] ata1.00: supports DRM functions and may not be fully accessible
[ 6.759887] ata1.00: configured for UDMA/133
[ 6.762256] scsi 0:0:0:0: Direct-Access ATA Micron_M500IT_MT MU01 PQ: 0 ANSI: 5
and then for the error:
[ 120.860334] ata1.00: exception Emask 0x1 SAct 0x7ffc0007 SErr 0x0 action 0x6 frozen
[ 120.860338] ata1.00: irq_stat 0x40000008
[ 120.860342] ata1.00: failed command: SEND FPDMA QUEUED
[ 120.860351] ata1.00: cmd 64/01:00:00:00:00/00:00:00:00:00/a0 tag 0 ncq dma 512 out
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x5 (timeout)
[ 120.860353] ata1.00: status: { DRDY }
[ 120.860543] ata1: hard resetting link
[ 121.166128] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 121.166376] ata1.00: supports DRM functions and may not be fully accessible
[ 121.186238] ata1.00: supports DRM functions and may not be fully accessible
[ 121.204445] ata1.00: configured for UDMA/133
[ 121.204454] ata1.00: device reported invalid CHS sector 0
[ 121.204541] sd 0:0:0:0: [sda] tag#18 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
[ 121.204546] sd 0:0:0:0: [sda] tag#18 Sense Key : 0x5 [current]
[ 121.204550] sd 0:0:0:0: [sda] tag#18 ASC=0x21 ASCQ=0x4
[ 121.204555] sd 0:0:0:0: [sda] tag#18 CDB: opcode=0x93 93 08 00 00 00 00 00 04 28 80 00 00 00 30 00 00
[ 121.204559] print_req_error: I/O error, dev sda, sector 272512
After few reboots with these errors, and the SSD is corrupted.
After blacklisting it, the errors are not seen and the SSD does not get
corrupted any more.
Fixes: 243918be6393 ("libata: Do not blacklist Micron M500DC")
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 322579dcc865b94b47345ad1b6002ad167f85405 upstream.
Sandisk SSDs SD7SN6S256G and SD8SN8U256G are regularly locking up
regularly under sustained moderate load with NCQ enabled. Blacklist
for now.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Dave Jones <davej@codemonkey.org.uk>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 184add2ca23ce5edcac0ab9c3b9be13f91e7b567 upstream.
Richard Jones has reported that using med_power_with_dipm on a T450s
with a Sandisk SD7UB3Q256G1001 SSD (firmware version X2180501) is
causing the machine to hang.
Switching the LPM to max_performance fixes this, so it seems that
this Sandisk SSD does not handle LPM well.
Note in the past there have been bug-reports about the following
Sandisk models not working with min_power, so we may need to extend
the quirk list in the future: name - firmware
Sandisk SD6SB2M512G1022I - X210400
Sandisk SD6PP4M-256G-1006 - A200906
Cc: stable@vger.kernel.org
Cc: Richard W.M. Jones <rjones@redhat.com>
Reported-and-tested-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit c034640a32f8456018d9c8c83799ead683046b95 ]
When platform_get_irq() fails, it returns an error code, which
libahci_platform and replaces it by -EINVAL. This commit fixes that by
propagating the error code. It fixes the situation where
platform_get_irq() returns -EPROBE_DEFER because the interrupt
controller is not available yet, and generally looks like the right
thing to do.
We pay attention to not show the "no irq" message when we are in an
EPROBE_DEFER situation, because the driver probing will be retried
later on, once the interrupt controller becomes available to provide
the interrupt.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit d418ff56b8f2d2b296daafa8da151fe27689b757 upstream.
When commit 9c7be59fc519af ("libata: Apply NOLPM quirk to Crucial MX100
512GB SSDs") was added it inherited the ATA_HORKAGE_NO_NCQ_TRIM quirk
from the existing "Crucial_CT*MX100*" entry, but that entry sets model_rev
to "MU01", where as the entry adding the NOLPM quirk sets it to NULL.
This means that after this commit we no apply the NO_NCQ_TRIM quirk to
all "Crucial_CT512MX100*" SSDs even if they have the fixed "MU02"
firmware. This commit splits the "Crucial_CT512MX100*" quirk into 2
quirks, one for the "MU01" firmware and one for all other firmware
versions, so that we once again only apply the NO_NCQ_TRIM quirk to the
"MU01" firmware version.
Fixes: 9c7be59fc519af ("libata: Apply NOLPM quirk to ... MX100 512GB SSDs")
Cc: stable@vger.kernel.org
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 3bf7b5d6d017c27e0d3b160aafb35a8e7cfeda1f upstream.
Commit b17e5729a630 ("libata: disable LPM for Crucial BX100 SSD 500GB
drive"), introduced a ATA_HORKAGE_NOLPM quirk for Crucial BX100 500GB SSDs
but limited this to the MU02 firmware version, according to:
http://www.crucial.com/usa/en/support-ssd-firmware
MU02 is the last version, so there are no newer possibly fixed versions
and if the MU02 version has broken LPM then the MU01 almost certainly
also has broken LPM, so this commit changes the quirk to apply to all
firmware versions.
Fixes: b17e5729a630 ("libata: disable LPM for Crucial BX100 SSD 500GB...")
Cc: stable@vger.kernel.org
Cc: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 62ac3f7305470e3f52f159de448bc1a771717e88 upstream.
There have been reports of the Crucial M500 480GB model not working
with LPM set to min_power / med_power_with_dipm level.
It has not been tested with medium_power, but that typically has no
measurable power-savings.
Note the reporters Crucial_CT480M500SSD3 has a firmware version of MU03
and there is a MU05 update available, but that update does not mention any
LPM fixes in its changelog, so the quirk matches all firmware versions.
In my experience the LPM problems with (older) Crucial SSDs seem to be
limited to higher capacity versions of the SSDs (different firmware?),
so this commit adds a NOLPM quirk for the 480 and 960GB versions of the
M500, to avoid LPM causing issues with these SSDs.
Cc: stable@vger.kernel.org
Reported-and-tested-by: Martin Steigerwald <martin@lichtvoll.de>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|