summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)Author
2008-03-10USB: fix ehci unlink regressionsDavid Brownell
The recent EHCI driver update to split the IAA watchdog timer out from the other timers made several things work better, but not everything; and it created a couple new issues in bugzilla. Ergo this patch: - Handle a should-be-rare SMP race between the watchdog firing and (very late) IAA interrupts; - Remove a shouldn't-have-been-added WARN_ON() test; - Guard against one observed OOPS; - If this watchdog fires during clean HC shutdown, it should act as a NOP instead of interfering with the shutdown sequence; - Guard against silicon errata hypothesized by some vendors: * IAA status latch broken, but IAAD cleared OK; * IAAD wasn't cleared when IAA status got reported; The WARN_ON is in bugzilla as 10168; the OOPS as 10078; these are both regressions. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Tested-by: Gordon Farquharson <gordonfarquharson@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-03-10USB: new ftdi_sio device idMirko Bordignon
Here is a patch that adds support for the propox jtagcable II dongle (http://www.propox.com/products/t_117.html): their PID was missing, therefore we were not able to have the device recognized though it uses a standard FTDI chip. Signed-off-by: Mirko Bordignon <mirko.bordignon@ieee.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-03-10USB: fsl_usb2_udc: fix broken KconfigLi Yang
The patch fixes broken Kconfig caused by the name change of MPC834x option. It also makes fsl_usb2_udc selectable on new platforms like MPC837x. Signed-off-by: Li Yang <leoli@freescale.com> Acked-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-03-10USB: option: add novatel device idsDirk DeSchepper
This updates the option driver with a lot more novatel driver ids. From: Dirk DeSchepper <ddeschepper@nvtl.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-03-10USB: cypress_m8: add UPS Powercom (0d9f:0002)Dmitry Shapin
Add support for UPS Powercom USB interface (0d9f:0002) in chip CY7C63723. In my case, this Powercom BNT800AP. Signed-off-by: Dmitry Shapin <shapin@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-03-10USB: drivers/usb/storage/sddr55.c: fix uninitialized var warningsAndrew Morton
drivers/usb/storage/sddr55.c: In function 'sddr55_transport': drivers/usb/storage/sddr55.c:526: warning: 'deviceID' may be used uninitialized in this function drivers/usb/storage/sddr55.c:525: warning: 'manufacturerID' may be used uninitialized in this function Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-03-10USB: fix usb-serial generic recursive lockPete Zaitcev
Nobody should be using the generic usb-serial for anything other than testing. Still, it's not a good thing that it's easy to lock up. There is a traceback from NMI oopser here: https://bugzilla.redhat.com/show_bug.cgi?id=431379 But in short, if a line discipline has a chance to echo anything, input can loop back a write method. So, don't call tty_flip_buffer_push from under a lock taken on write path. Signed-off-by: Pete Zaitcev <zaitcev@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-03-07bluetooth: Add another Broadcom deviceKarsten Keil
This adds another Broadcom BCM2045 based device to the blacklist, with these settings the micro dongle works on my system. Signed-off-by: Karsten Keil <kkeil@suse.de> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-07ide: update references to Documentation/ide/ide.txt (v2)Randy Dunlap
Fix all references to Documentation/ide/ide.txt. Add/update ide/00-INDEX file. Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-03-07ide: fix buggy code in ide_register_hw()Peter Teoh
Relocating the index to come after finding the hwif pointer. Signed-off-by: Peter Teoh <htmldeveloper@gmail.com> Reported-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-03-07ide: fix enabling DMA on it821x in "smart" modeBartlomiej Zolnierkiewicz
ide_tune_dma() should return '1' if IDE_HFLAG_NO_SET_MODE host flag is set. Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-03-07ide-cd: mark REQ_TYPE_ATA_PC write requests with REQ_RW flagBartlomiej Zolnierkiewicz
On Thursday 06 March 2008, walt wrote: > For me, this commit causes the problem it's intended to fix: > > commit 9f10d9ee0ac6d79d7bc8b9a158bf4a29322d84d3 > Author: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> > Date: Tue Feb 26 21:50:35 2008 +0100 > > ide-cd: fix 'ireason' handling for REQ_TYPE_ATA_PC requests > > This fixes some hangs caused by not finishing the transfer before ending > the request and also makes use of 'ireason == 1' quirk for spurious IRQs. > > When I mount a CD there is a long delay, and I see this error message: > > hdc: ide_cd_check_ireason: wrong transfer direction! > cdrom: failed setting lba address space > hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest } > ide: failed opcode was: unknown > hdc: drive not ready for command > <repeated many times> > > When I revert this commit everything works properly again, including > CD burning. It turned out that REQ_TYPE_ATA_PC write requests were not marked as such (the previous commit assumed them to be). Reported-by: walt <w41ter@gmail.com> Tested-by: walt <w41ter@gmail.com> Reviewed-by: Borislav Petkov <petkovbb@googlemail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-03-07gigaset: fix Oops on module unload regressionTilman Schmidt
The card state mutex was only initialized when a device was connected, but used during unload unconditionally, leading to an Oops if a driver was loaded and unloaded again without ever connecting a device. Fix this by initializing the mutex as soon as the structure is allocated. Also add a missing mutex unlock revealed in the same execution path. This fixes a possible Oops in 2.6.25-rc that was introduced by commit e468c04894f36045cf93d1384183a461014b6840 ("Gigaset: permit module unload"). Thanks to Roland Kletzing for reporting this problem. Signed-off-by: Tilman Schmidt <tilman@imap.cc> Tested-by: Roland Kletzing <devzero@web.de> Cc: Hansjoerg Lipp <hjlipp@web.de> Cc: Karsten Keil <kkeil@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-07drivers/char/esp.c: fix bootup lockupIngo Molnar
randconfig testing found a bootup lockup in drivers/char/esp.c because of a spinlock that wasn't correctly initialized. I'm not sure why it became more prominent in 2.6.25-rc4, the bug seems rather old and i've been doing allyesconfig bootups for ages with CONFIG_ESP enabled. This fixes this bootup lockup: PM: Adding info for No Bus:ttyP63 ttyP32 at 0x0240 (irq = 0) is an ESP primary port BUG: spinlock lockup on CPU#0, swapper/1, f56dd004 Pid: 1, comm: swapper Not tainted 2.6.25-rc4-sched-devel.git-x86-latest.git #402 [<c03ac6f4>] _raw_spin_lock+0x134/0x140 [<c08649be>] _spin_lock_irqsave+0x5e/0x80 [<c0b9fbfe>] ? espserial_init+0x2be/0x6e0 [<c0b9fbfe>] espserial_init+0x2be/0x6e0 [<c0b877a3>] kernel_init+0x83/0x260 [<c0b9f940>] ? espserial_init+0x0/0x6e0 [<c010416a>] ? restore_nocheck_notrace+0x0/0xe [<c0b87720>] ? kernel_init+0x0/0x260 [<c0b87720>] ? kernel_init+0x0/0x260 [<c0104507>] kernel_thread_helper+0x7/0x10 ======================= kzalloc() is not the way to initialize spinlocks anymore. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-06Merge git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6.25Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6.25: sh: Fix up the sh64 build. sh: Fix up SH7710 VoIP-GW build. sh: Flag PMB support as EXPERIMENTAL. sh: Update r7780mp defconfig. fb: hitfb: Balance probe/remove section annotations. sh: hp6xx: Fix up hp6xx_apm build failure. fb: pvr2fb: Fix up remaining section mismatch. sh: Fix up section mismatches. sh: hp6xx: Correct APM output. sh: update se7780 defconfig sh: replace remaining __FUNCTION__ occurrences sh: export copy-page() to modules sh_ksyms_32.c update for gcc 4.3 sh/mm/pg-sh7705.c must #include <linux/fs.h>
2008-03-06fb: hitfb: Balance probe/remove section annotations.Paul Mundt
hitfb presently has probe using __init whilst remove uses __devexit. As this device can't possibly be hotplugged, switch to __exit and __exit_p() instead. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2008-03-06fb: pvr2fb: Fix up remaining section mismatch.Paul Mundt
Building with CONFIG_DEBUG_SECTION_MISMATCH=y reports: CC drivers/video/pvr2fb.o LD drivers/video/built-in.o WARNING: drivers/video/built-in.o(.text+0xb9b0): Section mismatch in reference from the function pvr2fb_check_var() to the variable .devinit.data:pvr2_fix The function pvr2fb_check_var() references the variable __devinitdata pvr2_fix. This is often because pvr2fb_check_var lacks a __devinitdata annotation or the annotation of pvr2_fix is wrong. This is obviously crap as no such reference exists, but it's a bit closer to reality from older versions which blamed the PCI table. The real problem was a reference to pvr2_var.vmode from pvr2fb_check_var(), as pvr2_var is flagged as __devinitdata (pvr2_fix is also, so at least that part is right). pvr2_var.vmode is just a fancy way of saying FB_VMODE_NONINTERLACED, so we just reference that explicitly instead. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2008-03-05Merge branch 'for-linus' of git://git.infradead.org/~dedekind/ubi-2.6Linus Torvalds
* 'for-linus' of git://git.infradead.org/~dedekind/ubi-2.6: UBI: mtd/ubi/vtbl.c: fix memory leak UBI: fix sparse errors in ubi.h UBI: fix error message UBI: silence warning
2008-03-05parisc: fix IOMMU's device boundary overflow bug on 32bits archFUJITA Tomonori
On 32bits boxes, boundary_size becomes zero due to a overflow and we hit BUG_ON in iommu_is_span_boundary. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Kyle McMartin <kyle@parisc-linux.org> Cc: Matthew Wilcox <matthew@wil.cx> Acked-by: Grant Grundler <grundler@parisc-linux.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6: (27 commits) [SCSI] mpt fusion: don't oops if NumPhys==0 [SCSI] iscsi class: regression - fix races with state manipulation and blocking/unblocking [SCSI] qla4xxx: regression - add start scan callout [SCSI] qla4xxx: fix host reset dpc race [SCSI] tgt: fix build errors when dprintk is defined [SCSI] tgt: set the data length properly [SCSI] tgt: stop zero'ing scsi_cmnd [SCSI] ibmvstgt: set up scsi_host properly before __scsi_alloc_queue [SCSI] docbook: fix fusion source files [SCSI] docbook: fix scsi source file [SCSI] qla2xxx: Update version number to 8.02.00-k9. [SCSI] qla2xxx: Correct usage of inconsistent timeout values while issuing ELS commands. [SCSI] qla2xxx: Correct discrepancies during OVERRUN handling on FWI2-capable cards. [SCSI] qla2xxx: Correct needless clean-up resets during shutdown. [SCSI] arcmsr: update version and changelog [SCSI] ps3rom: disable clustering [SCSI] ps3rom: fix wrong resid calculation bug [SCSI] mvsas: fix phy sas address [SCSI] gdth: fix to internal commands execution [SCSI] gdth: bugfix for the at-exit problems ...
2008-03-05Merge branch 'fixes-25' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq * 'fixes-25' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq: [CPUFREQ] fix section mismatch warnings [CPUFREQ] Remove debugging message from e_powersaver [CPUFREQ] Fix missing cpufreq_cpu_put() call in ->store [CPUFREQ] Fix missing cpufreq_cpu_put() call in ->show
2008-03-05Merge branch 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6Linus Torvalds
* 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6: [S390] incorrect reipl nss name. [S390] Load disabled wait psw if reipl fails. [S390] Fix IPL from NSS. [S390] zcrypt: fix ap_device_list handling [S390] sclp_vt220: speed up console output for interactive work [S390] dasd: fix reference counting in display method for proc/dasd/devices [S390] dasd: let dasd erp matching recognize alias recovery [S390] Get rid of memcpy gcc warning workaround. [S390] idle: Fix machine check handling in idle loop. [S390] Update default configuration.
2008-03-05[SCSI] mpt fusion: don't oops if NumPhys==0Krzysztof Oledzki
Don't oops if NumPhys==0, instead return -ENODEV. This patch fixes http://bugzilla.kernel.org/show_bug.cgi?id=9909 Signed-off-by: Krzysztof Piotr Oledzki <ole@ans.pl> Acked-by: Eric Moore <Eric.Moore@lsi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-03-05[CPUFREQ] fix section mismatch warningsSam Ravnborg
Fix the following warnings: WARNING: vmlinux.o(.text+0xfe6711): Section mismatch in reference from the function cpufreq_unregister_driver() to the variable .cpuinit.data:cpufreq_cpu_notifier WARNING: vmlinux.o(.text+0xfe68af): Section mismatch in reference from the function cpufreq_register_driver() to the variable .cpuinit.data:cpufreq_cpu_notifier WARNING: vmlinux.o(.exit.text+0xc4fa): Section mismatch in reference from the function cpufreq_stats_exit() to the variable .cpuinit.data:cpufreq_stat_cpu_notifier The warnings were casued by references to unregister_hotcpu_notifier() from normal functions or exit functions. This is flagged by modpost as a potential error because it does not know that for the non HOTPLUG_CPU scenario the unregister_hotcpu_notifier() is a nop. Silence the warning by replacing the __initdata annotation with a __refdata annotation. Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Dave Jones <davej@codemonkey.org.uk>
2008-03-05[CPUFREQ] Fix missing cpufreq_cpu_put() call in ->storeDave Jones
refactor to use gotos instead of explicit exit paths Signed-off-by: Dave Jones <davej@redhat.com>
2008-03-05[CPUFREQ] Fix missing cpufreq_cpu_put() call in ->showDave Jones
refactor to use gotos instead of explicit exit paths Signed-off-by: Dave Jones <davej@redhat.com>
2008-03-05[SCSI] iscsi class: regression - fix races with state manipulation and ↵Mike Christie
blocking/unblocking For qla4xxx, we could be starting a session, but some error (network, target, IO from a device that got started, etc) could cause the session to fail and curring the block/unblock and state manipulation could race with each other. This patch just has those operations done in the single threaded iscsi eh work queue, so that way they are serialized. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-03-05[SCSI] qla4xxx: regression - add start scan calloutMike Christie
We are seeing EXIST errors from sysfs during device addition. We need a start scan callout so we do not start scanning sessions found during hba setup, before the async scsi scan code is ready. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Acked-by: David C Somayajulu <david.somayajulu@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-03-05[SCSI] qla4xxx: fix host reset dpc raceMike Christie
The host reset callout could be starting to reset the hba at the same time the dpc thread is. This creates lots of problems because they both want to do wierd things with the firmware and interrupts, etc. This patch just has the host reset function fully shutdown the dpc thread before resetting the hba. This patch also moves the setting of the session online bit to fix a potential race with the dpc thread and iscsi recovery thread. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Acked-by: David C Somayajulu <david.somayajulu@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-03-05ahci: work around ATI SB600 h/w quirkJeff Garzik
This addresses the recent ATI SB600 errata, where the hardware does not like 256-length PRD entries during FPDMA (aka NCQ). It hurts performance on SB600, but it is more important to get a correct patch eliminating the data corruption/lockups, and then later on tune for performance. We simply limit each command to a maximum of 255 sectors, on SB600. Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-03-05pata_hpt*, pata_serverworks: fix UDMA maskingAlan Cox
When masking, mask out the modes that are unsupported not the ones that are supported. This makes life happier. Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-03-05[S390] zcrypt: fix ap_device_list handlingRalph Wuerthner
In ap_device_probe() we can add the new ap device to the internal device list only if the device probe function successfully returns. Otherwise we might end up with an invalid device in the internal ap device list. Signed-off-by: Ralph Wuerthner <rwuerthn@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-03-05[S390] sclp_vt220: speed up console output for interactive workChristian Borntraeger
Currently an output buffer can wait up to HZ/2 until the buffer is flushed. The wait time is noticeable in interactive tools like mc. Change the value to HZ/20, which seems enough for interactive work. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-03-05[S390] dasd: fix reference counting in display method for proc/dasd/devicesStefan Weinhuber
Using the /proc/dasd/devices interface leaves the reference counter of alias devices in an inconsistent state. A process that tries to set such a device offline afterwards will hang. The dasd_devices_show function returns immediately for alias devices and this code path was missing a dasd_put_device call. Signed-off-by: Stefan Weinhuber <wein@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-03-05[S390] dasd: let dasd erp matching recognize alias recoveryStefan Weinhuber
When a request fails that was started on an alias device then the first recovery step is to retry it on the base device. If the recovery request fails again with the same symptoms, the next step should not be a simple retry, but should be a proper recovery based on sense data, etc. To do so, the dasd recovery functions need to recognize the alias recovery step in the erp chain by comparing the start devices. Signed-off-by: Stefan Weinhuber <wein@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-03-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (22 commits) [IPCONFIG]: The kernel gets no IP from some DHCP servers b43legacy: Fix module init message rndis_wlan: fix broken data copy libertas: compare the current command with response libertas: fix sanity check on sequence number in command response p54: fix eeprom parser length sanity checks p54: fix EEPROM structure endianness ssb: Add pcibios_enable_device() return value check rc80211-pid: fix rate adjustment [ESP]: Add select on AUTHENC [TCP]: Improve ipv4 established hash function. [NETPOLL]: Revert two bogus cleanups that broke netconsole. [PPPOL2TP]: Add missing sock_put() in pppol2tp_tunnel_closeall() Subject: [PPPOL2TP] add missing sock_put() in pppol2tp_recv_dequeue() [BLUETOOTH]: l2cap info_timer delete fix in hci_conn_del [NET]: Fix race in generic address resolution. iucv: fix build error on !SMP [TCP]: Must count fack_count also when skipping [TUN]: Fix RTNL-locking in tun/tap driver [SCTP]: Use proc_create to setup de->proc_fops. ...
2008-03-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-2.6: debugfs: fix sparse warnings Driver core: Fix cleanup when failing device_add(). driver core: Remove dpm_sysfs_remove() from error path of device_add() PM: fix new mutex-locking bug in the PM core PM: Do not acquire device semaphores upfront during suspend kobject: properly initialize ksets sysfs: CONFIG_SYSFS_DEPRECATED fix driver core: fix up Kconfig text for CONFIG_SYSFS_DEPRECATED
2008-03-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/pci-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/pci-2.6: pci: hotplug: pciehp: fix error code path in hpc_power_off_slot PCI: Add DECLARE_PCI_DEVICE_TABLE macro PCI: fix up error messages for pci_bus registering PCI: fix section mismatch warning in pci_scan_child_bus PCI: consolidate duplicated MSI enable functions PCI: use dev_printk in quirk messages
2008-03-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6: USB: ftdi_sio - really enable EM1010PC USB: remove incorrect struct class_device from the printer gadget USB: pxa2xx_udc: fix misuse of clock enable/disable calls USB: ftdi_sio: Workaround for broken Matrix Orbital serial port USB: Add support for AXESSTEL MV110H CDMA modem usb-storage: update earlier scatter-gather bug fix USB: isp116x: fix enumeration on boot USB: ehci: handle large bulk URBs correctly (again) USB: spruce up the device blacklist USB: fix comment of struct usb_interface USB: update Kconfig entry for USB_SUSPEND usb: Add support for the mos7820/7840-based B&B USB/RS485 converter to mos7840.c
2008-03-04input: add I2C to config since the driver makes several i2c*() callsRandy Dunlap
Add to help text that the Intel I2C ICH (i801) driver is also needed for this kernel. Add LEDS_CLASS to config since the driver makes les_classdev_*() calls: ERROR: "led_classdev_register" [drivers/input/misc/apanel.ko] undefined! ERROR: "__led_classdev_unregister" [drivers/input/misc/apanel.ko] undefined! Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-04md: the md RAID10 resync thread could cause a md RAID10 array deadlockK.Tanaka
This message describes another issue about md RAID10 found by testing the 2.6.24 md RAID10 using new scsi fault injection framework. Abstract: When a scsi error results in disabling a disk during RAID10 recovery, the resync threads of md RAID10 could stall. This case, the raid array has already been broken and it may not matter. But I think stall is not preferable. If it occurs, even shutdown or reboot will fail because of resource busy. The deadlock mechanism: The r10bio_s structure has a "remaining" member to keep track of BIOs yet to be handled when recovering. The "remaining" counter is incremented when building a BIO in sync_request() and is decremented when finish a BIO in end_sync_write(). If building a BIO fails for some reasons in sync_request(), the "remaining" should be decremented if it has already been incremented. I found a case where this decrement is forgotten. This causes a md_do_sync() deadlock because md_do_sync() waits for md_done_sync() called by end_sync_write(), but end_sync_write() never calls md_done_sync() because of the "remaining" counter mismatch. For example, this problem would be reproduced in the following case: Personalities : [raid10] md0 : active raid10 sdf1[4] sde1[5](F) sdd1[2] sdc1[1] sdb1[6](F) 3919616 blocks 64K chunks 2 near-copies [4/2] [_UU_] [>....................] recovery = 2.2% (45376/1959808) finish=0.7min speed=45376K/sec This case, sdf1 is recovering, sdb1 and sde1 are disabled. An additional error with detaching sdd will cause a deadlock. md0 : active raid10 sdf1[4] sde1[5](F) sdd1[6](F) sdc1[1] sdb1[7](F) 3919616 blocks 64K chunks 2 near-copies [4/1] [_U__] [=>...................] recovery = 5.0% (99520/1959808) finish=5.9min speed=5237K/sec 2739 ? S< 0:17 [md0_raid10] 28608 ? D< 0:00 [md0_resync] 28629 pts/1 Ss 0:00 bash 28830 pts/1 R+ 0:00 ps ax 31819 ? D< 0:00 [kjournald] The resync thread keeps working, but actually it is deadlocked. Patch: By this patch, the remaining counter will be decremented if needed. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-04md: fix possible raid1/raid10 deadlock on read error during resyncNeilBrown
Thanks to K.Tanaka and the scsi fault injection framework, here is a fix for another possible deadlock in raid1/raid10 error handing. If a read request returns an error while a resync is happening and a resync request is pending, the attempt to fix the error will block until the resync progresses, and the resync will block until the read request completes. Thus a deadlock. This patch fixes the problem. Cc: "K.Tanaka" <k-tanaka@ce.jp.nec.com> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-04md: don't attempt read-balancing for raid10 'far' layoutsKeld Simonsen
This patch changes the disk to be read for layout "far > 1" to always be the disk with the lowest block address. Thus the chunks to be read will always be (for a fully functioning array) from the first band of stripes, and the raid will then work as a raid0 consisting of the first band of stripes. Some advantages: The fastest part which is the outer sectors of the disks involved will be used. The outer blocks of a disk may be as much as 100 % faster than the inner blocks. Average seek time will be smaller, as seeks will always be confined to the first part of the disks. Mixed disks with different performance characteristics will work better, as they will work as raid0, the sequential read rate will be number of disks involved times the IO rate of the slowest disk. If a disk is malfunctioning, the first disk which is working, and has the lowest block address for the logical block will be used. Signed-off-by: Keld Simonsen <keld@dkuug.dk> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-04md: lock access to rdev attributes properlyNeilBrown
When we access attributes of an rdev (component device on an md array) through sysfs, we really need to lock the array against concurrent changes. We currently do that when we change an attribute, but not when we read an attribute. We need to lock when reading as well else rdev->mddev could become NULL while we are accessing it. So add appropriate locking (mddev_lock) to rdev_attr_show. rdev_size_store requires some extra care as well as it needs to unlock the mddev while scanning other mddevs for overlapping regions. We currently assume that rdev->mddev will still be unchanged after the scan, but that cannot be certain. So take a copy of rdev->mddev for use at the end of the function. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-04md: make sure a reshape is started when device switches to read-writeNeilBrown
A resync/reshape/recovery thread will refuse to progress when the array is marked read-only. So whenever it mark it not read-only, it is important to wake up thread resync thread. There is one place we didn't do this. The problem manifests if the start_ro module parameters is set, and a raid5 array that is in the middle of a reshape (restripe) is started. The array will initially be semi-read-only (meaning it acts like it is readonly until the first write). So the reshape will not proceed. On the first write, the array will become read-write, but the reshape will not be started, and there is no event which will ever restart that thread. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-04md: clean up irregularity with raid autodetectNeilBrown
When a raid1 array is stopped, all components currently get added to the list for auto-detection. However we should really only add components that were found by autodetection in the first place. So add a flag to record that information, and use it. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-04md: guard against possible bad array geometry in v1 metadataNeilBrown
Make sure the data doesn't start before the end of the superblock when the superblock is at the start of the device. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-04md: reduce CPU wastage on idle md array with a write-intent bitmapNeilBrown
On an md array with a write-intent bitmap, a thread wakes up every few seconds and scans the bitmap looking for work to do. If the array is idle, there will be no work to do, but a lot of scanning is done to discover this. So cache the fact that the bitmap is completely clean, and avoid scanning the whole bitmap when the cache is known to be clean. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-04md: fix deadlock in md/raid1 and md/raid10 when handling a read errorNeilBrown
When handling a read error, we freeze the array to stop any other IO while attempting to over-write with correct data. This is done in the raid1d(raid10d) thread and must wait for all submitted IO to complete (except for requests that failed and are sitting in the retry queue - these are counted in ->nr_queue and will stay there during a freeze). However write requests need attention from raid1d as bitmap updates might be required. This can cause a deadlock as raid1 is waiting for requests to finish that themselves need attention from raid1d. So we create a new function 'flush_pending_writes' to give that attention, and call it in freeze_array to be sure that we aren't waiting on raid1d. Thanks to "K.Tanaka" <k-tanaka@ce.jp.nec.com> for finding and reporting this problem. Cc: "K.Tanaka" <k-tanaka@ce.jp.nec.com> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-04iommu: parisc: make the IOMMUs respect the segment boundary limitsFUJITA Tomonori
Make PARISC's two IOMMU implementations not allocate a memory area spanning LLD's segment boundary. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Kyle McMartin <kyle@parisc-linux.org> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Grant Grundler <grundler@parisc-linux.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>