path: root/kernel
Age | Commit message | Author
2011-05-19 | Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip | Linus Torvalds
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  irq: Export functions to allow modular irq drivers
  genirq: Uninline and sanity check generic_handle_irq()
  genirq: Remove pointless ifdefs
  genirq: Make generic irq chip depend on CONFIG_GENERIC_IRQ_CHIP
  genirq: Add chip suspend and resume callbacks
  genirq: Implement a generic interrupt chip
  genirq: Support per-IRQ thread disabling.
  genirq: irq_desc: Document preflow_handler and affinity_hint
  genirq: Update DocBook comments
  genirq: Forgotten updates/deletions after removal of compat code
2011-05-19 | Merge branch 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip | Linus Torvalds
* 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  seqlock: Don't smp_rmb in seqlock reader spin loop
  watchdog, hung_task_timeout: Add Kconfig configurable default
  lockdep: Remove cmpxchg to update nr_chain_hlocks
  lockdep: Print a nicer description for simple irq lock inversions
  lockdep: Replace "Bad BFS generated tree" message with something less cryptic
  lockdep: Print a nicer description for irq inversion bugs
  lockdep: Print a nicer description for simple deadlocks
  lockdep: Print a nicer description for normal deadlocks
  lockdep: Print a nicer description for irq lock inversions
2011-05-19 | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6 | Linus Torvalds
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6: (34 commits)
  PM: Introduce generic prepare and complete callbacks for subsystems
  PM: Allow drivers to allocate memory from .prepare() callbacks safely
  PM: Remove CONFIG_PM_VERBOSE
  Revert "PM / Hibernate: Reduce autotuned default image size"
  PM / Hibernate: Add sysfs knob to control size of memory for drivers
  PM / Wakeup: Remove useless synchronize_rcu() call
  kmod: always provide usermodehelper_disable()
  PM / ACPI: Remove acpi_sleep=s4_nonvs
  PM / Wakeup: Fix build warning related to the "wakeup" sysfs file
  PM: Print a warning if firmware is requested when tasks are frozen
  PM / Runtime: Rework runtime PM handling during driver removal
  Freezer: Use SMP barriers
  PM / Suspend: Do not ignore error codes returned by suspend_enter()
  PM: Fix build issue in clock_ops.c for CONFIG_PM_RUNTIME unset
  PM: Revert "driver core: platform_bus: allow runtime override of dev_pm_ops"
  OMAP1 / PM: Use generic clock manipulation routines for runtime PM
  PM: Remove sysdev suspend, resume and shutdown operations
  PM / PowerPC: Use struct syscore_ops instead of sysdevs for PM
  PM / UNICORE32: Use struct syscore_ops instead of sysdevs for PM
  PM / AVR32: Use struct syscore_ops instead of sysdevs for PM
  ...
2011-05-19 | params.c: Use new strtobool function to process boolean inputs | Jonathan Cameron
Signed-off-by: Jonathan Cameron <jic23@cam.ac.uk> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2011-05-19 | module: Use binary search in lookup_symbol() | Alessio Igor Bogani
The function is_exported() and its helper function lookup_symbol() are used to verify whether a provided symbol is actually exported by the kernel or by the modules. Now that both have their symbols sorted we can replace a linear search with a binary search, which provides a considerable speed-up. This work was supported by a hardware donation from the CE Linux Forum. Signed-off-by: Alessio Igor Bogani <abogani@kernel.org> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
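To make the change above concrete, here is a minimal user-space sketch of looking up a name in a table that is already sorted by symbol name. It uses the C library's bsearch() rather than the kernel's own helper, and `struct demo_symbol`, `demo_table` and `demo_lookup_symbol()` are hypothetical names chosen for illustration; they are not the kernel's definitions.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an exported-symbol record. */
struct demo_symbol {
    unsigned long value;
    const char *name;
};

/* The table must be sorted by name for binary search to be valid. */
const struct demo_symbol demo_table[] = {
    { 0x1000, "alpha"   },
    { 0x2000, "bravo"   },
    { 0x3000, "charlie" },
    { 0x4000, "delta"   },
};

/* bsearch() comparator: the key is the name being looked up. */
static int demo_cmp_name(const void *key, const void *elem)
{
    const struct demo_symbol *sym = elem;

    return strcmp(key, sym->name);
}

/* Search the half-open range [start, stop); returns NULL if absent. */
const struct demo_symbol *demo_lookup_symbol(const char *name,
                                             const struct demo_symbol *start,
                                             const struct demo_symbol *stop)
{
    return bsearch(name, start, (size_t)(stop - start),
                   sizeof(*start), demo_cmp_name);
}
```

The lookup cost drops from a number of comparisons proportional to the table size to a number proportional to its logarithm, which is the speed-up the commit message refers to. The sketch only works if the table really is sorted, which is the precondition the message states.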
2011-05-19 | module: Use the binary search for symbols resolution | Alessio Igor Bogani
Takes advantage of the order and locates symbols using binary search. This work was supported by a hardware donation from the CE Linux Forum. Signed-off-by: Alessio Igor Bogani <abogani@kernel.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Tested-by: Dirk Behme <dirk.behme@googlemail.com>
2011-05-19 | module: each_symbol_section instead of each_symbol | Rusty Russell
Instead of having a callback function for each symbol in the kernel, have a callback for each array of symbols. This eases the logic when we move to sorted symbols and binary search. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Alessio Igor Bogani <abogani@kernel.org>
2011-05-19 | module: split unset_section_ro_nx function. | Jan Glauber
Split the unprotect function into a function per section to make the code more readable and add the missing static declaration. Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2011-05-19 | module: undo module RONX protection correctly. | Jan Glauber
While debugging I stumbled over two problems in the code that protects module pages. The first issue is that disabling the protection before freeing init or unloading a module is not symmetric with the enablement. For instance, if pages are set to RO, the page range from module_core to module_core + core_ro_size is protected. If a module is unloaded, the page range from module_core to module_core + core_size is set back to RW. So pages that were not set to RO are also changed to RW. This is not critical but IMHO it should be symmetric. The second issue is that while set_memory_rw & set_memory_ro are used for RO/RW changes, only set_memory_nx is involved for NX/X. One would expect the inverse function to be called when the NX protection should be removed, which is not the case here, unless I'm missing something. Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2011-05-19 | module: zero mod->init_ro_size after init is freed. | Jan Glauber
Reset mod->init_ro_size to zero after the init part of a module is unloaded. Otherwise we need to check if module->init is NULL in the unprotect functions in the next patch. Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2011-05-19 | minor ANSI prototype sparse fix | Daniel J Blueman
Fix function prototype to be ANSI-C compliant, consistent with other function prototypes, addressing a sparse warning. Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2011-05-19 | module: deal with alignment issues in built-in module versions | Dmitry Torokhov
On m68k natural alignment is 2-byte boundary but we are trying to align structures in __modver section on sizeof(void *) boundary. This causes trouble when we try to access elements in this section in array-like fashion when create "version" attributes for built-in modules. Moreover, as DaveM said, we can't reliably put structures into independent objects, put them into a special section, and then expect array access over them (via the section boundaries) after linking the objects together to just "work" due to variable alignment choices in different situations. The only solution that seems to work reliably is to make an array of plain pointers to the objects in question and put those pointers in the special section. Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Dmitry Torokhov <dtor@vmware.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
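The pointer-array approach described above can be sketched in user space as follows. This is only an illustration under stated assumptions: it relies on GCC or Clang with an ELF linker that synthesizes `__start_<section>` and `__stop_<section>` symbols, and the names `struct version_attr`, `DEMO_MODULE_VERSION`, `demo_modver` and `demo_find_version()` are hypothetical, not the kernel's macros or section names.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a per-module version record. */
struct version_attr {
    const char *module_name;
    const char *version;
};

/*
 * Define the record as an ordinary object, then place only a plain
 * pointer to it in the special section. Pointers all share one size
 * and alignment, so the section can be walked as an array no matter
 * how each object file chose to align the records themselves.
 */
#define DEMO_MODULE_VERSION(tag, name, ver)                          \
    static const struct version_attr tag##_attr = { name, ver };     \
    static const struct version_attr *const tag##_ptr                \
        __attribute__((used, section("demo_modver"))) = &tag##_attr

DEMO_MODULE_VERSION(foo, "foo", "1.0");
DEMO_MODULE_VERSION(bar, "bar", "2.3");

/* Section bounds, provided by the linker (ELF toolchains only). */
extern const struct version_attr *const __start_demo_modver[];
extern const struct version_attr *const __stop_demo_modver[];

/* Walk the pointer array between the section bounds. */
const char *demo_find_version(const char *name)
{
    const struct version_attr *const *p;

    for (p = __start_demo_modver; p < __stop_demo_modver; p++)
        if (strcmp((*p)->module_name, name) == 0)
            return (*p)->version;
    return NULL;
}
```

Iterating over the records directly would depend on every object file agreeing on the record's padding and alignment, which is the assumption the commit message says cannot be relied on.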
2011-05-18 | irq: Export functions to allow modular irq drivers | Jonathan Cameron
Export handle_simple_irq, irq_modify_status, irq_alloc_descs, irq_free_descs and generic_handle_irq to allow their usage in modules. First user is IIO, which wants to be built modular, but needs to be able to create irq chips, allocate and configure interrupt descriptors and handle demultiplexing interrupts. [ tglx: Moved the uninlining of generic_handle_irq to a separate patch ] Signed-off-by: Jonathan Cameron <jic23@cam.ac.uk> Link: http://lkml.kernel.org/r/%3C1305711544-505-1-git-send-email-jic23%40cam.ac.uk%3E Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-05-18 | genirq: Uninline and sanity check generic_handle_irq() | Thomas Gleixner
generic_handle_irq() is missing a NULL pointer check for the result of irq_to_desc. This was not a big problem, but we want to expose it to drivers, so we better have sanity checks in place. Add a return value as well, which indicates that the irq number was valid and the handler was invoked. Based on the pure code move from Jonathan Cameron. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jonathan Cameron <jic23@cam.ac.uk>
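The shape of the check described above can be modelled in user space. The sketch below is not the kernel code: `struct demo_irq_desc`, `demo_irq_to_desc()`, `demo_irq_set_desc()`, `demo_count_handler()` and `demo_generic_handle_irq()` are hypothetical, and the choice of -EINVAL for an unknown interrupt number is an assumption made for the example.

```c
#include <errno.h>
#include <stddef.h>

#define DEMO_NR_IRQS 8

/* Hypothetical, much reduced interrupt descriptor. */
struct demo_irq_desc {
    void (*handler)(unsigned int irq, struct demo_irq_desc *desc);
    unsigned int count;
};

static struct demo_irq_desc *demo_descs[DEMO_NR_IRQS];

/* Returns NULL when no descriptor exists for this number. */
static struct demo_irq_desc *demo_irq_to_desc(unsigned int irq)
{
    return irq < DEMO_NR_IRQS ? demo_descs[irq] : NULL;
}

/* Install a descriptor; out-of-range numbers are ignored. */
void demo_irq_set_desc(unsigned int irq, struct demo_irq_desc *desc)
{
    if (irq < DEMO_NR_IRQS)
        demo_descs[irq] = desc;
}

/* Example flow handler: just counts invocations. */
void demo_count_handler(unsigned int irq, struct demo_irq_desc *desc)
{
    (void)irq;
    desc->count++;
}

/*
 * Check the lookup result before dereferencing it, and report to the
 * caller whether the handler actually ran.
 */
int demo_generic_handle_irq(unsigned int irq)
{
    struct demo_irq_desc *desc = demo_irq_to_desc(irq);

    if (!desc)
        return -EINVAL;
    desc->handler(irq, desc);
    return 0;
}
```

A demultiplexing driver can then tell a bogus child interrupt number apart from one that was delivered, instead of crashing on a NULL descriptor.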
2011-05-18 | genirq: Remove pointless ifdefs | Thomas Gleixner
kernel/irq/ is only built when CONFIG_GENERIC_HARDIRQS=y. So making code inside of kernel/irq/ conditional on CONFIG_GENERIC_HARDIRQS is pointless. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-05-17 | PM: Allow drivers to allocate memory from .prepare() callbacks safely | Rafael J. Wysocki
If device drivers allocate substantial amounts of memory (above 1 MB) in their hibernate .freeze() callbacks (or in their legacy suspend callbacks during hibernation), the subsequent creation of hibernate image may fail due to the lack of memory. This is the case, because the drivers' .freeze() callbacks are executed after the hibernate memory preallocation has been carried out and the preallocated amount of memory may be too small to cover the new driver allocations. Unfortunately, the drivers' .prepare() callbacks also are executed after the hibernate memory preallocation has completed, so they are not suitable for allocating additional memory either. Thus the only way a driver can safely allocate memory during hibernation is to use a hibernate/suspend notifier. However, the notifiers are called before the freezing of user space and the drivers wanting to use them for allocating additional memory may not know how much memory needs to be allocated at that point. To let device drivers overcome this difficulty rework the hibernation sequence so that the memory preallocation is carried out after the drivers' .prepare() callbacks have been executed, so that the .prepare() callbacks can be used for allocating additional memory to be used by the drivers' .freeze() callbacks. Update documentation to match the new behavior of the code. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-05-17 | PM: Remove CONFIG_PM_VERBOSE | Rafael J. Wysocki
Now that we have CONFIG_DYNAMIC_DEBUG there is no need for yet another flag causing dev_dbg() and pr_debug() statements in the core PM code to produce output. Moreover, CONFIG_PM_VERBOSE causes so much output to be generated that it's not really useful and almost no one sets it. References: https://bugzilla.kernel.org/show_bug.cgi?id=23182 Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-05-17 | Merge branch 'power-domains' into for-linus | Rafael J. Wysocki
* power-domains:
  PM: Fix build issue in clock_ops.c for CONFIG_PM_RUNTIME unset
  PM: Revert "driver core: platform_bus: allow runtime override of dev_pm_ops"
  OMAP1 / PM: Use generic clock manipulation routines for runtime PM
  PM / Runtime: Generic clock manipulation rountines for runtime PM (v6)
  PM / Runtime: Add subsystem data field to struct dev_pm_info
  OMAP2+ / PM: move runtime PM implementation to use device power domains
  PM / Platform: Use generic runtime PM callbacks directly
  shmobile: Use power domains for platform runtime PM
  PM: Export platform bus type's default PM callbacks
  PM: Make power domain callbacks take precedence over subsystem ones
2011-05-17 | Merge branch 'syscore' into for-linus | Rafael J. Wysocki
* syscore:
  PM: Remove sysdev suspend, resume and shutdown operations
  PM / PowerPC: Use struct syscore_ops instead of sysdevs for PM
  PM / UNICORE32: Use struct syscore_ops instead of sysdevs for PM
  PM / AVR32: Use struct syscore_ops instead of sysdevs for PM
  PM / Blackfin: Use struct syscore_ops instead of sysdevs for PM
  ARM / Samsung: Use struct syscore_ops for "core" power management
  ARM / PXA: Use struct syscore_ops for "core" power management
  ARM / SA1100: Use struct syscore_ops for "core" power management
  ARM / Integrator: Use struct syscore_ops for core PM
  ARM / OMAP: Use struct syscore_ops for "core" power management
  ARM: Use struct syscore_ops instead of sysdevs for PM in common code
2011-05-17 | Revert "PM / Hibernate: Reduce autotuned default image size" | Rafael J. Wysocki
This reverts commit bea3864fb627d110933cfb8babe048b63c4fc76e (PM / Hibernate: Reduce autotuned default image size), because users are now able to resolve the issue this commit was supposed to address in a different way (i.e. by using the new /sys/power/reserved_size interface). Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-05-17 | PM / Hibernate: Add sysfs knob to control size of memory for drivers | Rafael J. Wysocki
Martin reports that on his system hibernation occasionally fails due to the lack of memory, because the radeon driver apparently allocates too much of it during the device freeze stage. It turns out that the amount of memory allocated by radeon during hibernation (and presumably during system suspend too) depends on the utilization of the GPU (e.g. hibernating while there are two KDE 4 sessions with compositing enabled causes radeon to allocate more memory than for one KDE 4 session). In principle it should be possible to use image_size to make the memory preallocation mechanism free enough memory for the radeon driver, but in practice it is not easy to guess the right value because of the way the preallocation code uses image_size. For this reason, it seems reasonable to allow users to control the amount of memory reserved for driver allocations made after the hibernate preallocation, which currently is constant and amounts to 1 MB. Introduce a new sysfs file, /sys/power/reserved_size, whose value will be used as the amount of memory to reserve for the post-preallocation reservations made by device drivers, in bytes. For backwards compatibility, set its default (and initial) value to the currently used number (1 MB). References: https://bugzilla.kernel.org/show_bug.cgi?id=34102 Reported-and-tested-by: Martin Steigerwald <Martin@Lichtvoll.de> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-05-17 | kmod: always provide usermodehelper_disable() | Kay Sievers
We need to prevent kernel-forked processes during system poweroff. Such processes try to access the filesystem whose disks we are trying to shutdown at the same time. This causes delays and exceptions in the storage drivers. A follow-up patch will add these calls and need usermodehelper_disable() also on systems without suspend support. Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-05-17 | PM: Print a warning if firmware is requested when tasks are frozen | Rafael J. Wysocki
Some drivers erroneously use request_firmware() from their ->resume() (or ->thaw(), or ->restore()) callbacks, which is not going to work unless the firmware has been built in. This causes system resume to stall until the firmware-loading timeout expires, which makes users think that the resume has failed and reboot their machines unnecessarily. For this reason, make _request_firmware() print a warning and return immediately with error code if it has been called when tasks are frozen and it's impossible to start any new usermode helpers. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Reviewed-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
2011-05-17 | Freezer: Use SMP barriers | Mike Frysinger
The freezer processes are dealing with multiple threads running simultaneously, and on a UP system, the memory reads/writes do not need barriers to keep things in sync. These are only needed on SMP systems, so use SMP barriers instead. Signed-off-by: Mike Frysinger <vapier@gentoo.org> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-05-17 | PM / Suspend: Do not ignore error codes returned by suspend_enter() | MyungJoo Ham
The current implementation of suspend-to-RAM returns 0 if there is an error from suspend_enter(), because suspend_devices_and_enter() ignores the return value from suspend_enter(). This patch addresses this issue: it properly keeps the error return from suspend_enter() and lets suspend_devices_and_enter() relay it to the caller. Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
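As a rough illustration of the bug class described above, the following user-space sketch contrasts a caller that drops a callee's error code with one that relays it. All names here (`demo_suspend_enter()`, `demo_enter_ignoring_errors()`, `demo_enter_relaying_errors()`) are hypothetical, and the -EIO failure is an arbitrary choice for the example; none of this is the kernel's code.

```c
#include <errno.h>

/* Hypothetical low-level step that can fail. */
static int demo_suspend_enter(int simulate_failure)
{
    return simulate_failure ? -EIO : 0;
}

/* Old shape: the result is discarded, so callers always see success. */
int demo_enter_ignoring_errors(int simulate_failure)
{
    demo_suspend_enter(simulate_failure);
    return 0;
}

/* Fixed shape: keep the error and hand it back to the caller. */
int demo_enter_relaying_errors(int simulate_failure)
{
    int error = demo_suspend_enter(simulate_failure);

    return error;
}
```

With the first form, user space would see a suspend request report success even though the system never entered the sleep state.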
2011-05-17 | Merge branch 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip | Linus Torvalds
* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  tick: Clear broadcast active bit when switching to oneshot
  rtc: mc13xxx: Don't call rtc_device_register while holding lock
  rtc: rp5c01: Initialize drvdata before registering device
  rtc: pcap: Initialize drvdata before registering device
  rtc: msm6242: Initialize drvdata before registering device
  rtc: max8998: Initialize drvdata before registering device
  rtc: max8925: Initialize drvdata before registering device
  rtc: m41t80: Initialize clientdata before registering device
  rtc: ds1286: Initialize drvdata before registering device
  rtc: ep93xx: Initialize drvdata before registering device
  rtc: davinci: Initialize drvdata before registering device
  rtc: mxc: Initialize drvdata before registering device
  clocksource: Install completely before selecting
2011-05-16 | tick: Clear broadcast active bit when switching to oneshot | Thomas Gleixner
The first cpu which switches from periodic to oneshot mode switches also the broadcast device into oneshot mode. The broadcast device serves as a backup for per cpu timers which stop in deeper C-states. To avoid starvation of the cpus which might be in idle and depend on broadcast mode it marks the other cpus as broadcast active and sets the broadcast expiry value of those cpus to the next tick. The oneshot mode broadcast bit for the other cpus is sticky and gets only cleared when those cpus exit idle. If a cpu was not idle while the bit got set, the bit consequently prevents the broadcast device from being armed on behalf of that cpu when it enters idle for the first time after it switched to oneshot mode. In most cases that goes unnoticed as one of the other cpus has usually a timer pending which keeps the broadcast device armed with a short timeout. Now if the only cpu which has a short timer active has the bit set then the broadcast device will not be armed on behalf of that cpu and will fire way after the expected timer expiry. In the case of Christian's bug report it took ~145 seconds which is about half of the wrap around time of HPET (the limit for that device) due to the fact that all other cpus had no timers armed which expired before the 145 seconds timeframe. The solution is simply to clear the broadcast active bit unconditionally when a cpu switches to oneshot mode after the first cpu switched the broadcast device over. It's not idle at that point otherwise it would not be executing that code. [ I fundamentally hate that broadcast crap. Why the heck thought some folks that when going into deep idle it's a brilliant concept to switch off the last device which brings the cpu back from that state? ] Thanks to Christian for providing all the valuable debug information!
Reported-and-tested-by: Christian Hoffmann <email@christianhoffmann.info> Cc: John Stultz <johnstul@us.ibm.com> Link: http://lkml.kernel.org/r/%3Calpine.LFD.2.02.1105161105170.3078%40ionos%3E Cc: stable@kernel.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-05-13 | Cache user_ns in struct cred | Serge E. Hallyn
If !CONFIG_USERNS, have current_user_ns() defined to (&init_user_ns). Get rid of _current_user_ns. This requires nsown_capable() to be defined in capability.c rather than as static inline in capability.h, so do that. Request_key needs init_user_ns defined at current_user_ns if !CONFIG_USERNS, so forward-declare that in cred.h if !CONFIG_USERNS at current_user_ns() define. Compile-tested with and without CONFIG_USERNS. Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com> [ This makes a huge performance difference for acl_permission_check(), up to 30%. And that is one of the hottest kernel functions for loads that are pathname-lookup heavy. ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-11 | PM: Remove sysdev suspend, resume and shutdown operations | Rafael J. Wysocki
Since suspend, resume and shutdown operations in struct sysdev_class and struct sysdev_driver are not used any more, remove them. Also drop sysdev_suspend(), sysdev_resume() and sysdev_shutdown() used for executing those operations and modify all of their users accordingly. This reduces kernel code size quite a bit and reduces its complexity. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-05-11 | PM / Hibernate: Fix ioctl SNAPSHOT_S2RAM | Rafael J. Wysocki
The SNAPSHOT_S2RAM ioctl used for implementing the feature allowing one to suspend to RAM after creating a hibernation image is currently broken, because it doesn't clear the "ready" flag in the struct snapshot_data object handled by it. As a result, the SNAPSHOT_UNFREEZE doesn't work correctly after SNAPSHOT_S2RAM has returned and the user space hibernate task cannot thaw the other processes as appropriate. Make SNAPSHOT_S2RAM clear data->ready to fix this problem. Tested-by: Alexandre Felipe Muller de Souza <alexandrefm@mandriva.com.br> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: stable@kernel.org
2011-05-11 | PM / Hibernate: Make snapshot_release() restore GFP mask | Rafael J. Wysocki
If the process using the hibernate user space interface closes /dev/snapshot after creating a hibernation image without thawing tasks, snapshot_release() should call pm_restore_gfp_mask() to restore the GFP mask used before the creation of the image. Make that happen. Tested-by: Alexandre Felipe Muller de Souza <alexandrefm@mandriva.com.br> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: stable@kernel.org
2011-05-11 | PM: Fix warning in pm_restrict_gfp_mask() during SNAPSHOT_S2RAM ioctl | Rafael J. Wysocki
A warning is printed by pm_restrict_gfp_mask() while the SNAPSHOT_S2RAM ioctl is being executed after creating a hibernation image, because pm_restrict_gfp_mask() has been called once already before the image creation and suspend_devices_and_enter() calls it once again. This happens after commit 452aa6999e6703ffbddd7f6ea124d3 (mm/pm: force GFP_NOIO during suspend/hibernation and resume). To avoid this issue, move pm_restrict_gfp_mask() and pm_restore_gfp_mask() from suspend_devices_and_enter() to its caller in kernel/power/suspend.c. Reported-by: Alexandre Felipe Muller de Souza <alexandrefm@mandriva.com.br> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: stable@kernel.org
2011-05-07 | Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip | Linus Torvalds
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  perf tools: Makefile: Use gcc to determine ARCH
  perf events, x86: Fix Intel Nehalem and Westmere last level cache event definitions
  hw_breakpoints, powerpc: Fix CONFIG_HAVE_HW_BREAKPOINT off-case in ptrace_set_debugreg()
  sh, hw_breakpoints: Fix racy access to ptrace breakpoints
  arm, hw_breakpoints: Fix racy access to ptrace breakpoints
  powerpc, hw_breakpoints: Fix racy access to ptrace breakpoints
  x86, hw_breakpoints: Fix racy access to ptrace breakpoints
  ptrace: Prepare to fix racy accesses on task breakpoints
2011-05-06 | Regression: partial revert "tracing: Remove lock_depth from event entry" | Arjan van de Ven
This partially reverts commit e6e1e2593592a8f6f6380496655d8c6f67431266. That commit changed the structure layout of the trace structure, which in turn broke PowerTOP (1.9x generation) quite badly. I appreciate not wanting to expose the variable in question, and PowerTOP was not using it, so I've replaced the variable with just a padding field - that way if in the future a new field is needed it can just use this padding field. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-06 | Merge branch 'master' of ssh://master.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 into perf/urgent | Ingo Molnar
2011-05-05 | clocksource: Install completely before selecting | john stultz
Christian Hoffmann reported that the command line clocksource override with acpi_pm timer fails: Kernel command line: <SNIP> clocksource=acpi_pm hpet clockevent registered Switching to clocksource hpet Override clocksource acpi_pm is not HRT compatible. Cannot switch while in HRT/NOHZ mode. The watchdog code is what enables CLOCK_SOURCE_VALID_FOR_HRES, but we actually end up selecting the clocksource before we enqueue it into the watchdog list, so that's why we see the warning and fail to switch to acpi_pm timer as requested. That's particularly bad when we want to debug timekeeping related problems in early boot. Put the selection call last. Reported-by: Christian Hoffmann <email@christianhoffmann.info> Signed-off-by: John Stultz <johnstul@us.ibm.com> Cc: stable@kernel.org # 32... Link: http://lkml.kernel.org/r/%3C1304558210.2943.24.camel%40work-vm%3E Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-05-04 | Merge branch 'perf/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into perf/urgent | Ingo Molnar
2011-05-02 | genirq: Fix typo CONFIG_GENIRC_IRQ_SHOW_LEVEL | Geert Uytterhoeven
commit ab7798ffcf98b11a9525cf65bacdae3fd58d357f ("genirq: Expand generic show_interrupts()") added the Kconfig option GENERIC_IRQ_SHOW_LEVEL to accommodate PowerPC, but this doesn't actually enable the functionality due to a typo in the #ifdef check. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Linux/PPC Development <linuxppc-dev@lists.ozlabs.org> Link: http://lkml.kernel.org/r/%3Calpine.DEB.2.00.1104302251370.19068%40ayla.of.borg%3E Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-05-02 | genirq: Make generic irq chip depend on CONFIG_GENERIC_IRQ_CHIP | Thomas Gleixner
Only compile it in when there are users. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: linux-arm-kernel@lists.infradead.org
2011-04-30 | Merge branch 'fixes-2.6.39' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq | Linus Torvalds
* 'fixes-2.6.39' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: fix deadlock in worker_maybe_bind_and_lock()
  workqueue: Document debugging tricks
Fix up trivial spelling conflict in kernel/workqueue.c
2011-04-30 | PM / Runtime: Generic clock manipulation rountines for runtime PM (v6) | Rafael J. Wysocki
Many different platforms and subsystems may want to disable device clocks during suspend and enable them during resume which is going to be done in a very similar way in all those cases. For this reason, provide generic routines for the manipulation of device clocks during suspend and resume. Convert the ARM shmobile platform to using the new routines. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-04-29 | Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip | Linus Torvalds
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  perf, x86, nmi: Move LVT un-masking into irq handlers
  perf events, x86: Work around the Nehalem AAJ80 erratum
  perf, x86: Fix BTS condition
  ftrace: Build without frame pointers on Microblaze
2011-04-29 | Merge branch 'timer-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip | Linus Torvalds
* 'timer-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  hrtimer: Initialize CLOCK_ID to HRTIMER_BASE table statically
  rtc: max8925: Call dev_set_drvdata before rtc_device_register
2011-04-29 | workqueue: fix deadlock in worker_maybe_bind_and_lock() | Tejun Heo
If a rescuer and stop_machine() bringing down a CPU race with each other, they may deadlock on non-preemptive kernel. The CPU won't accept a new task, so the rescuer can't migrate to the target CPU, while stop_machine() can't proceed because the rescuer is holding one of the CPU retrying migration. GCWQ_DISASSOCIATED is never cleared and worker_maybe_bind_and_lock() retries indefinitely. This problem can be reproduced semi reliably while the system is entering suspend. http://thread.gmane.org/gmane.linux.kernel/1122051 A lot of kudos to Thilo-Alexander for reporting this tricky issue and painstaking testing. stable: This affects all kernels with cmwq, so all kernels since and including v2.6.36 need this fix. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Thilo-Alexander Ginkel <thilo@ginkel.com> Tested-by: Thilo-Alexander Ginkel <thilo@ginkel.com> Cc: stable@kernel.org
2011-04-29 | hrtimer: Initialize CLOCK_ID to HRTIMER_BASE table statically | Thomas Gleixner
Sedat and Bruno reported RCU stalls which turned out to be caused by the following: sched_init() calls init_rt_bandwidth() which calls hrtimer_init() _BEFORE_ hrtimers_init() is called. While not entirely correct this worked because hrtimer_init() only accessed statically initialized data (hrtimer_bases.clock_base[CLOCK_MONOTONIC]). Commit e06383db9 (hrtimers: extend hrtimer base code to handle more then 2 clockids) added an indirection to the hrtimer_bases.clock_base lookup to avoid gap handling in the hot path. The table which is used for the translation from CLOCK_ID to HRTIMER_BASE index is initialized at runtime in hrtimers_init(). So the early call of the scheduler code translates CLOCK_MONOTONIC to HRTIMER_BASE_REALTIME. Thus the rt_bandwidth timer ends up on CLOCK_REALTIME. If the timer is armed and the wall clock time is set (e.g. ntpdate in the early boot process - which also gives the problem deterministic behaviour i.e. magic recovery after N hours), then the timer ends up with an expiry time far into the future. That breaks the RT throttler mechanism as rt runtime is accumulated and never cleared, so the rt throttler detects a false cpu hog condition and blocks all RT tasks until the timer finally expires. That in turn stalls the RCU thread of TINYRCU which leads to a huge amount of RCU callbacks piling up. Make the translation table statically initialized, so we are back to the status of <= 2.6.39. Reported-and-tested-by: Sedat Dilek <sedat.dilek@gmail.com> Reported-by: Bruno Prémont <bonbons@linux-vserver.org> Cc: John stultz <johnstul@us.ibm.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/%3Calpine.LFD.2.02.1104282353140.3005%40ionos%3E Reviewed-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
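The failure mode described above, a zero-filled lookup table consulted before its runtime initializer has run, can be reproduced in a few lines of user-space C. The enum values and all `demo_` names below are hypothetical stand-ins, not the kernel's definitions; the static table uses C99 designated initializers, which is one way to make such a table correct from the first instruction.

```c
/* Hypothetical clock ids; note the gap between 1 and 7. */
enum demo_clockid {
    DEMO_CLOCK_REALTIME  = 0,
    DEMO_CLOCK_MONOTONIC = 1,
    DEMO_CLOCK_BOOTTIME  = 7,
    DEMO_MAX_CLOCKS      = 16
};

/* Dense base indices; REALTIME happens to be 0. */
enum demo_base {
    DEMO_BASE_REALTIME = 0,
    DEMO_BASE_MONOTONIC,
    DEMO_BASE_BOOTTIME
};

/* Runtime-filled table: every entry is 0 until the init call runs. */
static int demo_runtime_table[DEMO_MAX_CLOCKS];

void demo_runtime_init(void)
{
    demo_runtime_table[DEMO_CLOCK_REALTIME]  = DEMO_BASE_REALTIME;
    demo_runtime_table[DEMO_CLOCK_MONOTONIC] = DEMO_BASE_MONOTONIC;
    demo_runtime_table[DEMO_CLOCK_BOOTTIME]  = DEMO_BASE_BOOTTIME;
}

int demo_runtime_lookup(enum demo_clockid id)
{
    return demo_runtime_table[id];
}

/* Statically initialized table: valid before any init code runs. */
static const int demo_static_table[DEMO_MAX_CLOCKS] = {
    [DEMO_CLOCK_REALTIME]  = DEMO_BASE_REALTIME,
    [DEMO_CLOCK_MONOTONIC] = DEMO_BASE_MONOTONIC,
    [DEMO_CLOCK_BOOTTIME]  = DEMO_BASE_BOOTTIME,
};

int demo_static_lookup(enum demo_clockid id)
{
    return demo_static_table[id];
}
```

Before `demo_runtime_init()` is called, `demo_runtime_lookup(DEMO_CLOCK_MONOTONIC)` yields the realtime base because index 0 is what a zeroed table contains, mirroring how the early scheduler call ended up on the wrong clock base.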
2011-04-28 | kernel/watchdog.c: disable nmi perf event in the error path of enabling watchdog | Hillf Danton
In corner cases where softlockup watchdog is not setup successfully, the relevant nmi perf event for hardlockup watchdog could be disabled, then the status of the underlying hardware remains unchanged. Also, if the kthread doesn't start then the hrtimer won't run and the hardlockup detector will falsely fire. Signed-off-by: Hillf Danton <dhillf@gmail.com> Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-04-28 | watchdog, hung_task_timeout: Add Kconfig configurable default | Jeff Mahoney
This patch allows the default value for sysctl_hung_task_timeout_secs to be set at build time. The feature carries virtually no overhead, so it makes sense to keep it enabled. On heavily loaded systems, though, it can end up triggering stack traces when there is no bug other than the system being underprovisioned. We use this patch to keep the hung task facility available but disabled at boot-time. The default of 120 seconds is preserved. As a note, commit e162b39a may have accidentally reverted commit fb822db4, which raised the default from 120 seconds to 480 seconds. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Acked-by: Mandeep Singh Baines <msb@google.com> Link: http://lkml.kernel.org/r/4DB8600C.8080000@suse.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-27 | Merge branch 'tip/perf/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/urgent | Ingo Molnar
2011-04-25 | ptrace: Prepare to fix racy accesses on task breakpoints | Frederic Weisbecker
When a task is traced and is in a stopped state, the tracer may execute a ptrace request to examine the tracee state and get its task struct. Right after, the tracee can be killed and thus its breakpoints released. This can happen concurrently when the tracer is in the middle of reading or modifying these breakpoints, leading to dereferencing a freed pointer. Hence, to prepare the fix, create a generic breakpoint reference holding API. When a reference on the breakpoints of a task is held, the breakpoints won't be released until the last reference is dropped. After that, no more ptrace request on the task's breakpoints can be serviced for the tracer. Reported-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Will Deacon <will.deacon@arm.com> Cc: Prasad <prasad@linux.vnet.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: v2.6.33.. <stable@kernel.org> Link: http://lkml.kernel.org/r/1302284067-7860-2-git-send-email-fweisbec@gmail.com
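A reference-holding API of the kind described above can be sketched with C11 atomics. This is a user-space model under stated assumptions, not the kernel implementation: `struct demo_task`, `demo_task_init()`, `demo_get_breakpoints()` and `demo_put_breakpoints()` are hypothetical names, and the scheme assumed here is that the task itself owns one initial reference which it drops when it exits.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical task with a reference count guarding its breakpoints. */
struct demo_task {
    atomic_int bp_refcnt;
    bool bps_released;   /* stands in for actually freeing them */
};

/* The task starts out holding one reference on its own breakpoints. */
void demo_task_init(struct demo_task *t)
{
    atomic_init(&t->bp_refcnt, 1);
    t->bps_released = false;
}

/*
 * Take a reference unless the count has already reached zero, in
 * which case the breakpoints are gone and the request must fail.
 */
bool demo_get_breakpoints(struct demo_task *t)
{
    int old = atomic_load(&t->bp_refcnt);

    while (old != 0) {
        /* On failure, old is reloaded with the current value. */
        if (atomic_compare_exchange_weak(&t->bp_refcnt, &old, old + 1))
            return true;
    }
    return false;
}

/* Drop a reference; whoever drops the last one releases the data. */
void demo_put_breakpoints(struct demo_task *t)
{
    if (atomic_fetch_sub(&t->bp_refcnt, 1) == 1)
        t->bps_released = true;
}
```

In this model a tracer brackets each access with get and put, so a tracee that exits in the middle only drops its own reference and the data stays valid until the tracer is done; any later get fails, matching the statement that no further requests can be serviced.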
2011-04-23 | Merge branch 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6 | Linus Torvalds
* 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6:
  PM: Add missing syscore_suspend() and syscore_resume() calls
  PM: Fix error code paths executed after failing syscore_suspend()