summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2009-11-26sched: Limit the number of scheduler debug messagesMike Travis
Remove the verbose scheduler debug messages unless kernel parameter "sched_debug" set. /proc/sched_debug unchanged. Signed-off-by: Mike Travis <travis@sgi.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Roland Dreier <rdreier@cisco.com> Cc: Randy Dunlap <rdunlap@xenotime.net> Cc: Tejun Heo <tj@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: Greg Kroah-Hartman <gregkh@suse.de> Cc: Yinghai Lu <yhlu.kernel@gmail.com> Cc: David Rientjes <rientjes@google.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Jack Steiner <steiner@sgi.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20091118002221.489305000@alcatraz.americas.sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-25sched.c: Call debug_show_all_locks() when dumping all tasksShmulik Ladkani
In commit v2.6.21-691-g39bc89f ("make SysRq-T show all tasks again") the interface of show_state_filter() was changed: zero valued 'state_filter' specifies "dump all tasks" (instead of -1). However, the condition for calling debug_show_all_locks() ("show locks if all tasks are dumped") was not updated accordingly. Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com> Cc: peterz@infradead.org LKML-Reference: <4b0d2fe4.0ab6660a.6437.3cfc@mx.google.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-24sched, x86: Optimize branch hint in __switch_to()Tim Blechmann
Branch hint profiling on my nehalem machine showed 96% incorrect branch hints: 6548732 174664120 96 __switch_to process_64.c 406 6548745 174565593 96 __switch_to process_64.c 410 Signed-off-by: Tim Blechmann <tim@klingt.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <4B0BBB93.3080307@klingt.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-24sched: Optimize branch hint in context_switch()Tim Blechmann
Branch hint profiling on my nehalem machine showed over 90% incorrect branch hints: 10420275 170645395 94 context_switch sched.c 3043 10408421 171098521 94 context_switch sched.c 3050 Signed-off-by: Tim Blechmann <tim@klingt.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <4B0BBB9F.6080304@klingt.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-24sched: Optimize branch hint in pick_next_task_fair()Tim Blechmann
Branch hint profiling on my nehalem machine showed 90% incorrect branch hints: 15728471 158903754 90 pick_next_task_fair sched_fair.c 1555 Signed-off-by: Tim Blechmann <tim@klingt.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <4B0BBBB1.2050100@klingt.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-23sched_feat_write(): Update ppos instead of file->f_posJan Blunck
sched_feat_write() should update ppos instead of file->f_pos. (This reduces some BKL dependencies of this code.) Signed-off-by: Jan Blunck <jblunck@suse.de> Cc: jkacur@redhat.com Cc: Arnd Bergmann <arnd@arndb.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jamie Lokier <jamie@shareable.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> LKML-Reference: <1258735245-25826-8-git-send-email-jblunck@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-13sched: More generic WAKE_AFFINE vs select_idle_sibling()Peter Zijlstra
Instead of only considering SD_WAKE_AFFINE | SD_PREFER_SIBLING domains also allow all SD_PREFER_SIBLING domains below a SD_WAKE_AFFINE domain to change the affinity target. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <20091112145610.909723612@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-13sched: Cleanup select_task_rq_fair()Peter Zijlstra
Clean up the new affine to idle sibling bits while trying to grok them. Should not have any function differences. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <20091112145610.832503781@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-12sched: Fix granularity of task_u/stime()Hidetoshi Seto
Originally task_s/utime() were designed to return clock_t but later changed to return cputime_t by following commit: commit efe567fc8281661524ffa75477a7c4ca9b466c63 Author: Christian Borntraeger <borntraeger@de.ibm.com> Date: Thu Aug 23 15:18:02 2007 +0200 It only changed the type of return value, but not the implementation. As the result the granularity of task_s/utime() is still that of clock_t, not that of cputime_t. So using task_s/utime() in __exit_signal() makes values accumulated to the signal struct to be rounded and coarse grained. This patch removes casts to clock_t in task_u/stime(), to keep granularity of cputime_t over the calculation. v2: Use div_u64() to avoid error "undefined reference to `__udivdi3`" on some 32bit systems. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: xiyou.wangcong@gmail.com Cc: Spencer Candland <spencer@bluehost.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Stanislaw Gruszka <sgruszka@redhat.com> LKML-Reference: <4AFB9029.9000208@jp.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-10sched: Make sure task has correct sched_class after policy changePeter Zijlstra
From the code in rt_mutex_setprio(), it is evident that the intention is that task's with a RT 'prio' value as a consequence of receiving a PI boost also have their 'sched_class' field set to '&rt_sched_class'. However, Peter noticed that the code in __setscheduler() could result in this intention being frustrated. Fix it. Reported-by: Peter Williams <pwil3058@bigpond.net.au> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <1257880321.4108.457.camel@laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-10sched: Fix and clean up rate-limit newidle codeMike Galbraith
Commit 1b9508f, "Rate-limit newidle" has been confirmed to fix the netperf UDP loopback regression reported by Alex Shi. This is a cleanup and a fix: - moved to a more out of the way spot - fix to ensure that balancing doesn't try to balance runqueues which haven't gone online yet, which can mess up CPU enumeration during boot. Reported-by: Alex Shi <alex.shi@intel.com> Reported-by: Zhang, Yanmin <yanmin_zhang@linux.intel.com> Signed-off-by: Mike Galbraith <efault@gmx.de> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: <stable@kernel.org> # .32.x: a1f84a3: sched: Check for an idle shared cache Cc: <stable@kernel.org> # .32.x: 1b9508f: sched: Rate-limit newidle Cc: <stable@kernel.org> # .32.x: fd21073: sched: Fix affinity logic Cc: <stable@kernel.org> # .32.x LKML-Reference: <1257821402.5648.17.camel@marge.simson.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-08sched, no_hz: Remove unused rq->last_tick_seen fieldLai Jiangshan
In 15934a37324f32e0fda633dc7984a671ea81cd75, field last_tick_seen is added to struct rq. But it is unused now. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Guillaume Chazarain <guichaz@yahoo.fr> LKML-Reference: <4AE6A513.6010100@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-05sched: Fix affinity logic in select_task_rq_fair()Mike Galbraith
Ingo Molnar reported: [ 26.804000] BUG: using smp_processor_id() in preemptible [00000000] code: events/1/10 [ 26.808000] caller is vmstat_update+0x26/0x70 [ 26.812000] Pid: 10, comm: events/1 Not tainted 2.6.32-rc5 #6887 [ 26.816000] Call Trace: [ 26.820000] [<c1924a24>] ? printk+0x28/0x3c [ 26.824000] [<c13258a0>] debug_smp_processor_id+0xf0/0x110 [ 26.824000] mount used greatest stack depth: 1464 bytes left [ 26.828000] [<c111d086>] vmstat_update+0x26/0x70 [ 26.832000] [<c1086418>] worker_thread+0x188/0x310 [ 26.836000] [<c10863b7>] ? worker_thread+0x127/0x310 [ 26.840000] [<c108d310>] ? autoremove_wake_function+0x0/0x60 [ 26.844000] [<c1086290>] ? worker_thread+0x0/0x310 [ 26.848000] [<c108cf0c>] kthread+0x7c/0x90 [ 26.852000] [<c108ce90>] ? kthread+0x0/0x90 [ 26.856000] [<c100c0a7>] kernel_thread_helper+0x7/0x10 [ 26.860000] BUG: using smp_processor_id() in preemptible [00000000] code: events/1/10 [ 26.864000] caller is vmstat_update+0x3c/0x70 Because this commit: a1f84a3: sched: Check for an idle shared cache in select_task_rq_fair() broke ->cpus_allowed. Signed-off-by: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: arjan@infradead.org Cc: <stable@kernel.org> LKML-Reference: <1257415066.12867.1.camel@marge.simson.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-04sched: Rate-limit newidleMike Galbraith
Rate limit newidle to migration_cost. It's a win for all stages of sysbench oltp tests. Signed-off-by: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-04sched: Check for an idle shared cache in select_task_rq_fair()Mike Galbraith
When waking affine, check for an idle shared cache, and if found, wake to that CPU/sibling instead of the waker's CPU. This improves pgsql+oltp ramp up by roughly 8%. Possibly more for other loads, depending on overlap. The trade-off is a roughly 1% peak downturn if tasks are truly synchronous. Signed-off-by: Mike Galbraith <efault@gmx.de> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: <stable@kernel.org> LKML-Reference: <1256654138.17752.7.camel@marge.simson.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-04cpumask: Partition_sched_domains takes array of cpumask_var_tRusty Russell
Currently partition_sched_domains() takes a 'struct cpumask *doms_new' which is a kmalloc'ed array of cpumask_t. You can't have such an array if 'struct cpumask' is undefined, as we plan for CONFIG_CPUMASK_OFFSTACK=y. So, we make this an array of cpumask_var_t instead: this is the same for the CONFIG_CPUMASK_OFFSTACK=n case, but requires multiple allocations for the CONFIG_CPUMASK_OFFSTACK=y case. Hence we add alloc_sched_domains() and free_sched_domains() functions. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <200911031453.40668.rusty@rustcorp.com.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-04cpumask: Simplify sched_rt.cRusty Russell
find_lowest_rq() wants to call pick_optimal_cpu() on the intersection of sched_domain_span(sd) and lowest_mask. Rather than doing a cpus_and into a temporary, we can open-code it. This actually makes the code slightly clearer, IMHO. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Gregory Haskins <ghaskins@novell.com> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <200911031453.15350.rusty@rustcorp.com.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-04sched: Add USER_SCHED to feature removal listDhaval Giani
Peter Zijlstra suggested that we remove USER_SCHED at: http://lkml.org/lkml/2009/3/21/67 Removing USER_SCHED removes a lot of code from the scheduler and simplifies the code. We already have the ability to do user based classification which is tightened using PAM in userspace. Schedule USER_SCHED for removal in 2.6.34 Signed-off-by: Dhaval Giani <dhaval@linux.vnet.ibm.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Bharata B Rao <bharata@linux.vnet.ibm.com> Cc: Serge E. Hallyn <serue@us.ibm.com> Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> LKML-Reference: <20091103214544.GI5495@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-04sched: Remove unused cpu_nr_migrations()Hiroshi Shimamoto
cpu_nr_migrations() is not used, remove it. Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <4AF12A66.6020609@ct.jp.nec.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-04sched: Remove unused time_sync_thresh declarationHiroshi Shimamoto
time_sync_thresh had been removed. Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <4AF12A3A.5050200@ct.jp.nec.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-04sched: Remove unused __schedule() declarationHiroshi Shimamoto
__schedule() had been removed. Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <4AF129C8.3030008@ct.jp.nec.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-25sched, cpuacct: Fix niced guest time accountingRyota Ozaki
CPU time of a guest is always accounted in 'user' time without concern for the nice value of its counterpart process although the guest is scheduled under the nice value. This patch fixes the defect and accounts cpu time of a niced guest in 'nice' time as same as a niced process. And also the patch adds 'guest_nice' to cpuacct. The value provides niced guest cpu time which is like 'nice' to 'user'. The original discussions can be found here: http://www.mail-archive.com/kvm@vger.kernel.org/msg23982.html http://www.mail-archive.com/kvm@vger.kernel.org/msg23860.html Signed-off-by: Ryota Ozaki <ozaki.ryota@gmail.com> Acked-by: Avi Kivity <avi@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1256314810-7897-1-git-send-email-ozaki.ryota@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-25Merge branch 'linus' into sched/coreIngo Molnar
Conflicts: fs/proc/array.c Merge reason: resolve conflict and queue up dependent patch. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linusLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: move virtrng_remove to .devexit.text move virtballoon_remove to .devexit.text virtio_blk: Revert serial number support virtio: let header files include virtio_ids.h virtio_blk: revert QUEUE_FLAG_VIRT addition
2009-10-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (21 commits) niu: VLAN_ETH_HLEN should be used to make sure that the whole MAC header was copied to the head buffer in the Vlan packets case KS8851: Fix ks8851_set_rx_mode() for IFF_MULTICAST KS8851: Fix MAC address write order KS8851: Add soft reset at probe time net: fix section mismatch in fec.c net: Fix struct inet_timewait_sock bitfield annotation tcp: Try to catch MSG_PEEK bug net: Fix IP_MULTICAST_IF bluetooth: static lock key fix bluetooth: scheduling while atomic bug fix tcp: fix TCP_DEFER_ACCEPT retrans calculation tcp: reduce SYN-ACK retrans for TCP_DEFER_ACCEPT tcp: accept socket after TCP_DEFER_ACCEPT period Revert "tcp: fix tcp_defer_accept to consider the timeout" AF_UNIX: Fix deadlock on connecting to shutdown socket ethoc: clear only pending irqs ethoc: inline regs access vmxnet3: use dev_dbg, fix build for CONFIG_BLOCK=n virtio_net: use dev_kfree_skb_any() in free_old_xmit_skbs() be2net: fix support for PCI hot plug ...
2009-10-22move virtrng_remove to .devexit.textUwe Kleine-König
The function virtrng_remove is used only wrapped by __devexit_p so define it using __devexit. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Acked-by: Sam Ravnborg <sam@ravnborg.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Michael S. Tsirkin <mst@redhat.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-10-22move virtballoon_remove to .devexit.textUwe Kleine-König
The function virtballoon_remove is used only wrapped by __devexit_p so define it using __devexit. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Acked-by: Sam Ravnborg <sam@ravnborg.org> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-10-22virtio_blk: Revert serial number supportRusty Russell
This reverts "Add serial number support for virtio_blk, V4a". Turns out that virtio_pci, lguest and s/390 all have an 8 bit limit on virtio config space, so noone could ever use this. This is coming back later in a cleaner form. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: john cooper <john.cooper@redhat.com> Cc: Jens Axboe <jens.axboe@oracle.com>
2009-10-22virtio: let header files include virtio_ids.hChristian Borntraeger
Rusty, commit 3ca4f5ca73057a617f9444a91022d7127041970a virtio: add virtio IDs file moved all device IDs into a single file. While the change itself is a very good one, it can break userspace applications. For example if a userspace tool wanted to get the ID of virtio_net it used to include virtio_net.h. This does no longer work, since virtio_net.h does not include virtio_ids.h. This patch moves all "#include <linux/virtio_ids.h>" from the C files into the header files, making the header files compatible with the old ones. In addition, this patch exports virtio_ids.h to userspace. CC: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-10-22virtio_blk: revert QUEUE_FLAG_VIRT additionChristoph Hellwig
It seems like the addition of QUEUE_FLAG_VIRT caueses major performance regressions for Fedora users: https://bugzilla.redhat.com/show_bug.cgi?id=509383 https://bugzilla.redhat.com/show_bug.cgi?id=505695 while I can't reproduce those extreme regressions myself I think the flag is wrong. Rationale: QUEUE_FLAG_VIRT expands to QUEUE_FLAG_NONROT which casus the queue unplugged immediately. This is not a good behaviour for at least qemu and kvm where we do have significant overhead for every I/O operations. Even with all the latested speeups (native AIO, MSI support, zero copy) we can only get native speed for up to 128kb I/O requests we already are down to 66% of native performance for 4kb requests even on my laptop running the Intel X25-M SSD for which the QUEUE_FLAG_NONROT was designed. If we ever get virtio-blk overhead low enough that this flag makes sense it should only be set based on a feature flag set by the host. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-10-21niu: VLAN_ETH_HLEN should be used to make sure that the whole MAC header was ↵Joyce Yu
copied to the head buffer in the Vlan packets case Signed-off-by: Joyce Yu <joyce.yu@sun.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-22Merge branch 'for-linus' of git://git.infradead.org/users/eparis/notifyLinus Torvalds
* 'for-linus' of git://git.infradead.org/users/eparis/notify: dnotify: ignore FS_EVENT_ON_CHILD inotify: fix coalesce duplicate events into a single event in special case inotify: deprecate the inotify kernel interface fsnotify: do not set group for a mark before it is on the i_list
2009-10-22Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: hp_sdc_rtc - fix test in hp_sdc_rtc_read_rt() Input: atkbd - consolidate force release quirks for volume keys Input: logips2pp - model 73 is actually TrackMan FX Input: i8042 - add Sony Vaio VGN-FZ240E to the nomux list Input: fix locking issue in /proc/bus/input/ handlers Input: atkbd - postpone restoring LED/repeat rate at resume Input: atkbd - restore resetting LED state at startup Input: i8042 - make pnp_data_busted variable boolean instead of int Input: synaptics - add another Protege M300 to rate blacklist
2009-10-22Merge branch 'kvm-updates/2.6.32' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
* 'kvm-updates/2.6.32' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: Prevent kvm_init from corrupting debugfs structures KVM: MMU: fix pointer cast KVM: use proper hrtimer function to retrieve expiration time
2009-10-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dmLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm: dm snapshot: allow chunk size to be less than page size dm snapshot: use unsigned integer chunk size dm snapshot: lock snapshot while supplying status dm exception store: fix failed set_chunk_size error path dm snapshot: require non zero chunk size by end of ctr dm: dec_pending needs locking to save error value dm: add missing del_gendisk to alloc_dev error path dm log: userspace fix incorrect luid cast in userspace_ctr dm snapshot: free exception store on init failure dm snapshot: sort by chunk size to fix race
2009-10-22PM: Make warning in suspend_test_finish() less likely to happenRafael J. Wysocki
Increase TEST_SUSPEND_SECONDS to 10 so the warning in suspend_test_finish() doesn't annoy the users of slower systems so much. Also, make the warning print the suspend-resume cycle time, so that we know why the warning actually triggered. Patch prepared during the hacking session at the Kernel Summit in Tokyo. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-22mmc: at91_mci: Don't include asm/mach/mmc.hUwe Kleine-König
This fixes a compile bug introduced in 6ef297f (ARM: 5720/1: Move MMCI header to amba include dir) That commit moved arch/arm/include/asm/mach/mmc.h to include/linux/amba/mmci.h. Just removing the include was enough. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Acked-by: Linus Walleij <linus.walleij@stericsson.com> Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com> Acked-by: Bill Gatliff <bgat@billgatliff.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Pierre Ossman <drzeus@drzeus.cx> Cc: linux-arm-kernel@lists.infradead.org Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-22Merge branch 'sh/for-2.6.32' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6 * 'sh/for-2.6.32' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6: sh: Kill off stray HAVE_FTRACE_SYSCALLS reference. sh: Remove BKL from landisk gio. sh: disabled cache handling fix. sh: Fix up single page flushing to use PAGE_SIZE.
2009-10-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: aesni-intel - Fix irq_fpu_usable usage crypto: padlock-sha - Fix stack alignment
2009-10-22nfs: Fix nfs_parse_mount_options() kfree() leakYinghai Lu
Fix a (small) memory leak in one of the error paths of the NFS mount options parsing code. Regression introduced in 2.6.30 by commit a67d18f (NFS: load the rpc/rdma transport module automatically). Reported-by: Yinghai Lu <yinghai@kernel.org> Reported-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-22fs: pipe.c null pointer dereferenceEarl Chew
This patch fixes a null pointer exception in pipe_rdwr_open() which generates the stack trace: > Unable to handle kernel NULL pointer dereference at 0000000000000028 RIP: > [<ffffffff802899a5>] pipe_rdwr_open+0x35/0x70 > [<ffffffff8028125c>] __dentry_open+0x13c/0x230 > [<ffffffff8028143d>] do_filp_open+0x2d/0x40 > [<ffffffff802814aa>] do_sys_open+0x5a/0x100 > [<ffffffff8021faf3>] sysenter_do_call+0x1b/0x67 The failure mode is triggered by an attempt to open an anonymous pipe via /proc/pid/fd/* as exemplified by this script: ============================================================= while : ; do { echo y ; sleep 1 ; } | { while read ; do echo z$REPLY; done ; } & PID=$! OUT=$(ps -efl | grep 'sleep 1' | grep -v grep | { read PID REST ; echo $PID; } ) OUT="${OUT%% *}" DELAY=$((RANDOM * 1000 / 32768)) usleep $((DELAY * 1000 + RANDOM % 1000 )) echo n > /proc/$OUT/fd/1 # Trigger defect done ============================================================= Note that the failure window is quite small and I could only reliably reproduce the defect by inserting a small delay in pipe_rdwr_open(). For example: static int pipe_rdwr_open(struct inode *inode, struct file *filp) { msleep(100); mutex_lock(&inode->i_mutex); Although the defect was observed in pipe_rdwr_open(), I think it makes sense to replicate the change through all the pipe_*_open() functions. The core of the change is to verify that inode->i_pipe has not been released before attempting to manipulate it. If inode->i_pipe is no longer present, return ENOENT to indicate so. The comment about potentially using atomic_t for i_pipe->readers and i_pipe->writers has also been removed because it is no longer relevant in this context. The inode->i_mutex lock must be used so that inode->i_pipe can be dealt with correctly. Signed-off-by: Earl Chew <earl_chew@agilent.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-20KS8851: Fix ks8851_set_rx_mode() for IFF_MULTICASTBen Dooks
In ks8851_set_rx_mode() the case handling IFF_MULTICAST was also setting the RXCR1_AE bit by accident. This meant that all unicast frames where being accepted by the device. Remove RXCR1_AE from this case. Note, RXCR1_AE was also masking a problem with setting the MAC address properly, so needs to be applied after fixing the MAC write order. Fixes a bug reported by Doong, Ping of Micrel. This version of the patch avoids setting RXCR1_ME for all cases. Signed-off-by: Ben Dooks <ben@simtec.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-20KS8851: Fix MAC address write orderBen Dooks
The MAC address register was being written in the wrong order, so add a new address macro to convert mac-address byte to register address and a ks8851_wrreg8() function to write each byte without having to worry about any difficult byte swapping. Fixes a bug reported by Doong, Ping of Micrel. Signed-off-by: Ben Dooks <ben@simtec.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-20KS8851: Add soft reset at probe timeBen Dooks
Issue a full soft reset at probe time. This was reported by Doong Ping of Micrel, but no explanation of why this is necessary or what bug it is fixing. Add it as it does not seem to hurt the current driver and ensures that the device is in a known state when we start setting it up. Signed-off-by: Ben Dooks <ben@simtec.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-20net: fix section mismatch in fec.cSteven King
fec_enet_init is called by both fec_probe and fec_resume, so it shouldn't be marked as __init. Signed-off-by: Steven King <sfking@fdwdc.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-20dnotify: ignore FS_EVENT_ON_CHILDAndreas Gruenbacher
Mask off FS_EVENT_ON_CHILD in dnotify_handle_event(). Otherwise, when there is more than one watch on a directory and dnotify_should_send_event() succeeds, events with FS_EVENT_ON_CHILD set will trigger all watches and cause spurious events. This case was overlooked in commit e42e2773. #define _GNU_SOURCE #include <stdio.h> #include <stdlib.h> #include <unistd.h> #include <signal.h> #include <sys/types.h> #include <sys/stat.h> #include <fcntl.h> #include <string.h> static void create_event(int s, siginfo_t* si, void* p) { printf("create\n"); } static void delete_event(int s, siginfo_t* si, void* p) { printf("delete\n"); } int main (void) { struct sigaction action; char *tmpdir, *file; int fd1, fd2; sigemptyset (&action.sa_mask); action.sa_flags = SA_SIGINFO; action.sa_sigaction = create_event; sigaction (SIGRTMIN + 0, &action, NULL); action.sa_sigaction = delete_event; sigaction (SIGRTMIN + 1, &action, NULL); # define TMPDIR "/tmp/test.XXXXXX" tmpdir = malloc(strlen(TMPDIR) + 1); strcpy(tmpdir, TMPDIR); mkdtemp(tmpdir); # define TMPFILE "/file" file = malloc(strlen(tmpdir) + strlen(TMPFILE) + 1); sprintf(file, "%s/%s", tmpdir, TMPFILE); fd1 = open (tmpdir, O_RDONLY); fcntl(fd1, F_SETSIG, SIGRTMIN); fcntl(fd1, F_NOTIFY, DN_MULTISHOT | DN_CREATE); fd2 = open (tmpdir, O_RDONLY); fcntl(fd2, F_SETSIG, SIGRTMIN + 1); fcntl(fd2, F_NOTIFY, DN_MULTISHOT | DN_DELETE); if (fork()) { /* This triggers a create event */ creat(file, 0600); /* This triggers a create and delete event (!) */ unlink(file); } else { sleep(1); rmdir(tmpdir); } return 0; } Signed-off-by: Andreas Gruenbacher <agruen@suse.de> Signed-off-by: Eric Paris <eparis@redhat.com>
2009-10-20net: Fix struct inet_timewait_sock bitfield annotationEric Dumazet
commit 9e337b0f (net: annotate inet_timewait_sock bitfields) added 4/8 bytes in struct inet_timewait_sock. Fix this by declaring tw_ipv6_offset in the 'flags' bitfield The 14 bits hole is named tw_pad to make it cleary apparent. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-20tcp: Try to catch MSG_PEEK bugHerbert Xu
This patch tries to print out more information when we hit the MSG_PEEK bug in tcp_recvmsg. It's been around since at least 2005 and it's about time that we finally fix it. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-20crypto: aesni-intel - Fix irq_fpu_usable usageHuang Ying
When renaming kernel_fpu_using to irq_fpu_usable, the semantics of the function is changed too, from mesuring whether kernel is using FPU, that is, the FPU is NOT available, to measuring whether FPU is usable, that is, the FPU is available. But the usage of irq_fpu_usable in aesni-intel_glue.c is not changed accordingly. This patch fixes this. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2009-10-19net: Fix IP_MULTICAST_IFEric Dumazet
ipv4/ipv6 setsockopt(IP_MULTICAST_IF) have dubious __dev_get_by_index() calls. This function should be called only with RTNL or dev_base_lock held, or reader could see a corrupt hash chain and eventually enter an endless loop. Fix is to call dev_get_by_index()/dev_put(). If this happens to be performance critical, we could define a new dev_exist_by_index() function to avoid touching dev refcount. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>