diff options
author | Ingo Molnar <mingo@kernel.org> | 2013-04-10 12:55:49 +0200 |
---|---|---|
committer | Ingo Molnar <mingo@kernel.org> | 2013-04-10 12:55:49 +0200 |
commit | 8fcfae31719c0a6c03f2cf63f815b46d378d8be4 (patch) | |
tree | 3da9d65885de6a2b046fbd5eebc0d19def0c1e2c /Documentation | |
parent | d02a9a89db3437467de45a451739e520877f4a48 (diff) | |
parent | 6d87669357936bffa1e8fea7a4e7743e76905736 (diff) |
Merge branch 'rcu/next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu
Pull RCU updates from Paul E. McKenney:
* Remove restrictions on no-CBs CPUs, make RCU_FAST_NO_HZ
take advantage of numbered callbacks, do additional callback
accelerations based on numbered callbacks. Posted to LKML
at https://lkml.org/lkml/2013/3/18/960.
* RCU documentation updates. Posted to LKML at
https://lkml.org/lkml/2013/3/18/570.
* Miscellaneous fixes. Posted to LKML at
https://lkml.org/lkml/2013/3/18/594.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Diffstat (limited to 'Documentation')
-rw-r--r-- | Documentation/RCU/checklist.txt | 26 | ||||
-rw-r--r-- | Documentation/RCU/lockdep.txt | 5 | ||||
-rw-r--r-- | Documentation/RCU/rcubarrier.txt | 15 | ||||
-rw-r--r-- | Documentation/RCU/stallwarn.txt | 33 | ||||
-rw-r--r-- | Documentation/RCU/whatisRCU.txt | 4 | ||||
-rw-r--r-- | Documentation/kernel-parameters.txt | 35 |
6 files changed, 85 insertions, 33 deletions
diff --git a/Documentation/RCU/checklist.txt b/Documentation/RCU/checklist.txt index 31ef8fe07f82..79e789b8b8ea 100644 --- a/Documentation/RCU/checklist.txt +++ b/Documentation/RCU/checklist.txt @@ -217,9 +217,14 @@ over a rather long period of time, but improvements are always welcome! whether the increased speed is worth it. 8. Although synchronize_rcu() is slower than is call_rcu(), it - usually results in simpler code. So, unless update performance - is critically important or the updaters cannot block, - synchronize_rcu() should be used in preference to call_rcu(). + usually results in simpler code. So, unless update performance is + critically important, the updaters cannot block, or the latency of + synchronize_rcu() is visible from userspace, synchronize_rcu() + should be used in preference to call_rcu(). Furthermore, + kfree_rcu() usually results in even simpler code than does + synchronize_rcu() without synchronize_rcu()'s multi-millisecond + latency. So please take advantage of kfree_rcu()'s "fire and + forget" memory-freeing capabilities where it applies. An especially important property of the synchronize_rcu() primitive is that it automatically self-limits: if grace periods @@ -268,7 +273,8 @@ over a rather long period of time, but improvements are always welcome! e. Periodically invoke synchronize_rcu(), permitting a limited number of updates per grace period. - The same cautions apply to call_rcu_bh() and call_rcu_sched(). + The same cautions apply to call_rcu_bh(), call_rcu_sched(), + call_srcu(), and kfree_rcu(). 9. All RCU list-traversal primitives, which include rcu_dereference(), list_for_each_entry_rcu(), and @@ -296,9 +302,9 @@ over a rather long period of time, but improvements are always welcome! all currently executing rcu_read_lock()-protected RCU read-side critical sections complete. It does -not- necessarily guarantee that all currently running interrupts, NMIs, preempt_disable() - code, or idle loops will complete. Therefore, if you do not have - rcu_read_lock()-protected read-side critical sections, do -not- - use synchronize_rcu(). + code, or idle loops will complete. Therefore, if your + read-side critical sections are protected by something other + than rcu_read_lock(), do -not- use synchronize_rcu(). Similarly, disabling preemption is not an acceptable substitute for rcu_read_lock(). Code that attempts to use preemption @@ -401,9 +407,9 @@ over a rather long period of time, but improvements are always welcome! read-side critical sections. It is the responsibility of the RCU update-side primitives to deal with this. -17. Use CONFIG_PROVE_RCU, CONFIG_DEBUG_OBJECTS_RCU_HEAD, and - the __rcu sparse checks to validate your RCU code. These - can help find problems as follows: +17. Use CONFIG_PROVE_RCU, CONFIG_DEBUG_OBJECTS_RCU_HEAD, and the + __rcu sparse checks (enabled by CONFIG_SPARSE_RCU_POINTER) to + validate your RCU code. These can help find problems as follows: CONFIG_PROVE_RCU: check that accesses to RCU-protected data structures are carried out under the proper RCU diff --git a/Documentation/RCU/lockdep.txt b/Documentation/RCU/lockdep.txt index a102d4b3724b..cd83d2348fef 100644 --- a/Documentation/RCU/lockdep.txt +++ b/Documentation/RCU/lockdep.txt @@ -64,6 +64,11 @@ checking of rcu_dereference() primitives: but retain the compiler constraints that prevent duplicating or coalescsing. This is useful when when testing the value of the pointer itself, for example, against NULL. + rcu_access_index(idx): + Return the value of the index and omit all barriers, but + retain the compiler constraints that prevent duplicating + or coalescsing. This is useful when when testing the + value of the index itself, for example, against -1. The rcu_dereference_check() check expression can be any boolean expression, but would normally include a lockdep expression. However, diff --git a/Documentation/RCU/rcubarrier.txt b/Documentation/RCU/rcubarrier.txt index 38428c125135..2e319d1b9ef2 100644 --- a/Documentation/RCU/rcubarrier.txt +++ b/Documentation/RCU/rcubarrier.txt @@ -79,7 +79,20 @@ complete. Pseudo-code using rcu_barrier() is as follows: 2. Execute rcu_barrier(). 3. Allow the module to be unloaded. -The rcutorture module makes use of rcu_barrier in its exit function +There are also rcu_barrier_bh(), rcu_barrier_sched(), and srcu_barrier() +functions for the other flavors of RCU, and you of course must match +the flavor of rcu_barrier() with that of call_rcu(). If your module +uses multiple flavors of call_rcu(), then it must also use multiple +flavors of rcu_barrier() when unloading that module. For example, if +it uses call_rcu_bh(), call_srcu() on srcu_struct_1, and call_srcu() on +srcu_struct_2(), then the following three lines of code will be required +when unloading: + + 1 rcu_barrier_bh(); + 2 srcu_barrier(&srcu_struct_1); + 3 srcu_barrier(&srcu_struct_2); + +The rcutorture module makes use of rcu_barrier() in its exit function as follows: 1 static void diff --git a/Documentation/RCU/stallwarn.txt b/Documentation/RCU/stallwarn.txt index 1927151b386b..e38b8df3d727 100644 --- a/Documentation/RCU/stallwarn.txt +++ b/Documentation/RCU/stallwarn.txt @@ -92,14 +92,14 @@ If the CONFIG_RCU_CPU_STALL_INFO kernel configuration parameter is set, more information is printed with the stall-warning message, for example: INFO: rcu_preempt detected stall on CPU - 0: (63959 ticks this GP) idle=241/3fffffffffffffff/0 + 0: (63959 ticks this GP) idle=241/3fffffffffffffff/0 softirq=82/543 (t=65000 jiffies) In kernels with CONFIG_RCU_FAST_NO_HZ, even more information is printed: INFO: rcu_preempt detected stall on CPU - 0: (64628 ticks this GP) idle=dd5/3fffffffffffffff/0 drain=0 . timer not pending + 0: (64628 ticks this GP) idle=dd5/3fffffffffffffff/0 softirq=82/543 last_accelerate: a345/d342 nonlazy_posted: 25 .D (t=65000 jiffies) The "(64628 ticks this GP)" indicates that this CPU has taken more @@ -116,13 +116,28 @@ number between the two "/"s is the value of the nesting, which will be a small positive number if in the idle loop and a very large positive number (as shown above) otherwise. -For CONFIG_RCU_FAST_NO_HZ kernels, the "drain=0" indicates that the CPU is -not in the process of trying to force itself into dyntick-idle state, the -"." indicates that the CPU has not given up forcing RCU into dyntick-idle -mode (it would be "H" otherwise), and the "timer not pending" indicates -that the CPU has not recently forced RCU into dyntick-idle mode (it -would otherwise indicate the number of microseconds remaining in this -forced state). +The "softirq=" portion of the message tracks the number of RCU softirq +handlers that the stalled CPU has executed. The number before the "/" +is the number that had executed since boot at the time that this CPU +last noted the beginning of a grace period, which might be the current +(stalled) grace period, or it might be some earlier grace period (for +example, if the CPU might have been in dyntick-idle mode for an extended +time period. The number after the "/" is the number that have executed +since boot until the current time. If this latter number stays constant +across repeated stall-warning messages, it is possible that RCU's softirq +handlers are no longer able to execute on this CPU. This can happen if +the stalled CPU is spinning with interrupts are disabled, or, in -rt +kernels, if a high-priority process is starving RCU's softirq handler. + +For CONFIG_RCU_FAST_NO_HZ kernels, the "last_accelerate:" prints the +low-order 16 bits (in hex) of the jiffies counter when this CPU last +invoked rcu_try_advance_all_cbs() from rcu_needs_cpu() or last invoked +rcu_accelerate_cbs() from rcu_prepare_for_idle(). The "nonlazy_posted:" +prints the number of non-lazy callbacks posted since the last call to +rcu_needs_cpu(). Finally, an "L" indicates that there are currently +no non-lazy callbacks ("." is printed otherwise, as shown above) and +"D" indicates that dyntick-idle processing is enabled ("." is printed +otherwise, for example, if disabled via the "nohz=" kernel boot parameter). Multiple Warnings From One Stall diff --git a/Documentation/RCU/whatisRCU.txt b/Documentation/RCU/whatisRCU.txt index 0cc7820967f4..10df0b82f459 100644 --- a/Documentation/RCU/whatisRCU.txt +++ b/Documentation/RCU/whatisRCU.txt @@ -265,9 +265,9 @@ rcu_dereference() rcu_read_lock(); p = rcu_dereference(head.next); rcu_read_unlock(); - x = p->address; + x = p->address; /* BUG!!! */ rcu_read_lock(); - y = p->data; + y = p->data; /* BUG!!! */ rcu_read_unlock(); Holding a reference from one RCU read-side critical section diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 4609e81dbc37..22303b2e74bc 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -2461,9 +2461,12 @@ bytes respectively. Such letter suffixes can also be entirely omitted. In kernels built with CONFIG_RCU_NOCB_CPU=y, set the specified list of CPUs to be no-callback CPUs. Invocation of these CPUs' RCU callbacks will - be offloaded to "rcuoN" kthreads created for - that purpose. This reduces OS jitter on the + be offloaded to "rcuox/N" kthreads created for + that purpose, where "x" is "b" for RCU-bh, "p" + for RCU-preempt, and "s" for RCU-sched, and "N" + is the CPU number. This reduces OS jitter on the offloaded CPUs, which can be useful for HPC and + real-time workloads. It can also improve energy efficiency for asymmetric multiprocessors. @@ -2487,6 +2490,17 @@ bytes respectively. Such letter suffixes can also be entirely omitted. leaf rcu_node structure. Useful for very large systems. + rcutree.jiffies_till_first_fqs= [KNL,BOOT] + Set delay from grace-period initialization to + first attempt to force quiescent states. + Units are jiffies, minimum value is zero, + and maximum value is HZ. + + rcutree.jiffies_till_next_fqs= [KNL,BOOT] + Set delay between subsequent attempts to force + quiescent states. Units are jiffies, minimum + value is one, and maximum value is HZ. + rcutree.qhimark= [KNL,BOOT] Set threshold of queued RCU callbacks over which batch limiting is disabled. @@ -2501,16 +2515,15 @@ bytes respectively. Such letter suffixes can also be entirely omitted. rcutree.rcu_cpu_stall_timeout= [KNL,BOOT] Set timeout for RCU CPU stall warning messages. - rcutree.jiffies_till_first_fqs= [KNL,BOOT] - Set delay from grace-period initialization to - first attempt to force quiescent states. - Units are jiffies, minimum value is zero, - and maximum value is HZ. + rcutree.rcu_idle_gp_delay= [KNL,BOOT] + Set wakeup interval for idle CPUs that have + RCU callbacks (RCU_FAST_NO_HZ=y). - rcutree.jiffies_till_next_fqs= [KNL,BOOT] - Set delay between subsequent attempts to force - quiescent states. Units are jiffies, minimum - value is one, and maximum value is HZ. + rcutree.rcu_idle_lazy_gp_delay= [KNL,BOOT] + Set wakeup interval for idle CPUs that have + only "lazy" RCU callbacks (RCU_FAST_NO_HZ=y). + Lazy RCU callbacks are those which RCU can + prove do nothing more than free memory. rcutorture.fqs_duration= [KNL,BOOT] Set duration of force_quiescent_state bursts. |