During boot, two execution contexts can create the stats percpu
variable concurrently: one called from the cpufreq_stats_init context
and the other as part of the policy notifier callback. This results
in a corrupted stats variable.
Disable cpu hotplug to avoid corruption.
Change-Id: Iefe2d6b370f6ec303286afc139fa9913fa9a4099
Suggested-by: Saravana Kannan <skannan@codeaurora.org>
Signed-off-by: Arun KS <arunks@codeaurora.org>
|
|
Currently the load argument is taken as unsigned long in
sl_busy_to_laf. On 32-bit kernels, the use of unsigned long
results in overflows since it is only 32 bits wide. As a result,
the cpu_load calculation goes wrong and most of the time reports
very low values. Use mult_frac to avoid overflows when the final
result is expected to fit within 32 bits.
Change-Id: Ib9e8bf6e777cd07b141761fb14c80840563b4cd5
Signed-off-by: Hanumath Prasad <hpprasad@codeaurora.org>
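
As a rough userspace illustration of why mult_frac helps, the sketch
below mirrors the divide-first arithmetic of the kernel's mult_frac()
macro using fixed 32-bit types. The function names and the sample values
are assumptions made for the example; they are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Naive scaling: the 32-bit product x * numer can wrap before the divide. */
uint32_t scale_naive_u32(uint32_t x, uint32_t numer, uint32_t denom)
{
	assert(denom != 0);
	return (x * numer) / denom;
}

/*
 * Divide first, then scale quotient and remainder separately, which is
 * the approach mult_frac() takes. The intermediate values stay small as
 * long as the final result fits in 32 bits and rem * numer does not
 * itself overflow (rem is always smaller than denom).
 */
uint32_t scale_mult_frac_u32(uint32_t x, uint32_t numer, uint32_t denom)
{
	assert(denom != 0);
	uint32_t quot = x / denom;
	uint32_t rem = x % denom;

	return quot * numer + (rem * numer) / denom;
}
```

With x = 3000000, numer = 2000 and denom = 3000, the naive product is
6000000000, which does not fit in 32 bits, so the naive result is far
too low, while the divide-first form yields the expected 2000000.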
|
|
With modification in scheduler, governor now gets predicted
instantaneous demand waiting to run in addition to demand from
previous window for each CPU. Make use of this information since
prediction from scheduler could be more accurate than just looking at
past few windows.
Governor calculates two frequencies during each sampling period: one based
on demand in previous sampling period (f_prev), and the other based on
prediction provided by scheduler (f_pred). Max of both will be selected
as final frequency. Hispeed related logic, including both frequency
selection and delay is ignored when prediction is enabled. If only
f_pred but not f_prev picked policy->max, max_freq_hysteresis period is
not started/extended. This is to reduce power cost of mis-prediction
if it happens.
One use case where prediction could help dramatically is when a heavy
task wakes up after sleeping for a long time. With prediction, governor
could ramp up to the frequency the task needs much faster than before.
To enable prediction, echo 1 to enable_prediction file in
cpufreq interactive sysfs directory.
Change-Id: I27396785886e43ea01c9000c651c8bd142172273
Suggested-by: Saravana Kannan <skannan@codeaurora.org>
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
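
The selection rule described above can be sketched in a few lines. This
is only an illustration under stated assumptions: the struct and function
names are hypothetical, frequencies are taken to be in kHz, and hispeed
handling is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result type for this sketch. */
struct freq_choice {
	unsigned int freq;       /* frequency to request, in kHz */
	bool start_max_hyst;     /* start/extend max_freq_hysteresis? */
};

/* Pick max(f_prev, f_pred); only f_prev may trigger the hysteresis. */
struct freq_choice choose_with_prediction(unsigned int f_prev,
					  unsigned int f_pred,
					  unsigned int policy_max)
{
	struct freq_choice c;

	assert(policy_max > 0);
	c.freq = f_prev > f_pred ? f_prev : f_pred;
	/*
	 * A prediction alone must not start or extend the hysteresis
	 * period, so that a mis-prediction costs less power.
	 */
	c.start_max_hyst = f_prev >= policy_max;
	return c;
}
```

If only f_pred reaches policy_max, the CPU still goes to policy_max, but
the hysteresis flag stays false.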
|
|
Multiple migrations can happen together within short period if
scheduler is re-arranging a few tasks. In this case, it's only useful
to change frequency at the end of all migrations. Delay handling of
scheduler notification by 1ms.
Change-Id: I9ee7b1e93ce57c28919b5609c40dcde9bd14abed
Suggested-by: Saravana Kannan <skannan@codeaurora.org>
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
|
|
Scheduler could send a notification to governor each time a task wakes
up. If governor wakes up another task as a response to such a
notification, it could result in endless recursive notifications.
Use wake_up_process_no_notif to ensure scheduler won't send another
notification for speedchange task woken up by the governor.
Change-Id: I697affcbdf79e2ad0cfe843eb880d304960682f4
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
|
|
CPU scaling for thread migration is now handled by scheduler and
governor. Remove migration related boost feature from cpu-boost.
Change-Id: I36f58e54eaceae30a3d0c11d73b1aadc4787db4e
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
|
|
Scheduler provides different load number based on whether a
notification is pending. Under normal situation, it won't provide a
load that exceeds 100% busy time of current frequency. For migration,
the busy time can be huge if a heavy task just moved to the CPU.
This creates a race condition due to how governor handles
notification:
1) Scheduler sends notification for a big task
2) Governor timer runs, and gets a huge load, but fails to skip
hispeed_freq logic and all delays because it's not a notification
3) After receiving sched_get_cpus_busy(), scheduler thinks governor has
finished handling the notification and changes to provide normal load
that is capped to 100% of the CPU at current frequency.
4) Governor now starts handling notification, but gets a small load
that doesn't reflect real demand of the heavy task.
The migration notification is thus effectively lost. Fix this by
making notification pending a per-cpu flag. If the timer gets ahead of
notification handling, it will run as if it were a notification.
Change-Id: Ie3d68edf85b822232a646c2694bec6928a2d7cd1
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
|
|
prev_load could be zero if no active time is registered for a CPU
within a sampling period. Fix potential divide-by-zero issue when
calculating new load percentage.
Change-Id: I8ad118f5b6b94a410ec59eb5ce939b9467e921c7
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
Signed-off-by: Hanumath Prasad <hpprasad@codeaurora.org>
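
A guard of the following shape would avoid the division by zero. The
helper name, the argument names and the exact percentage formula are
assumptions made for illustration; the patch itself may structure the
check differently.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: share of prev_load contributed by new tasks,
 * in percent. Returns 0 when no active time was registered.
 */
unsigned int new_load_pct(uint64_t new_task_load, uint64_t prev_load)
{
	/* Keep the multiplication below from overflowing 64 bits. */
	assert(new_task_load <= UINT64_MAX / 100);

	if (prev_load == 0)
		return 0;	/* nothing to divide by */

	return (unsigned int)(new_task_load * 100 / prev_load);
}
```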
|
|
When the existing code computes the target frequency, it limits the target
frequency to be within policy min/max. It does this to make sure the
governor doesn't set the CPU frequency to something outside the policy
min/max limits.
The problem with this is that when the limits are removed, the CPU
frequency takes time to catch up with the real load because the governor
needs to wait for the next recalculation and even when the recalculated
frequency is correct, hysteresis might be applied.
In reality, the load might have already been consistent enough to exceed
the hysteresis criteria and cause a frequency change if it weren't for the
policy limits. However, since the policy min/max limits keep the target
frequency from reflecting the increased need, the hysteresis criteria
don't get a chance to expire.
Since the CPUfreq framework already takes care of limiting the governor's
request to be within the policy min/max limits before it sets the CPU
frequency, there's no need to limit the computation of target frequency to
be within policy min/max.
That way, when limits are removed, we can use the current target frequency
as is and immediately jump to a CPU frequency that's appropriate for the
current load.
Change-Id: Idc02359f6ff91530ff69de8edd8a25c275642099
Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
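
The idea can be illustrated with a small sketch: the governor keeps its
target frequency unclamped, and the clamp is applied only when the
request is turned into an actual CPU frequency. The helper name and the
kHz values used below are assumptions for the example, not code from the
patch.

```c
#include <assert.h>

/*
 * Clamp a requested frequency into [min, max] at apply time, standing
 * in for what the CPUfreq framework does with a governor's request.
 */
unsigned int clamp_to_policy(unsigned int freq, unsigned int min,
			     unsigned int max)
{
	assert(min <= max);
	if (freq < min)
		return min;
	if (freq > max)
		return max;
	return freq;
}
```

Suppose the governor's unclamped target is 1800000 while policy max is
1200000: the CPU runs at 1200000. Once max is raised to 2000000, applying
the same stored target gives 1800000 right away, with no need to wait for
the next recalculation.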
|
|
New tasks don't have sufficient history to predict their behavior, even
with scheduler's help. Ramping up conservatively for a heavy task
could hurt performance when it's needed. Therefore, separate out new
tasks' load with scheduler's help and ramp up more aggressively if new
tasks make up a significant portion of total load.
Change-Id: Ia95c956369edb9b7a0768f3bdcb0b2fab367fdf7
Suggested-by: Saravana Kannan <skannan@codeaurora.org>
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
|
|
Account amount of load contributed by new tasks within CPU load so that
governor can apply different policy when CPU is loaded by new tasks.
To be able to distinguish new task load, a new tunable
sched_new_task_windows is also introduced. The tunable defines tasks as
new when they have been active for fewer than the configured number of
windows.
Change-Id: I2e2e62e4103882f7362154b792ab978b181b9f59
Suggested-by: Saravana Kannan <skannan@codeaurora.org>
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
[junjiew@codeaurora.org: Dropped all changes on scheduler side because
those have been merged separately.]
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
|
|
max_freq_hysteresis keeps CPU at policy->max even after load goes
away. This is necessary for some workloads where heavy tasks start and
stop often. However, in case a heavy task indeed stops, it's not very
power friendly to stay at policy->max for an extended period.
Instead of keeping CPU at policy->max, drop frequency optimistically.
If a heavy load starts back up again and hits go_hispeed_load within
the max_freq_hysteresis period, directly ramp back up to policy->max.
Change-Id: I5edf6d765a3599a5b26e13e584bd237e932593f0
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
|
|
CPU load is now normalized to per-policy target_load, instead of
current frequency of CPU. Fix cpufreq_interactive_cpuload accordingly
so that its load number matches other cpufreq interactive events like
cpufreq_interactive_target/notyet/already.
Change-Id: I0685b5930ad1bac01819e96fcdfc181167d4dae0
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
|
|
When governor gets a notification from scheduler, scheduler provides
exact load that is required by the workload. Ignore hispeed_freq logic
and directly use choose_freq result for notifications.
Also use is_notif field to distinguish notifications instead of
MAX_LOCAL_LOAD.
Change-Id: I409ea66c00f4277adf32d18c339631e1a8b0f97b
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
|
|
Scheduler needs to understand governor's target_load in order to make
correct decisions when scheduling tasks.
Change-Id: Ia440986de813632def0352e34425fa69da3b2923
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
|
|
With per-policy timer implemented, there is no need to use policy->cur
in load calculation and delay enforcement. Each CPU in the policy will
naturally get the cluster frequency in target_freq. Using policy->cur
has side effects if a second evaluation comes before the frequency switch
requested by the first evaluation is finished. When that occurs, the
second evaluation could enforce delays incorrectly based on the stale
policy->cur while the timestamps have been updated when target_freq was
updated by the earlier evaluation.
For example, assume current frequency is 1.5GHz, hispeed_freq is 1GHz.
First evaluation drops target_freq to 500MHz. It also resets
hispeed_validate_time. While frequency switch is still underway and
policy->cur is still 1.5GHz, a second evaluation happens, and the
evaluation result is 1GHz. Current evaluation would enforce
hispeed_delay for 1.5GHz using the updated hispeed_validate_time and
thus incorrectly delaying the ramp up to 1GHz.
Change from policy->cur to target_freq in load calculation and delay
enforcement.
Change-Id: I416e1d524e14b2c082944b88678eb3105bd70d88
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
|
|
Commit 92352c0a65bc ("cpufreq: interactive: Ramp up directly if
cpu_load exceeds 100") and commit 594945e67031 ("cpufreq: interactive:
Skip delay in frequency changes due to migration") allow interactive
governor to skip above_hispeed_delay and min_sample_time if the
frequency evaluation request comes from scheduler. Power and performance
benefits of these two features are dependent on the behavior of each
workload. Adverse load pattern may experience regression instead of
improvement.
Make both features optional by introducing a sysfs file for each. Both
features are disabled by default.
Change-Id: I394c7fac00e6b20259dd198bd526a32ead54f14e
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
|
|
sched_get_cpus_busy() provides a snapshot of all CPUs' busy time
information for the set of CPUs being queried. This avoids race
condition due to migration when CPU load is queried one by one.
Change-Id: I6afdfa74ff9f3ef616872df4e2c3bb04f6233c3f
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
|
|
Slack timer's expire field was not correctly initialized if slack_only
is true in cpufreq_interactive_timer_resched(). This causes both
compilation warning and functional breakage.
Fix expire field by setting it properly.
Change-Id: I2f8c454d63626876522c163eb8d3c5d1c8adfd51
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
|
|
With per-cluster timer implementation, only max load across CPUs in
cluster is traced in timer function. Add cpufreq_interactive_cpuload
trace to provide per-cpu load information.
Change-Id: Icea9f2574332a4bc472b14193e77d76100a896ed
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
|
|
Interactive governor currently uses per-cpu timer to evaluate each
CPU's frequency. For policies that manage multiple CPUs, each CPU
runs its own algorithm to decide its frequency and then final result
is aggregated in speedchange task. This implementation has a few
drawbacks.
Due to the use of deferrable timers, timers between CPUs can be easily
misaligned. If a load migrates from CPU A to CPU B, there exists a gap
where CPU A could have dropped its frequency vote yet CPU B hasn't
seen the demand to ramp up its vote. This would result in an incorrect
drop in policy frequency which is harmful for performance.
In addition, for CPU waking up in middle of a window, the timestamps
it takes will not be aligned with jiffy boundaries, and thus when next
time timer fires, it could incorrectly prevent frequency ramp up/down
for one more window.
Change-Id: Ia82c7b0cff5bb1ea165fb83fbb7a5546ea7d0396
[junjiew@codeaurora.org: Resolved merge conflicts. ]
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
|
|
first_cpu field was introduced to handle tunable save and restore, but
later improvements removed the need for it. Remove it from
cpufreq_interactive_cpuinfo struct.
Change-Id: Ib6fd7546451ee537f55d874f93d0e52bec58f124
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
|
|
Set use_sched_load tunable early in store so that we pass
the correct 64-bit jiffy to scheduler.
Change-Id: I46ed73441c9d242f15e5759360d0cea4a9dd23d0
Signed-off-by: Hanumath Prasad <hpprasad@codeaurora.org>
|
|
There is a race window, as explained below, when governor tries to change
the cpu frequency and some other thread (say thermal mitigation) tries to
change the policy limits simultaneously.
speedchange task (Thread A)              Thread B (say thermal)
cpufreq_interactive_speedchange_task()
  |
__cpufreq_driver_target()
  |
set_cpu_freq()
  |
                                         cpufreq_update_policy()
                                           |
                                         modified policy_max
                                           |
                                         check policy->cur against
                                         new policy limits, return
                                         without calling
                                         __cpufreq_driver_target as
                                         policy->cur (which is not
                                         updated by Thread A) is still
                                         within the new policy limits.
  |
sent CPUFREQ_POSTCHANGE notification
  |
updated policy->cur which happens to be higher than policy->max
This results in the current frequency being higher than policy->max,
violating the policy limits. This causes thermal impact and in turn high
power consumption. Fix this by always calling __cpufreq_driver_target()
with the current frequency and leaving it to __cpufreq_driver_target() to
guarantee there is no race condition when multiple threads are changing
frequencies.
Change-Id: I9136e9245677e8fc90a628d3099aca8d63d3677c
Signed-off-by: Hanumath Prasad <hpprasad@codeaurora.org>
|
|
When tunables are not available for events other than
CPUFREQ_GOV_POLICY_INIT in cpufreq_governor_interactive(), trigger a
panic instead of throwing a warning.
When the original warning happens, some race condition must have
occurred, and governor will be in a bad state even if it might still
run for a while. Panic directly so that it's easier to catch the
first race event.
Change-Id: I2dc1185cabfe72a63739452731fe242924d2cf45
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
|
|
The above hispeed delay and min sample time delays are used to
distinguish between sporadic load changes versus steady state load
changes. The governor tried to make sure the frequency changes only
when the load change is a steady state load change.
However, when the load change is for predictable reasons like
migration, the delays only negatively affect performance and power.
Once a significant load is migrated into a CPU, it's fairly reasonable
to assume it's going to continue contributing that additional load.
Similarly once a significant load is migrated away from a CPU, it's
fairly reasonable to assume the load will be gone forever. Future
migrations can bring back a load or take it away, but the
notifications that come along with it will allow us to quickly correct
for it. For this reason, when the load change is due to a
notification, do not delay frequency changes.
Change-Id: I19ad294b599e30654fbbeb0c56e8b50b0e19198f
[junjiew@codeaurora.org: Resolved merge conflicts.]
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
|
|
When a CPU is running at policy->min, slack timer will not be scheduled.
If policy->min is reduced later, current implementation doesn't
reschedule slack timer and thus could leave CPU at a higher
frequency indefinitely as long as the CPU is idle. This behavior is
undesirable from power perspective.
Change-Id: I40bfd7c93ad3fd06e3837dc48befdc07f29c78c8
[junjiew@codeaurora.org: Resolved merge conflicts.]
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
|
|
When governor is using regular busy time tracking, cpu_load will
never exceed 100 because busy time will never exceed elapsed time in
any one sampling window. The only exception is when frequency is
reduced in middle of a window (e.g. due to thermal throttling). In
this case, cpu_load is likely irrelevant since the frequency the
governor has been voting for is already higher than what the target can
run at.
However, on a heterogeneous CPU system with scheduler input enabled
to track the load of migrated tasks, cpu_load could also exceed 100
when a task migrates from more capable CPU to slower CPU. When this
happens, governor already knows the exact frequency required to handle
this load. There is no need to progressively ramp up frequency in order
to assess the load's real demand. It's not desirable to starve such a
migrating task by forcing it through ramping up process on the slower
CPU.
Directly jump beyond hispeed_freq and ignore above_hispeed_delay if
cpu_load exceeds 100.
Change-Id: Ib87057e4f00732fad943ab595a33e3059494ef15
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
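
To see how cpu_load can exceed 100 after a migration, consider the
simplified arithmetic below. It assumes both CPUs do the same work per
cycle, which is not true on a real heterogeneous system; the function
names and numbers are hypothetical and only illustrate the idea.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Re-express a busy percentage measured at src_khz as a load on a CPU
 * currently running at dst_khz (simplification: equal work per cycle).
 */
unsigned int load_at_freq(unsigned int busy_pct, unsigned int src_khz,
			  unsigned int dst_khz)
{
	assert(dst_khz != 0);
	return (unsigned int)((uint64_t)busy_pct * src_khz / dst_khz);
}

/* A load above 100 already tells the governor the demand is known. */
bool skip_hispeed_ramp(unsigned int cpu_load)
{
	return cpu_load > 100;
}
```

A task that kept a CPU 60% busy at 2000000 kHz shows up as a load of 120
on a CPU running at 1000000 kHz, so the progressive ramp through
hispeed_freq would be skipped in this sketch.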
|
|
Current implementation of cpufreq_interactive_enable_sched_input()
returns early if use_sched_input is already enabled. This breaks
refcounting for migration notification registration. It could also
result in failure of registering migration notification after
hotplugging the entire cluster and/or suspend/resume.
Change-Id: I079b2c70b182f696cd8a883f5c8e3a37b5c6d21d
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
|
|
down_read_trylock is not always non-blocking if the same thread calls
down_write() before.
CPU1                            CPU2
                                down_read()
down_write()
  __down_write_nested()
    schedule()
      __down_read_trylock()
                                up_read()
                                  acquires sem->wait_lock
                                  __rwsem_wake_one_writer()
        tries to lock sem->wait_lock
Now CPU2 is waiting for CPU1's schedule() to complete, while holding
sem->wait_lock. CPU1 needs sem->wait_lock to continue.
This problem only happens after cpufreq_interactive introduced load
change notification that could be called within schedule().
Add a separate flag to ignore notification if current thread is in
middle of down_write(). This avoids attempting to hold sem->wait_lock.
The additional flag doesn't have any side effects because
down_read_trylock() would have failed anyway.
Change-Id: Iff97cac36c170cf6d03f36de695141289c3d6930
[junjiew@codeaurora.org: Resolved merge conflicts. Dropped changes
to code that no longer exists.]
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
|
|
Report CPU load to modules subscribed to cpufreq govinfo notification
chain every time governor timer expires to evaluate load.
Change-Id: I0b35947b1924c179649aafa0b7b93d974164af1a
[junjiew@codeaurora.org: Resolved trivial merge conflicts]
Signed-off-by: Rohit Gupta <rohgup@codeaurora.org>
|
|
Disable sample window alignment by default to match default behavior
of upstream interactive governor.
Change-Id: Ibbf4bdd4dd423f97d3a9dd5442eba78b378e66e2
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
|
|
Previously, there was a limitation in load change callback that it
can't attempt to wake up a task. Therefore the best we can do is to
schedule timer at current jiffy. The timer function will only be
executed at next timer tick. This could take up to 10ms.
Now that this limitation is removed, re-evaluate load immediately upon
receiving this callback.
Change-Id: Iab3de4705b9aae96054655b1541e32fb040f7e60
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
|
|
Make sampling window alignment optional when scheduler inputs
are not enabled.
Change-Id: If69c111a3efe219cdd1e38c1f46f03404789c0bb
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
|
|
Previously known as sampling down factor, max_freq_hysteresis
extends the period that interactive governor will stay at policy->max.
This feature is to accommodate short idle periods in an otherwise very
intensive workload.
When the feature is enabled, it ensures that once a CPU goes to max
frequency, it doesn't reduce the frequency for max_freq_hysteresis
microseconds from the time it first goes to idle.
Change-Id: Ia54985cb554f63f8c22d0b554a0a0f2ed2be038f
[junjiew@codeaurora.org: Resolved conflicts. Dropped changes to code
that no longer exists. Trivial checkpatch fix. Renamed
max_freq_idle_start_time to max_freq_hyst_start_time.]
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
|
|
Interactive governor does not have enough information about the tasks
on a CPU to make a more informed decision on the frequency the CPUs
should run at. To address this problem, modify interactive governor
to get load information from scheduler. In addition, it can get
notification from scheduler on significant load change to reevaluate
CPU frequency immediately.
Add two sysfs files to control the behavior of load evaluation:
use_sched_load:
    When enabled, governor uses load information from scheduler
    instead of busy/idle time from past window.
use_migration_notif:
    Whenever a task migrates, scheduler might send a notification
    so that governor can re-evaluate load and scale frequency.
    Governor will ignore this notification unless both
    use_sched_load and use_migration_notif are true for
    the policy group.
Change-Id: Iaf66e424c6166ec15480db027002b3a3b357d79c
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
|
|
Replace mod_timer_pinned() with del_timer(), add_timer_on().
mod_timer_pinned() always adds timer onto current CPU. Interactive
governor expects each CPU's timers to be running on the same CPU.
If cpufreq_interactive_timer_resched() is called from another CPU,
the timer will be armed on the wrong CPU.
Replacing mod_timer_pinned() with del_timer() and add_timer_on()
guarantees timers are still run on the right CPU even if another
CPU reschedules the timer. This would provide more flexibility
for future changes.
Change-Id: I3a10be37632afc0ea4e0cc9c86323b9783b216b1
[junjiew@codeaurora.org: Dropped changes that are no longer needed
due to removal of relevant code]
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
|
|
Currently, tunables are only saved to per_cpu field when
CPUFREQ_GOV_POLICY_EXIT event happens. Save tunables the moment they
are created so that per_cpu cached_tunables field always matches
the tunables in use. This is useful for modifying tunable values
across clusters.
Change-Id: I9e30d5e93d6fde1282b5450458d8a605d568a0f5
[junjiew@codeaurora.org: Resolved trivial conflicts]
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
|
|
It's more advantageous to evaluate all CPUs at same time so that
interactive governor gets a complete picture of the load on
each CPU at a specific time. It could also reduce number of speed
changes made if there are many CPUs controlled by same policy. In
addition, waking up all CPUs at same time would allow the cluster
to go into a deeper sleep state when it's idle.
Change-Id: I6915050c5339ef1af106eb906ebe4b7c618061e2
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
|
|
Interactive governor already has a per_cpu field cpuinfo to keep track
of per_cpu data. Move cached_tunables into cpuinfo.
Change-Id: I77fda0cda76b56ff949456a95f96d129d877aa7b
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
|
|
To avoid multiple frees of an allocated tunables struct during
module_exit(), the pointer to the allocated tunables should be stored in
only one of the per-CPU cached_tunables pointers.
So, in the case of per policy governor configuration, store the cached
values in the pointer of first CPU in a policy. In the case of one governor
across all policies, store it in the CPU0 pointer.
Change-Id: Id4334246491519ac91ab725a8758b2748f743bb0
Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
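
The single-owner rule can be sketched in userspace C as follows. The
array size, struct contents and function names are assumptions for the
example; the real code uses per-CPU variables rather than a plain array.

```c
#include <assert.h>
#include <stdlib.h>

#define SKETCH_NR_CPUS 4

/* Stand-in for the governor's tunables struct. */
struct tunables {
	int placeholder;
};

/* One slot per CPU; only the owning CPU's slot is ever non-NULL. */
struct tunables *cached_tunables[SKETCH_NR_CPUS];

/* Store the pointer in exactly one slot: the policy's first CPU. */
void cache_tunables(struct tunables *t, int first_cpu)
{
	assert(first_cpu >= 0 && first_cpu < SKETCH_NR_CPUS);
	cached_tunables[first_cpu] = t;
}

/* Free every cached struct once; returns how many were freed. */
int free_cached_tunables(void)
{
	int cpu, freed = 0;

	for (cpu = 0; cpu < SKETCH_NR_CPUS; cpu++) {
		if (cached_tunables[cpu]) {
			free(cached_tunables[cpu]);
			cached_tunables[cpu] = NULL;
			freed++;
		}
	}
	return freed;
}
```

Because each allocation lives in a single slot, the exit path can walk
all CPUs without freeing the same struct twice.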
|
|
Userspace might change tunable values for a governor. Currently, if
all CPUs in a policy go offline, governor frees its tunables. This
wipes out all userspace modifications. Kernel drivers can call
cpu_up/down() directly and thus userspace won't have a chance to
restore the tunables.
Permanently save tunable struct in a per_cpu field so that we
preserve tunable values across hotplug, suspend/resume and governor
switch.
Change-Id: I126b8278c8e75c8eadb3e2ddfe97fcc72cddfa23
[junjiew@codeaurora.org: Resolved merge conflicts]
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
|
|
Change-Id: I9bb41acc4c86074c2c14562f34480004184494f7
[junjiew@codeaurora.org: resolved trivial merge conflicts]
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
|
|
Many subsystems depend on cpufreq API for CPU frequency scaling.
Cpufreq API is expected to fail until cpufreq device registers.
Change pr_debug() to pr_info() so that user could determine when
cpufreq API becomes available during boot from kernel messages. This
is crucial to understand whether a cpufreq API failure is benign
during early boot.
Change-Id: Id2dfa009ae33859ec3efcdb29a3296e891852c6a
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
|
|
Governor error messages point to important failures in governor or
framework. Output triggering CPU and policy->cpu to help debugging.
Resolved conflicts for 3.18 kernel.
Change-Id: I4c5c392ec973b764ec3240bb2eb455c624bcaf63
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
|
|
cpufreq_frequency_get_table could return NULL. Do error check on the
return value instead of continuing with a potentially NULL pointer.
Change-Id: I0cb8a3a8ae3499e738683e5f45271aeadee488f6
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
|
|
__cpufreq_driver_target() checks if policy->cur is same as target_freq
without holding any lock. This function is used by governor to
directly set CPU frequency. Governor calling this function can't hold
any CPUfreq framework locks due to deadlock possibility.
However, this results in a race condition where one thread could see
a stale policy->cur while another thread is changing CPU frequency.
Thread A: Governor calls __cpufreq_driver_target(), starts increasing
frequency but hasn't sent out CPUFREQ_POSTCHANGE notification yet.
Thread B: Some other driver (could be thermal mitigation) starts
limiting frequency using cpufreq_update_policy(). All limits are
applied to policy->min/max and final policy->max happens to be same as
policy->cur. __cpufreq_driver_target() simply returns 0.
Thread A: Governor finishes scaling and now policy->cur violates
policy->max, which could last until the next CPU frequency scaling
happens.
Shifting the responsibility of checking policy->cur and target_freq
to CPUfreq device driver would resolve the race as long as the device
driver holds a common mutex.
Change-Id: I6f943228e793a4a4300c58b3ae0143e09ed01d7d
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
|
|
Some modules can benefit from getting additional information cpufreq
governors use to make frequency switch decisions.
This change lays down a basic framework that the governors can use
to report additional information (e.g. CPU load) to the clients that
subscribe to the cpufreq govinfo notifier chain.
Change-Id: I511b4bdb7d12394a31ce5352ae47553861e49303
Signed-off-by: Rohit Gupta <rohgup@codeaurora.org>
[imaund@codeaurora.org: resolved context conflicts]
Signed-off-by: Ian Maund <imaund@codeaurora.org>
|
|
Frequency table is allocated with devm_kzalloc() and thus should be
freed using devm_kfree().
Change-Id: I9c08838eadb9fc04bda9cc66596e1e0b45b3e4db
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
|
|
CPUfreq framework replaced per-cpu freq_table with per-policy
freq_table, and deprecated previous per-cpu APIs.
Fill in policy->freq_table.
Change-Id: Ifc9ac1b6695fd12629a447984dbbd57d657961b2
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
|