summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2017-06-02UPSTREAM: sched/core: Introduce SD_ASYM_CPUCAPACITY sched_domain topology flagMorten Rasmussen
Add a topology flag to the sched_domain hierarchy indicating the lowest domain level where the full range of CPU capacities is represented by the domain members for asymmetric capacity topologies (e.g. ARM big.LITTLE). The flag is intended to indicate that extra care should be taken when placing tasks on CPUs and this level spans all the different types of CPUs found in the system (no need to look further up the domain hierarchy). This information is currently only available through iterating through the capacities of all the CPUs at parent levels in the sched_domain hierarchy. SD 2 [ 0 1 2 3] SD_ASYM_CPUCAPACITY SD 1 [ 0 1] [ 2 3] !SD_ASYM_CPUCAPACITY CPU: 0 1 2 3 capacity: 756 756 1024 1024 If the topology in the example above is duplicated to create an eight CPU example with third sched_domain level on top (SD 3), this level should not have the flag set (!SD_ASYM_CPUCAPACITY) as its two group would both have all CPU capacities represented within them. Change-Id: I1526407b90567cac387419719b7d7fdc8b259a85 Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dietmar.eggemann@arm.com Cc: freedom.tan@mediatek.com Cc: keita.kobayashi.ym@renesas.com Cc: mgalbraith@suse.de Cc: sgurrappadi@nvidia.com Cc: vincent.guittot@linaro.org Cc: yuyang.du@intel.com Link: http://lkml.kernel.org/r/1469453670-2660-6-git-send-email-morten.rasmussen@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit 1f6e6c7cb9bcd58abb5ee11243e0eefe6b36fc8e) [trivial merge conflict] Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02UPSTREAM: sched/core: Remove unnecessary NULL-pointer checkMorten Rasmussen
Checking if the sched_domain pointer returned by sd_init() is NULL seems pointless as sd_init() neither checks if it is valid to begin with nor set it to NULL. Change-Id: I5e16fd0c2ca7234b097be7c95409ddb15c5e9de9 Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dietmar.eggemann@arm.com Cc: freedom.tan@mediatek.com Cc: keita.kobayashi.ym@renesas.com Cc: mgalbraith@suse.de Cc: sgurrappadi@nvidia.com Cc: vincent.guittot@linaro.org Cc: yuyang.du@intel.com Link: http://lkml.kernel.org/r/1469453670-2660-5-git-send-email-morten.rasmussen@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit 0e6d2a67a41321b3ef650b780a279a37855de08e) Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02UPSTREAM: sched/fair: Optimize find_idlest_cpu() when there is no choiceMorten Rasmussen
In the current find_idlest_group()/find_idlest_cpu() search we end up calling find_idlest_cpu() in a sched_group containing only one CPU in the end. Checking idle-states becomes pointless when there is no alternative, so bail out instead. Change-Id: Ic62bf09b53a7984143ac2431aaa69c69b204cd56 Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dietmar.eggemann@arm.com Cc: linux-kernel@vger.kernel.org Cc: mgalbraith@suse.de Cc: vincent.guittot@linaro.org Cc: yuyang.du@intel.com Link: http://lkml.kernel.org/r/1466615004-3503-4-git-send-email-morten.rasmussen@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit eaecf41f5abf80b63c8e025fcb9ee4aa203c3038) Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02BACKPORT: sched/fair: Make the use of prev_cpu consistent in the wakeup pathMorten Rasmussen
In commit: ac66f5477239 ("sched/numa: Introduce migrate_swap()") select_task_rq() got a 'cpu' argument to enable overriding of prev_cpu in special cases (NUMA task swapping). However, the select_task_rq_fair() helper functions: wake_affine() and select_idle_sibling(), still use task_cpu(p) directly to work out prev_cpu, which leads to inconsistencies. This patch passes prev_cpu (potentially overridden by NUMA code) into the helper functions to ensure prev_cpu is indeed the same CPU everywhere in the wakeup path. Change-Id: I4951c4eead2e6045e4fb34e89f6cda17d881d4d7 cc: Ingo Molnar <mingo@redhat.com> cc: Rik van Riel <riel@redhat.com> Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dietmar.eggemann@arm.com Cc: linux-kernel@vger.kernel.org Cc: mgalbraith@suse.de Cc: vincent.guittot@linaro.org Cc: yuyang.du@intel.com Link: http://lkml.kernel.org/r/1466615004-3503-3-git-send-email-morten.rasmussen@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit 772bd008cd9a1d4e8ce566f2edcc61d1c28fcbe5) [merged with Android/EAS wakeup path] Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02UPSTREAM: sched/core: Fix power to capacity renaming in commentMorten Rasmussen
It is seems that this one escaped Nico's renaming of cpu_power to cpu_capacity a while back. Change-Id: Ic2569d714db7b740f1df4ccc381ba8c1772c2793 Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dietmar.eggemann@arm.com Cc: linux-kernel@vger.kernel.org Cc: mgalbraith@suse.de Cc: vincent.guittot@linaro.org Cc: yuyang.du@intel.com Link: http://lkml.kernel.org/r/1466615004-3503-2-git-send-email-morten.rasmussen@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit bd425d4bfc7a1a6064dbbadfbac9c7eec0e426ec) Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02Partial Revert: "WIP: sched: Add cpu capacity awareness to wakeup balancing"Dietmar Eggemann
Revert the changes in find_idlest_cpu() and find_idlest_group(). Keep the infrastructure bits which are used in following EAS patches. Change-Id: Id516ca5f3e51b9a13db1ebb8de2df3aa25f9679b Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
2017-06-02Revert "WIP: sched: Consider spare cpu capacity at task wake-up"Dietmar Eggemann
This reverts commit 75a9695b619741019363f889c99c97c7bb823797. Change-Id: I846b21f2bdeb0b0ca30ad65683564ed07a429428 Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com> [ minor merge changes ] Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02FROM-LIST: cpufreq: schedutil: Redefine the rate_limit_us tunableViresh Kumar
The rate_limit_us tunable is intended to reduce the possible overhead from running the schedutil governor. However, that overhead can be divided into two separate parts: the governor computations and the invocation of the scaling driver to set the CPU frequency. The latter is where the real overhead comes from. The former is much less expensive in terms of execution time and running it every time the governor callback is invoked by the scheduler, after rate_limit_us interval has passed since the last frequency update, would not be a problem. For this reason, redefine the rate_limit_us tunable so that it means the minimum time that has to pass between two consecutive invocations of the scaling driver by the schedutil governor (to set the CPU frequency). Change-Id: Iced64116b826c25441ef537c27a3dabfcf81919e Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> [pulled from linux-pm linux-next https://patchwork.kernel.org/patch/9583949/ ] Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02cpufreq: schedutil: add up/down frequency transition rate limitsSteve Muckle
The rate-limit tunable in the schedutil governor applies to transitions to both lower and higher frequencies. On several platforms it is not the ideal tunable though, as it is difficult to get best power/performance figures using the same limit in both directions. It is common on mobile platforms with demanding user interfaces to want to increase frequency rapidly for example but decrease slowly. One of the example can be a case where we have short busy periods followed by similar or longer idle periods. If we keep the rate-limit high enough, we will not go to higher frequencies soon enough. On the other hand, if we keep it too low, we will have too many frequency transitions, as we will always reduce the frequency after the busy period. It would be very useful if we can set low rate-limit while increasing the frequency (so that we can respond to the short busy periods quickly) and high rate-limit while decreasing frequency (so that we don't reduce the frequency immediately after the short busy period and that may avoid frequency transitions before the next busy period). Implement separate up/down transition rate limits. Note that the governor avoids frequency recalculations for a period equal to minimum of up and down rate-limit. A global mutex is also defined to protect updates to min_rate_limit_us via two separate sysfs files. Note that this wouldn't change behavior of the schedutil governor for the platforms which wish to keep same values for both up and down rate limits. This is tested with the rt-app [1] on ARM Exynos, dual A15 processor platform. Testcase: Run a SCHED_OTHER thread on CPU0 which will emulate work-load for X ms of busy period out of the total period of Y ms, i.e. Y - X ms of idle period. The values of X/Y taken were: 20/40, 20/50, 20/70, i.e idle periods of 20, 30 and 50 ms respectively. These were tested against values of up/down rate limits as: 10/10 ms and 10/40 ms. For every test we noticed a performance increase of 5-10% with the schedutil governor, which was very much expected. [Viresh]: Simplified user interface and introduced min_rate_limit_us + mutex, rewrote commit log and included test results. [1] https://github.com/scheduler-tools/rt-app/ Change-Id: I18720a83855b196b8e21dcdc8deae79131635b84 Signed-off-by: Steve Muckle <smuckle.linux@gmail.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> (applied from https://marc.info/?l=linux-kernel&m=147936011103832&w=2) [trivial adaptations] Signed-off-by: Juri Lelli <juri.lelli@arm.com>
2017-06-02trace/sched: add rq utilization signal for WALTJuri Lelli
It is useful to be able to check current capacity against rq utilization signal generated by WALT (to check how a cpufreq governor is behaving for example). Add rq utilization signal (same scale as capacity) to the walt_update_ task_ravg tracepoint. Change-Id: I9aae3884a741d23ac494bef80d2303f107f135ce Signed-off-by: Juri Lelli <juri.lelli@arm.com>
2017-06-02sched/cpufreq: make schedutil use WALT signalJuri Lelli
If WALT is available and enabled, make schedutil governor use its utilization signal. Change-Id: I92bc37989447a76616e9bcc4e9e8616774fb9925 Signed-off-by: Juri Lelli <juri.lelli@arm.com> [we need to use boosted_cpu_util for schedutil, so make it not static] Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02sched: cpufreq: use rt_avg as estimate of required RT CPU capacitySteve Muckle
A policy of going to fmax on any RT activity will be detrimental for power on many platforms. Often RT accounts for only a small amount of CPU activity so sending the CPU frequency to fmax is overkill. Worse still, some platforms may not be able to even complete the CPU frequency change before the RT activity has already completed. Cpufreq governors have not treated RT activity this way in the past so it is not part of the expected semantics of the RT scheduling class. The DL class offers guarantees about task completion and could be used for this purpose. Modify the schedutil algorithm to instead use rt_avg as an estimate of RT utilization of the CPU. Based on previous work by Vincent Guittot <vincent.guittot@linaro.org>. Change-Id: I1ed605a3e2512a94d34217a8e57c3fd97cca60be Signed-off-by: Steve Muckle <smuckle@linaro.org>
2017-06-02cpufreq: schedutil: move slow path from workqueue to SCHED_FIFO taskViresh Kumar
If slow path frequency changes are conducted in a SCHED_OTHER context then they may be delayed for some amount of time, including indefinitely, when real time or deadline activity is taking place. Move the slow path to a real time kernel thread. In the future the thread should be made SCHED_DEADLINE. The RT priority is arbitrarily set to 50 for now. Hackbench results on ARM Exynos, dual core A15 platform for 10 iterations: $ hackbench -s 100 -l 100 -g 10 -f 20 Before After --------------------------------- 1.808 1.603 1.847 1.251 2.229 1.590 1.952 1.600 1.947 1.257 1.925 1.627 2.694 1.620 1.258 1.621 1.919 1.632 1.250 1.240 Average: 1.8829 1.5041 Based on initial work by Steve Muckle. Change-Id: I8f53037e94f353960c6d10abf07822d671631ef7 Signed-off-by: Steve Muckle <smuckle.linux@gmail.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> (cherry picked from 02a7b1ee3baa) [adapt to the 3.18 kthread interface] Signed-off-by: Juri Lelli <juri.lelli@arm.com>
2017-06-02BACKPORT: kthread: allow to cancel kthread workPetr Mladek
We are going to use kthread workers more widely and sometimes we will need to make sure that the work is neither pending nor running. This patch implements cancel_*_sync() operations as inspired by workqueues. Well, we are synchronized against the other operations via the worker lock, we use del_timer_sync() and a counter to count parallel cancel operations. Therefore the implementation might be easier. First, we check if a worker is assigned. If not, the work has newer been queued after it was initialized. Second, we take the worker lock. It must be the right one. The work must not be assigned to another worker unless it is initialized in between. Third, we try to cancel the timer when it exists. The timer is deleted synchronously to make sure that the timer call back is not running. We need to temporary release the worker->lock to avoid a possible deadlock with the callback. In the meantime, we set work->canceling counter to avoid any queuing. Fourth, we try to remove the work from a worker list. It might be the list of either normal or delayed works. Fifth, if the work is running, we call kthread_flush_work(). It might take an arbitrary time. We need to release the worker-lock again. In the meantime, we again block any queuing by the canceling counter. As already mentioned, the check for a pending kthread work is done under a lock. In compare with workqueues, we do not need to fight for a single PENDING bit to block other operations. Therefore we do not suffer from the thundering storm problem and all parallel canceling jobs might use kthread_flush_work(). Any queuing is blocked until the counter gets zero. Change-Id: I8a8ece0f93c828f311d0ad5c88d80db2388e4808 Link: http://lkml.kernel.org/r/1470754545-17632-10-git-send-email-pmladek@suse.com Signed-off-by: Petr Mladek <pmladek@suse.com> Acked-by: Tejun Heo <tj@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Borislav Petkov <bp@suse.de> Cc: Michal Hocko <mhocko@suse.cz> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> (cherry-picked from 37be45d49dec2a411e29d50c9597cfe8184b5645) [major changes to the original patch while cherry-picking; only rebased the sync variant] Signed-off-by: Juri Lelli <juri.lelli@arm.com>
2017-06-02sched/cpufreq: fix tunables for schedfreq governorSteve Muckle
The schedfreq governor does not currently handle cpufreq drivers which use a global set of tunables (!have_governor_per_policy). For example on x86 and using the acpi cpufreq driver, doing this cat /sys/devices/system/cpu/cpufreq/sched/up_throttle_nsec will result in a bad pointer access. Update the tunable code using the upstream schedutil tunable code by Rafael Wysocki as a guide. Includes a partial backport of the reorganized cpufreq tunable infrastructure. Change-Id: I7e6f8de1dac297077ad43f37dd2f6ddbfe921c98 Signed-off-by: Steve Muckle <smuckle@linaro.org> [fixed cherry-pick issue] Signed-off-by: Juri Lelli <juri.lelli@arm.com> [fixed cherry-pick issue] Signed-off-by: Thierry Strudel <tstrudel@google.com>
2017-06-02BACKPORT: cpufreq: schedutil: New governor based on scheduler utilization dataSteve Muckle
Add a new cpufreq scaling governor, called "schedutil", that uses scheduler-provided CPU utilization information as input for making its decisions. Doing that is possible after commit 34e2c55 (cpufreq: Add mechanism for registering utilization update callbacks) that introduced cpufreq_update_util() called by the scheduler on utilization changes (from CFS) and RT/DL task status updates. In particular, CPU frequency scaling decisions may be based on the the utilization data passed to cpufreq_update_util() by CFS. The new governor is relatively simple. The frequency selection formula used by it depends on whether or not the utilization is frequency-invariant. In the frequency-invariant case the new CPU frequency is given by next_freq = 1.25 * max_freq * util / max where util and max are the last two arguments of cpufreq_update_util(). In turn, if util is not frequency-invariant, the maximum frequency in the above formula is replaced with the current frequency of the CPU: next_freq = 1.25 * curr_freq * util / max The coefficient 1.25 corresponds to the frequency tipping point at (util / max) = 0.8. All of the computations are carried out in the utilization update handlers provided by the new governor. One of those handlers is used for cpufreq policies shared between multiple CPUs and the other one is for policies with one CPU only (and therefore it doesn't need to use any extra synchronization means). The governor supports fast frequency switching if that is supported by the cpufreq driver in use and possible for the given policy. In the fast switching case, all operations of the governor take place in its utilization update handlers. If fast switching cannot be used, the frequency switch operations are carried out with the help of a work item which only calls __cpufreq_driver_target() (under a mutex) to trigger a frequency update (to a value already computed beforehand in one of the utilization update handlers). Currently, the governor treats all of the RT and DL tasks as "unknown utilization" and sets the frequency to the allowed maximum when updated from the RT or DL sched classes. That heavy-handed approach should be replaced with something more subtle and specifically targeted at RT and DL tasks. The governor shares some tunables management code with the "ondemand" and "conservative" governors and uses some common definitions from cpufreq_governor.h, but apart from that it is stand-alone. Change-Id: I03876e622768e4b3ee4dc28682af7cce771f2f4c Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> (cherry-picked from 9bdcb44e391da5c41b98573bf0305a0e0b1c9569) [ Backport the schedutil cpufreq governor from 4.9. Some cpufreq tunable infrastructure as well as the resolve_freq API is also backported as those are dependencies] Signed-off-by: Steve Muckle <smuckle@linaro.org> [trivial cherry-picking fixes] Signed-off-by: Juri Lelli <juri.lelli@arm.com> [fixed default governor machinery] Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02sched: backport cpufreq hooks from 4.9-rc4Steve Muckle
The scheduler cpufreq hooks are required by the schedutil cpufreq governor. Change-Id: Ied6c46262bb33b7e81bbb3d3d2761124e0c676b7 Signed-off-by: Steve Muckle <smuckle@linaro.org> [trivial cherry-picking fixes] Signed-off-by: Juri Lelli <juri.lelli@arm.com> Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02ANDROID: Kconfig: add depends for UID_SYS_STATSGanesh Mahendran
uid_io depends on TASK_XACCT and TASK_IO_ACCOUNTING. So add depends in Kconfig before compiling code. Change-Id: Ie6bf57ec7c2eceffadf4da0fc2aca001ce10c36e Signed-off-by: Ganesh Mahendran <opensource.ganesh@gmail.com>
2017-06-01ANDROID: hid: uhid: implement refcount for open and closeDmitry Torokhov
Fix concurrent open and close activity sending a UHID_CLOSE while some consumers still have the device open. Temporary solution for reference counts on device open and close calls, absent a facility for this in the HID core likely to appear in the future. [toddpoynor@google.com: commit text] Bug: 38448648 Signed-off-by: Todd Poynor <toddpoynor@google.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Change-Id: I57413e42ec961a960a8ddc4942228df22c730d80
2017-05-30Revert "ext4: require encryption feature for EXT4_IOC_SET_ENCRYPTION_POLICY"Eric Biggers
This reverts commit e2968fb8e7980dccc199dac2593ad476db20969f. For various reasons, we've had to start enforcing upstream that ext4 encryption can only be used if the filesystem superblock has the EXT4_FEATURE_INCOMPAT_ENCRYPT flag set, as was the intended design. Unfortunately, Android isn't ready for this quite yet, since its userspace still needs to be updated to set the flag at mkfs time, or else fix it later with tune2fs. It will need some more time to be fixed properly, so for now to avoid breaking some devices, revert the kernel change. Bug: 36231741 Signed-off-by: Eric Biggers <ebiggers@google.com> Change-Id: I30bd54afb68dbaf9801f8954099dffa90a2f8df1
2017-05-29ANDROID: mnt: Fix next_descendentDaniel Rosenberg
next_descendent did not properly handle the case where the initial mount had no slaves. In this case, we would look for the next slave, but since don't have a master, the check for wrapping around to the start of the list will always fail. Instead, we check for this case, and ensure that we end the iteration when we come back to the root. Signed-off-by: Daniel Rosenberg <drosen@google.com> Bug: 62094374 Change-Id: I43dfcee041aa3730cb4b9a1161418974ef84812e
2017-05-25Merge 4.4.70 into android-4.4Greg Kroah-Hartman
Changes in 4.4.70 usb: misc: legousbtower: Fix buffers on stack usb: misc: legousbtower: Fix memory leak USB: ene_usb6250: fix DMA to the stack watchdog: pcwd_usb: fix NULL-deref at probe char: lp: fix possible integer overflow in lp_setup() USB: core: replace %p with %pK ARM: tegra: paz00: Mark panel regulator as enabled on boot tpm_crb: check for bad response size infiniband: call ipv6 route lookup via the stub interface dm btree: fix for dm_btree_find_lowest_key() dm raid: select the Kconfig option CONFIG_MD_RAID0 dm bufio: avoid a possible ABBA deadlock dm bufio: check new buffer allocation watermark every 30 seconds dm cache metadata: fail operations if fail_io mode has been established dm bufio: make the parameter "retain_bytes" unsigned long dm thin metadata: call precommit before saving the roots dm space map disk: fix some book keeping in the disk space map md: update slab_cache before releasing new stripes when stripes resizing rtlwifi: rtl8821ae: setup 8812ae RFE according to device type mwifiex: pcie: fix cmd_buf use-after-free in remove/reset ima: accept previously set IMA_NEW_FILE KVM: x86: Fix load damaged SSEx MXCSR register KVM: X86: Fix read out-of-bounds vulnerability in kvm pio emulation regulator: tps65023: Fix inverted core enable logic. s390/kdump: Add final note s390/cputime: fix incorrect system time ath9k_htc: Add support of AirTies 1eda:2315 AR9271 device ath9k_htc: fix NULL-deref at probe drm/amdgpu: Avoid overflows/divide-by-zero in latency_watermark calculations. drm/amdgpu: Make display watermark calculations more accurate drm/nouveau/therm: remove ineffective workarounds for alarm bugs drm/nouveau/tmr: ack interrupt before processing alarms drm/nouveau/tmr: fix corruption of the pending list when rescheduling an alarm drm/nouveau/tmr: avoid processing completed alarms when adding a new one drm/nouveau/tmr: handle races with hw when updating the next alarm time cdc-acm: fix possible invalid access when processing notification proc: Fix unbalanced hard link numbers of: fix sparse warning in of_pci_range_parser_one iio: dac: ad7303: fix channel description pid_ns: Sleep in TASK_INTERRUPTIBLE in zap_pid_ns_processes pid_ns: Fix race between setns'ed fork() and zap_pid_ns_processes() USB: serial: ftdi_sio: fix setting latency for unprivileged users USB: serial: ftdi_sio: add Olimex ARM-USB-TINY(H) PIDs ext4 crypto: don't let data integrity writebacks fail with ENOMEM ext4 crypto: fix some error handling net: qmi_wwan: Add SIMCom 7230E fscrypt: fix context consistency check when key(s) unavailable f2fs: check entire encrypted bigname when finding a dentry fscrypt: avoid collisions when presenting long encrypted filenames sched/fair: Do not announce throttled next buddy in dequeue_task_fair() sched/fair: Initialize throttle_count for new task-groups lazily usb: host: xhci-plat: propagate return value of platform_get_irq() xhci: apply PME_STUCK_QUIRK and MISSING_CAS quirk for Denverton usb: host: xhci-mem: allocate zeroed Scratchpad Buffer net: irda: irda-usb: fix firmware name on big-endian hosts usbvision: fix NULL-deref at probe mceusb: fix NULL-deref at probe ttusb2: limit messages to buffer size usb: musb: tusb6010_omap: Do not reset the other direction's packet size USB: iowarrior: fix info ioctl on big-endian hosts usb: serial: option: add Telit ME910 support USB: serial: qcserial: add more Lenovo EM74xx device IDs USB: serial: mct_u232: fix big-endian baud-rate handling USB: serial: io_ti: fix div-by-zero in set_termios USB: hub: fix SS hub-descriptor handling USB: hub: fix non-SS hub-descriptor handling ipx: call ipxitf_put() in ioctl error path iio: proximity: as3935: fix as3935_write ceph: fix recursion between ceph_set_acl() and __ceph_setattr() gspca: konica: add missing endpoint sanity check s5p-mfc: Fix unbalanced call to clock management dib0700: fix NULL-deref at probe zr364xx: enforce minimum size when reading header dvb-frontends/cxd2841er: define symbol_rate_min/max in T/C fe-ops cx231xx-audio: fix init error path cx231xx-audio: fix NULL-deref at probe cx231xx-cards: fix NULL-deref at probe powerpc/book3s/mce: Move add_taint() later in virtual mode powerpc/pseries: Fix of_node_put() underflow during DLPAR remove powerpc/64e: Fix hang when debugging programs with relocated kernel ARM: dts: at91: sama5d3_xplained: fix ADC vref ARM: dts: at91: sama5d3_xplained: not all ADC channels are available arm64: xchg: hazard against entire exchange variable arm64: uaccess: ensure extension of access_ok() addr arm64: documentation: document tagged pointer stack constraints xc2028: Fix use-after-free bug properly mm/huge_memory.c: respect FOLL_FORCE/FOLL_COW for thp staging: rtl8192e: fix 2 byte alignment of register BSSIDR. staging: rtl8192e: rtl92e_get_eeprom_size Fix read size of EPROM_CMD. iommu/vt-d: Flush the IOTLB to get rid of the initial kdump mappings metag/uaccess: Fix access_ok() metag/uaccess: Check access_ok in strncpy_from_user uwb: fix device quirk on big-endian hosts genirq: Fix chained interrupt data ordering osf_wait4(): fix infoleak tracing/kprobes: Enforce kprobes teardown after testing PCI: Fix pci_mmap_fits() for HAVE_PCI_RESOURCE_TO_USER platforms PCI: Freeze PME scan before suspending devices drm/edid: Add 10 bpc quirk for LGD 764 panel in HP zBook 17 G2 nfsd: encoders mustn't use unitialized values in error cases drivers: char: mem: Check for address space wraparound with mmap() Linux 4.4.70 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2017-05-25Linux 4.4.70Greg Kroah-Hartman
2017-05-25drivers: char: mem: Check for address space wraparound with mmap()Julius Werner
commit b299cde245b0b76c977f4291162cf668e087b408 upstream. /dev/mem currently allows mmap() mappings that wrap around the end of the physical address space, which should probably be illegal. It circumvents the existing STRICT_DEVMEM permission check because the loop immediately terminates (as the start address is already higher than the end address). On the x86_64 architecture it will then cause a panic (from the BUG(start >= end) in arch/x86/mm/pat.c:reserve_memtype()). This patch adds an explicit check to make sure offset + size will not wrap around in the physical address type. Signed-off-by: Julius Werner <jwerner@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25nfsd: encoders mustn't use unitialized values in error casesJ. Bruce Fields
commit f961e3f2acae94b727380c0b74e2d3954d0edf79 upstream. In error cases, lgp->lg_layout_type may be out of bounds; so we shouldn't be using it until after the check of nfserr. This was seen to crash nfsd threads when the server receives a LAYOUTGET request with a large layout type. GETDEVICEINFO has the same problem. Reported-by: Ari Kauppi <Ari.Kauppi@synopsys.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25drm/edid: Add 10 bpc quirk for LGD 764 panel in HP zBook 17 G2Mario Kleiner
commit e345da82bd6bdfa8492f80b3ce4370acfd868d95 upstream. The builtin eDP panel in the HP zBook 17 G2 supports 10 bpc, as advertised by the Laptops product specs and verified via injecting a fixed edid + photometer measurements, but edid reports unknown depth, so drivers fall back to 6 bpc. Add a quirk to get the full 10 bpc. Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Acked-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1492787108-23959-1-git-send-email-mario.kleiner.de@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25PCI: Freeze PME scan before suspending devicesLukas Wunner
commit ea00353f36b64375518662a8ad15e39218a1f324 upstream. Laurent Pinchart reported that the Renesas R-Car H2 Lager board (r8a7790) crashes during suspend tests. Geert Uytterhoeven managed to reproduce the issue on an M2-W Koelsch board (r8a7791): It occurs when the PME scan runs, once per second. During PME scan, the PCI host bridge (rcar-pci) registers are accessed while its module clock has already been disabled, leading to the crash. One reproducer is to configure s2ram to use "s2idle" instead of "deep" suspend: # echo 0 > /sys/module/printk/parameters/console_suspend # echo s2idle > /sys/power/mem_sleep # echo mem > /sys/power/state Another reproducer is to write either "platform" or "processors" to /sys/power/pm_test. It does not (or is less likely) to happen during full system suspend ("core" or "none") because system suspend also disables timers, and thus the workqueue handling PME scans no longer runs. Geert believes the issue may still happen in the small window between disabling module clocks and disabling timers: # echo 0 > /sys/module/printk/parameters/console_suspend # echo platform > /sys/power/pm_test # Or "processors" # echo mem > /sys/power/state (Make sure CONFIG_PCI_RCAR_GEN2 and CONFIG_USB_OHCI_HCD_PCI are enabled.) Rafael Wysocki agrees that PME scans should be suspended before the host bridge registers become inaccessible. To that end, queue the task on a workqueue that gets frozen before devices suspend. Rafael notes however that as a result, some wakeup events may be missed if they are delivered via PME from a device without working IRQ (which hence must be polled) and occur after the workqueue has been frozen. If that turns out to be an issue in practice, it may be possible to solve it by calling pci_pme_list_scan() once directly from one of the host bridge's pm_ops callbacks. Stacktrace for posterity: PM: Syncing filesystems ... [ 38.566237] done. PM: Preparing system for sleep (mem) Freezing user space processes ... [ 38.579813] (elapsed 0.001 seconds) done. Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done. PM: Suspending system (mem) PM: suspend of devices complete after 152.456 msecs PM: late suspend of devices complete after 2.809 msecs PM: noirq suspend of devices complete after 29.863 msecs suspend debug: Waiting for 5 second(s). Unhandled fault: asynchronous external abort (0x1211) at 0x00000000 pgd = c0003000 [00000000] *pgd=80000040004003, *pmd=00000000 Internal error: : 1211 [#1] SMP ARM Modules linked in: CPU: 1 PID: 20 Comm: kworker/1:1 Not tainted 4.9.0-rc1-koelsch-00011-g68db9bc814362e7f #3383 Hardware name: Generic R8A7791 (Flattened Device Tree) Workqueue: events pci_pme_list_scan task: eb56e140 task.stack: eb58e000 PC is at pci_generic_config_read+0x64/0x6c LR is at rcar_pci_cfg_base+0x64/0x84 pc : [<c041d7b4>] lr : [<c04309a0>] psr: 600d0093 sp : eb58fe98 ip : c041d750 fp : 00000008 r10: c0e2283c r9 : 00000000 r8 : 600d0013 r7 : 00000008 r6 : eb58fed6 r5 : 00000002 r4 : eb58feb4 r3 : 00000000 r2 : 00000044 r1 : 00000008 r0 : 00000000 Flags: nZCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment user Control: 30c5387d Table: 6a9f6c80 DAC: 55555555 Process kworker/1:1 (pid: 20, stack limit = 0xeb58e210) Stack: (0xeb58fe98 to 0xeb590000) fe80: 00000002 00000044 fea0: eb6f5800 c041d9b0 eb58feb4 00000008 00000044 00000000 eb78a000 eb78a000 fec0: 00000044 00000000 eb9aff00 c0424bf0 eb78a000 00000000 eb78a000 c0e22830 fee0: ea8a6fc0 c0424c5c eaae79c0 c0424ce0 eb55f380 c0e22838 eb9a9800 c0235fbc ff00: eb55f380 c0e22838 eb55f380 eb9a9800 eb9a9800 eb58e000 eb9a9824 c0e02100 ff20: eb55f398 c02366c4 eb56e140 eb5631c0 00000000 eb55f380 c023641c 00000000 ff40: 00000000 00000000 00000000 c023a928 cd105598 00000000 40506a34 eb55f380 ff60: 00000000 00000000 dead4ead ffffffff ffffffff eb58ff74 eb58ff74 00000000 ff80: 00000000 dead4ead ffffffff ffffffff eb58ff90 eb58ff90 eb58ffac eb5631c0 ffa0: c023a844 00000000 00000000 c0206d68 00000000 00000000 00000000 00000000 ffc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 ffe0: 00000000 00000000 00000000 00000000 00000013 00000000 3a81336c 10ccd1dd [<c041d7b4>] (pci_generic_config_read) from [<c041d9b0>] (pci_bus_read_config_word+0x58/0x80) [<c041d9b0>] (pci_bus_read_config_word) from [<c0424bf0>] (pci_check_pme_status+0x34/0x78) [<c0424bf0>] (pci_check_pme_status) from [<c0424c5c>] (pci_pme_wakeup+0x28/0x54) [<c0424c5c>] (pci_pme_wakeup) from [<c0424ce0>] (pci_pme_list_scan+0x58/0xb4) [<c0424ce0>] (pci_pme_list_scan) from [<c0235fbc>] (process_one_work+0x1bc/0x308) [<c0235fbc>] (process_one_work) from [<c02366c4>] (worker_thread+0x2a8/0x3e0) [<c02366c4>] (worker_thread) from [<c023a928>] (kthread+0xe4/0xfc) [<c023a928>] (kthread) from [<c0206d68>] (ret_from_fork+0x14/0x2c) Code: ea000000 e5903000 f57ff04f e3a00000 (e5843000) ---[ end trace 667d43ba3aa9e589 ]--- Fixes: df17e62e5bff ("PCI: Add support for polling PME state on suspended legacy PCI devices") Reported-and-tested-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Reported-and-tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Cc: Simon Horman <horms+renesas@verge.net.au> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Matthew Garrett <mjg59@srcf.ucam.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25PCI: Fix pci_mmap_fits() for HAVE_PCI_RESOURCE_TO_USER platformsDavid Woodhouse
commit 6bccc7f426abd640f08d8c75fb22f99483f201b4 upstream. In the PCI_MMAP_PROCFS case when the address being passed by the user is a 'user visible' resource address based on the bus window, and not the actual contents of the resource, that's what we need to be checking it against. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25tracing/kprobes: Enforce kprobes teardown after testingThomas Gleixner
commit 30e7d894c1478c88d50ce94ddcdbd7f9763d9cdd upstream. Enabling the tracer selftest triggers occasionally the warning in text_poke(), which warns when the to be modified page is not marked reserved. The reason is that the tracer selftest installs kprobes on functions marked __init for testing. These probes are removed after the tests, but that removal schedules the delayed kprobes_optimizer work, which will do the actual text poke. If the work is executed after the init text is freed, then the warning triggers. The bug can be reproduced reliably when the work delay is increased. Flush the optimizer work and wait for the optimizing/unoptimizing lists to become empty before returning from the kprobes tracer selftest. That ensures that all operations which were queued due to the probes removal have completed. Link: http://lkml.kernel.org/r/20170516094802.76a468bb@gandalf.local.home Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Fixes: 6274de498 ("kprobes: Support delayed unoptimizing") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25osf_wait4(): fix infoleakAl Viro
commit a8c39544a6eb2093c04afd5005b6192bd0e880c6 upstream. failing sys_wait4() won't fill struct rusage... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25genirq: Fix chained interrupt data orderingThomas Gleixner
commit 2c4569ca26986d18243f282dd727da27e9adae4c upstream. irq_set_chained_handler_and_data() sets up the chained interrupt and then stores the handler data. That's racy against an immediate interrupt which gets handled before the store of the handler data happened. The handler will dereference a NULL pointer and crash. Cure it by storing handler data before installing the chained handler. Reported-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25uwb: fix device quirk on big-endian hostsJohan Hovold
commit 41318a2b82f5d5fe1fb408f6d6e0b22aa557111d upstream. Add missing endianness conversion when using the USB device-descriptor idProduct field to apply a hardware quirk. Fixes: 1ba47da52712 ("uwb: add the i1480 DFU driver") Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25metag/uaccess: Check access_ok in strncpy_from_userJames Hogan
commit 3a158a62da0673db918b53ac1440845a5b64fd90 upstream. The metag implementation of strncpy_from_user() doesn't validate the src pointer, which could allow reading of arbitrary kernel memory. Add a short access_ok() check to prevent that. Its still possible for it to read across the user/kernel boundary, but it will invariably reach a NUL character after only 9 bytes, leaking only a static kernel address being loaded into D0Re0 at the beginning of __start, which is acceptable for the immediate fix. Reported-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: linux-metag@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25metag/uaccess: Fix access_ok()James Hogan
commit 8a8b56638bcac4e64cccc88bf95a0f9f4b19a2fb upstream. The __user_bad() macro used by access_ok() has a few corner cases noticed by Al Viro where it doesn't behave correctly: - The kernel range check has off by 1 errors which permit access to the first and last byte of the kernel mapped range. - The kernel range check ends at LINCORE_BASE rather than META_MEMORY_LIMIT, which is ineffective when the kernel is in global space (an extremely uncommon configuration). There are a couple of other shortcomings here too: - Access to the whole of the other address space is permitted (i.e. the global half of the address space when the kernel is in local space). This isn't ideal as it could theoretically still contain privileged mappings set up by the bootloader. - The size argument is unused, permitting user copies which start on valid pages at the end of the user address range and cross the boundary into the kernel address space (e.g. addr = 0x3ffffff0, size > 0x10). It isn't very convenient to add size checks when disallowing certain regions, and it seems far safer to be sure and explicit about what userland is able to access, so invert the logic to allow certain regions instead, and fix the off by 1 errors and missing size checks. This also allows the get_fs() == KERNEL_DS check to be more easily optimised into the user address range case. We now have 3 such allowed regions: - The user address range (incorporating the get_fs() == KERNEL_DS check). - NULL (some kernel code expects this to work, and we'll always catch the fault anyway). - The core code memory region. Fixes: 373cd784d0fc ("metag: Memory handling") Reported-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: linux-metag@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25iommu/vt-d: Flush the IOTLB to get rid of the initial kdump mappingsKarimAllah Ahmed
commit f73a7eee900e95404b61408a23a1df5c5811704c upstream. Ever since commit 091d42e43d ("iommu/vt-d: Copy translation tables from old kernel") the kdump kernel copies the IOMMU context tables from the previous kernel. Each device mappings will be destroyed once the driver for the respective device takes over. This unfortunately breaks the workflow of mapping and unmapping a new context to the IOMMU. The mapping function assumes that either: 1) Unmapping did the proper IOMMU flushing and it only ever flush if the IOMMU unit supports caching invalid entries. 2) The system just booted and the initialization code took care of flushing all IOMMU caches. This assumption is not true for the kdump kernel since the context tables have been copied from the previous kernel and translations could have been cached ever since. So make sure to flush the IOTLB as well when we destroy these old copied mappings. Cc: Joerg Roedel <joro@8bytes.org> Cc: David Woodhouse <dwmw2@infradead.org> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Anthony Liguori <aliguori@amazon.com> Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de> Acked-by: David Woodhouse <dwmw@amazon.co.uk> Fixes: 091d42e43d ("iommu/vt-d: Copy translation tables from old kernel") Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25staging: rtl8192e: rtl92e_get_eeprom_size Fix read size of EPROM_CMD.Malcolm Priestley
commit 90be652c9f157d44b9c2803f902a8839796c090d upstream. EPROM_CMD is 2 byte aligned on PCI map so calling with rtl92e_readl will return invalid data so use rtl92e_readw. The device is unable to select the right eeprom type. Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25staging: rtl8192e: fix 2 byte alignment of register BSSIDR.Malcolm Priestley
commit 867510bde14e7b7fc6dd0f50b48f6753cfbd227a upstream. BSSIDR has two byte alignment on PCI ioremap correct the write by swapping to 16 bits first. This fixes a problem that the device associates fail because the filter is not set correctly. Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25mm/huge_memory.c: respect FOLL_FORCE/FOLL_COW for thpKeno Fischer
commit 8310d48b125d19fcd9521d83b8293e63eb1646aa upstream. In commit 19be0eaffa3a ("mm: remove gup_flags FOLL_WRITE games from __get_user_pages()"), the mm code was changed from unsetting FOLL_WRITE after a COW was resolved to setting the (newly introduced) FOLL_COW instead. Simultaneously, the check in gup.c was updated to still allow writes with FOLL_FORCE set if FOLL_COW had also been set. However, a similar check in huge_memory.c was forgotten. As a result, remote memory writes to ro regions of memory backed by transparent huge pages cause an infinite loop in the kernel (handle_mm_fault sets FOLL_COW and returns 0 causing a retry, but follow_trans_huge_pmd bails out immidiately because `(flags & FOLL_WRITE) && !pmd_write(*pmd)` is true. While in this state the process is stil SIGKILLable, but little else works (e.g. no ptrace attach, no other signals). This is easily reproduced with the following code (assuming thp are set to always): #include <assert.h> #include <fcntl.h> #include <stdint.h> #include <stdio.h> #include <string.h> #include <sys/mman.h> #include <sys/stat.h> #include <sys/types.h> #include <sys/wait.h> #include <unistd.h> #define TEST_SIZE 5 * 1024 * 1024 int main(void) { int status; pid_t child; int fd = open("/proc/self/mem", O_RDWR); void *addr = mmap(NULL, TEST_SIZE, PROT_READ, MAP_ANONYMOUS | MAP_PRIVATE, 0, 0); assert(addr != MAP_FAILED); pid_t parent_pid = getpid(); if ((child = fork()) == 0) { void *addr2 = mmap(NULL, TEST_SIZE, PROT_READ | PROT_WRITE, MAP_ANONYMOUS | MAP_PRIVATE, 0, 0); assert(addr2 != MAP_FAILED); memset(addr2, 'a', TEST_SIZE); pwrite(fd, addr2, TEST_SIZE, (uintptr_t)addr); return 0; } assert(child == waitpid(child, &status, 0)); assert(WIFEXITED(status) && WEXITSTATUS(status) == 0); return 0; } Fix this by updating follow_trans_huge_pmd in huge_memory.c analogously to the update in gup.c in the original commit. The same pattern exists in follow_devmap_pmd. However, we should not be able to reach that check with FOLL_COW set, so add WARN_ONCE to make sure we notice if we ever do. [akpm@linux-foundation.org: coding-style fixes] Link: http://lkml.kernel.org/r/20170106015025.GA38411@juliacomputing.com Signed-off-by: Keno Fischer <keno@juliacomputing.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Greg Thelen <gthelen@google.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Willy Tarreau <w@1wt.eu> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Hugh Dickins <hughd@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [AmitP: Minor refactoring of upstream changes for linux-3.18.y, where follow_devmap_pmd() doesn't exist.] Signed-off-by: Amit Pundir <amit.pundir@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25xc2028: Fix use-after-free bug properlyTakashi Iwai
commit 22a1e7783e173ab3d86018eb590107d68df46c11 upstream. The commit 8dfbcc4351a0 ("[media] xc2028: avoid use after free") tried to address the reported use-after-free by clearing the reference. However, it's clearing the wrong pointer; it sets NULL to priv->ctrl.fname, but it's anyway overwritten by the next line memcpy(&priv->ctrl, p, sizeof(priv->ctrl)). OTOH, the actual code accessing the freed string is the strcmp() call with priv->fname: if (!firmware_name[0] && p->fname && priv->fname && strcmp(p->fname, priv->fname)) free_firmware(priv); where priv->fname points to the previous file name, and this was already freed by kfree(). For fixing the bug properly, this patch does the following: - Keep the copy of firmware file name in only priv->fname, priv->ctrl.fname isn't changed; - The allocation is done only when the firmware gets loaded; - The kfree() is called in free_firmware() commonly Fixes: commit 8dfbcc4351a0 ('[media] xc2028: avoid use after free') Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Amit Pundir <amit.pundir@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25arm64: documentation: document tagged pointer stack constraintsKristina Martsenko
commit f0e421b1bf7af97f026e1bb8bfe4c5a7a8c08f42 upstream. Some kernel features don't currently work if a task puts a non-zero address tag in its stack pointer, frame pointer, or frame record entries (FP, LR). For example, with a tagged stack pointer, the kernel can't deliver signals to the process, and the task is killed instead. As another example, with a tagged frame pointer or frame records, perf fails to generate call graphs or resolve symbols. For now, just document these limitations, instead of finding and fixing everything that doesn't work, as it's not known if anyone needs to use tags in these places anyway. In addition, as requested by Dave Martin, generalize the limitations into a general kernel address tag policy, and refactor tagged-pointers.txt to include it. Fixes: d50240a5f6ce ("arm64: mm: permit use of tagged pointers at EL0") Reviewed-by: Dave Martin <Dave.Martin@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25arm64: uaccess: ensure extension of access_ok() addrMark Rutland
commit a06040d7a791a9177581dcf7293941bd92400856 upstream. Our access_ok() simply hands its arguments over to __range_ok(), which implicitly assummes that the addr parameter is 64 bits wide. This isn't necessarily true for compat code, which might pass down a 32-bit address parameter. In these cases, we don't have a guarantee that the address has been zero extended to 64 bits, and the upper bits of the register may contain unknown values, potentially resulting in a suprious failure. Avoid this by explicitly casting the addr parameter to an unsigned long (as is done on other architectures), ensuring that the parameter is widened appropriately. Fixes: 0aea86a2176c ("arm64: User access library functions") Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25arm64: xchg: hazard against entire exchange variableMark Rutland
commit fee960bed5e857eb126c4e56dd9ff85938356579 upstream. The inline assembly in __XCHG_CASE() uses a +Q constraint to hazard against other accesses to the memory location being exchanged. However, the pointer passed to the constraint is a u8 pointer, and thus the hazard only applies to the first byte of the location. GCC can take advantage of this, assuming that other portions of the location are unchanged, as demonstrated with the following test case: union u { unsigned long l; unsigned int i[2]; }; unsigned long update_char_hazard(union u *u) { unsigned int a, b; a = u->i[1]; asm ("str %1, %0" : "+Q" (*(char *)&u->l) : "r" (0UL)); b = u->i[1]; return a ^ b; } unsigned long update_long_hazard(union u *u) { unsigned int a, b; a = u->i[1]; asm ("str %1, %0" : "+Q" (*(long *)&u->l) : "r" (0UL)); b = u->i[1]; return a ^ b; } The linaro 15.08 GCC 5.1.1 toolchain compiles the above as follows when using -O2 or above: 0000000000000000 <update_char_hazard>: 0: d2800001 mov x1, #0x0 // #0 4: f9000001 str x1, [x0] 8: d2800000 mov x0, #0x0 // #0 c: d65f03c0 ret 0000000000000010 <update_long_hazard>: 10: b9400401 ldr w1, [x0,#4] 14: d2800002 mov x2, #0x0 // #0 18: f9000002 str x2, [x0] 1c: b9400400 ldr w0, [x0,#4] 20: 4a000020 eor w0, w1, w0 24: d65f03c0 ret This patch fixes the issue by passing an unsigned long pointer into the +Q constraint, as we do for our cmpxchg code. This may hazard against more than is necessary, but this is better than missing a necessary hazard. Fixes: 305d454aaa29 ("arm64: atomics: implement native {relaxed, acquire, release} atomics") Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25ARM: dts: at91: sama5d3_xplained: not all ADC channels are availableLudovic Desroches
commit d3df1ec06353e51fc44563d2e7e18d42811af290 upstream. Remove ADC channels that are not available by default on the sama5d3_xplained board (resistor not populated) in order to not create confusion. Signed-off-by: Ludovic Desroches <ludovic.desroches@microchip.com> Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com> Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25ARM: dts: at91: sama5d3_xplained: fix ADC vrefLudovic Desroches
commit 9cdd31e5913c1f86dce7e201b086155b3f24896b upstream. The voltage reference for the ADC is not 3V but 3.3V since it is connected to VDDANA. Signed-off-by: Ludovic Desroches <ludovic.desroches@microchip.com> Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com> Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25powerpc/64e: Fix hang when debugging programs with relocated kernelLiuHailong
commit fd615f69a18a9d4aa5ef02a1dc83f319f75da8e7 upstream. Debug interrupts can be taken during interrupt entry, since interrupt entry does not automatically turn them off. The kernel will check whether the faulting instruction is between [interrupt_base_book3e, __end_interrupts], and if so clear MSR[DE] and return. However, when the kernel is built with CONFIG_RELOCATABLE, it can't use LOAD_REG_IMMEDIATE(r14,interrupt_base_book3e) and LOAD_REG_IMMEDIATE(r15,__end_interrupts), as they ignore relocation. Thus, if the kernel is actually running at a different address than it was built at, the address comparison will fail, and the exception entry code will hang at kernel_dbg_exc. r2(toc) is also not usable here, as r2 still holds data from the interrupted context, so LOAD_REG_ADDR() doesn't work either. So we use the *name@got* to get the EV of two labels directly. Test programs test.c shows as follows: int main(int argc, char *argv[]) { if (access("/proc/sys/kernel/perf_event_paranoid", F_OK) == -1) printf("Kernel doesn't have perf_event support\n"); } Steps to reproduce the bug, for example: 1) ./gdb ./test 2) (gdb) b access 3) (gdb) r 4) (gdb) s Signed-off-by: Liu Hailong <liu.hailong6@zte.com.cn> Signed-off-by: Jiang Xuexin <jiang.xuexin@zte.com.cn> Reviewed-by: Jiang Biao <jiang.biao2@zte.com.cn> Reviewed-by: Liu Song <liu.song11@zte.com.cn> Reviewed-by: Huang Jian <huang.jian@zte.com.cn> [scottwood: cleaned up commit message, and specified bad behavior as a hang rather than an oops to correspond to mainline kernel behavior] Fixes: 1cb6e0649248 ("powerpc/book3e: support CONFIG_RELOCATABLE") Signed-off-by: Scott Wood <oss@buserror.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25powerpc/pseries: Fix of_node_put() underflow during DLPAR removeTyrel Datwyler
commit 68baf692c435339e6295cb470ea5545cbc28160e upstream. Historically struct device_node references were tracked using a kref embedded as a struct field. Commit 75b57ecf9d1d ("of: Make device nodes kobjects so they show up in sysfs") (Mar 2014) refactored device_nodes to be kobjects such that the device tree could by more simply exposed to userspace using sysfs. Commit 0829f6d1f69e ("of: device_node kobject lifecycle fixes") (Mar 2014) followed up these changes to better control the kobject lifecycle and in particular the referecne counting via of_node_get(), of_node_put(), and of_node_init(). A result of this second commit was that it introduced an of_node_put() call when a dynamic node is detached, in of_node_remove(), that removes the initial kobj reference created by of_node_init(). Traditionally as the original dynamic device node user the pseries code had assumed responsibilty for releasing this final reference in its platform specific DLPAR detach code. This patch fixes a refcount underflow introduced by commit 0829f6d1f6, and recently exposed by the upstreaming of the recount API. Messages like the following are no longer seen in the kernel log with this patch following DLPAR remove operations of cpus and pci devices. rpadlpar_io: slot PHB 72 removed refcount_t: underflow; use-after-free. ------------[ cut here ]------------ WARNING: CPU: 5 PID: 3335 at lib/refcount.c:128 refcount_sub_and_test+0xf4/0x110 Fixes: 0829f6d1f69e ("of: device_node kobject lifecycle fixes") Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> [mpe: Make change log commit references more verbose] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25powerpc/book3s/mce: Move add_taint() later in virtual modeMahesh Salgaonkar
commit d93b0ac01a9ce276ec39644be47001873d3d183c upstream. machine_check_early() gets called in real mode. The very first time when add_taint() is called, it prints a warning which ends up calling opal call (that uses OPAL_CALL wrapper) for writing it to console. If we get a very first machine check while we are in opal we are doomed. OPAL_CALL overwrites the PACASAVEDMSR in r13 and in this case when we are done with MCE handling the original opal call will use this new MSR on it's way back to opal_return. This usually leads to unexpected behaviour or the kernel to panic. Instead move the add_taint() call later in the virtual mode where it is safe to call. This is broken with current FW level. We got lucky so far for not getting very first MCE hit while in OPAL. But easily reproducible on Mambo. Fixes: 27ea2c420cad ("powerpc: Set the correct kernel taint on machine check errors.") Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25cx231xx-cards: fix NULL-deref at probeJohan Hovold
commit 0cd273bb5e4d1828efaaa8dfd11b7928131ed149 upstream. Make sure to check the number of endpoints to avoid dereferencing a NULL-pointer or accessing memory beyond the endpoint array should a malicious device lack the expected endpoints. Fixes: e0d3bafd0258 ("V4L/DVB (10954): Add cx231xx USB driver") Cc: Sri Deevi <Srinivasa.Deevi@conexant.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25cx231xx-audio: fix NULL-deref at probeJohan Hovold
commit 65f921647f4c89a2068478c89691f39b309b58f7 upstream. Make sure to check the number of endpoints to avoid dereferencing a NULL-pointer or accessing memory beyond the endpoint array should a malicious device lack the expected endpoints. Fixes: e0d3bafd0258 ("V4L/DVB (10954): Add cx231xx USB driver") Cc: Sri Deevi <Srinivasa.Deevi@conexant.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25cx231xx-audio: fix init error pathJohan Hovold
commit fff1abc4d54e469140a699612b4db8d6397bfcba upstream. Make sure to release the snd_card also on a late allocation error. Fixes: e0d3bafd0258 ("V4L/DVB (10954): Add cx231xx USB driver") Cc: Sri Deevi <Srinivasa.Deevi@conexant.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>