summaryrefslogtreecommitdiff
path: root/kernel
AgeCommit message (Collapse)Author
2017-08-08Merge "Merge android-4.4@59ff2e1 (v4.4.78) into msm-4.4"Linux Build Service Account
2017-08-04Merge android-4.4@59ff2e1 (v4.4.78) into msm-4.4Blagovest Kolenichev
* refs/heads/tmp-59ff2e1 Linux 4.4.78 kvm: vmx: allow host to access guest MSR_IA32_BNDCFGS kvm: vmx: Check value written to IA32_BNDCFGS kvm: x86: Guest BNDCFGS requires guest MPX support kvm: vmx: Do not disable intercepts for BNDCFGS KVM: x86: disable MPX if host did not enable MPX XSAVE features tracing: Use SOFTIRQ_OFFSET for softirq dectection for more accurate results PM / QoS: return -EINVAL for bogus strings PM / wakeirq: Convert to SRCU sched/topology: Optimize build_group_mask() sched/topology: Fix overlapping sched_group_mask crypto: caam - fix signals handling crypto: sha1-ssse3 - Disable avx2 crypto: atmel - only treat EBUSY as transient if backlog crypto: talitos - Extend max key length for SHA384/512-HMAC and AEAD mm: fix overflow check in expand_upwards() tpm: Issue a TPM2_Shutdown for TPM2 devices. Add "shutdown" to "struct class". tpm: Provide strong locking for device removal tpm: Get rid of chip->pdev selftests/capabilities: Fix the test_execve test mnt: Make propagate_umount less slow for overlapping mount propagation trees mnt: In propgate_umount handle visiting mounts in any order mnt: In umount propagation reparent in a separate pass vt: fix unchecked __put_user() in tioclinux ioctls exec: Limit arg stack to at most 75% of _STK_LIM s390: reduce ELF_ET_DYN_BASE powerpc: move ELF_ET_DYN_BASE to 4GB / 4MB arm64: move ELF_ET_DYN_BASE to 4GB / 4MB arm: move ELF_ET_DYN_BASE to 4MB binfmt_elf: use ELF_ET_DYN_BASE only for PIE checkpatch: silence perl 5.26.0 unescaped left brace warnings fs/dcache.c: fix spin lockup issue on nlru->lock mm/list_lru.c: fix list_lru_count_node() to be race free kernel/extable.c: mark core_kernel_text notrace tools/lib/lockdep: Reduce MAX_LOCK_DEPTH to avoid overflowing lock_chain/: Depth parisc/mm: Ensure IRQs are off in switch_mm() parisc: DMA API: return error instead of BUG_ON for dma ops on non dma devs parisc: use compat_sys_keyctl() parisc: Report SIGSEGV instead of SIGBUS when running out of stack irqchip/gic-v3: Fix out-of-bound access in gic_set_affinity cfg80211: Check if PMKID attribute is of expected size cfg80211: Validate frequencies nested in NL80211_ATTR_SCAN_FREQUENCIES cfg80211: Define nla_policy for NL80211_ATTR_LOCAL_MESH_POWER_MODE brcmfmac: fix possible buffer overflow in brcmf_cfg80211_mgmt_tx() rds: tcp: use sock_create_lite() to create the accept socket vrf: fix bug_on triggered by rx when destroying a vrf net: ipv6: Compare lwstate in detecting duplicate nexthops ipv6: dad: don't remove dynamic addresses if link is down net: handle NAPI_GRO_FREE_STOLEN_HEAD case also in napi_frags_finish() bpf: prevent leaking pointer via xadd on unpriviledged net: prevent sign extension in dev_get_stats() tcp: reset sk_rx_dst in tcp_disconnect() net: dp83640: Avoid NULL pointer dereference. ipv6: avoid unregistering inet6_dev for loopback net/phy: micrel: configure intterupts after autoneg workaround net: sched: Fix one possible panic when no destroy callback net_sched: fix error recovery at qdisc creation ANDROID: android-verity: mark dev as rw for linear target ANDROID: sdcardfs: Remove unnecessary lock ANDROID: binder: don't check prio permissions on restore. Add BINDER_GET_NODE_DEBUG_INFO ioctl UPSTREAM: cpufreq: schedutil: Trace frequency only if it has changed UPSTREAM: cpufreq: schedutil: Avoid reducing frequency of busy CPUs prematurely UPSTREAM: cpufreq: schedutil: Refactor sugov_next_freq_shared() UPSTREAM: cpufreq: schedutil: Fix per-CPU structure initialization in sugov_start() UPSTREAM: cpufreq: schedutil: Pass sg_policy to get_next_freq() UPSTREAM: cpufreq: schedutil: move cached_raw_freq to struct sugov_policy UPSTREAM: cpufreq: schedutil: Rectify comment in sugov_irq_work() function UPSTREAM: cpufreq: schedutil: irq-work and mutex are only used in slow path UPSTREAM: cpufreq: schedutil: enable fast switch earlier UPSTREAM: cpufreq: schedutil: Avoid indented labels Linux 4.4.77 saa7134: fix warm Medion 7134 EEPROM read x86/mm/pat: Don't report PAT on CPUs that don't support it ext4: check return value of kstrtoull correctly in reserved_clusters_store staging: comedi: fix clean-up of comedi_class in comedi_init() staging: vt6556: vnt_start Fix missing call to vnt_key_init_table. tcp: fix tcp_mark_head_lost to check skb len before fragmenting md: fix super_offset endianness in super_1_rdev_size_change md: fix incorrect use of lexx_to_cpu in does_sb_need_changing perf tools: Use readdir() instead of deprecated readdir_r() again perf tests: Remove wrong semicolon in while loop in CQM test perf trace: Do not process PERF_RECORD_LOST twice perf dwarf: Guard !x86_64 definitions under #ifdef else clause perf pmu: Fix misleadingly indented assignment (whitespace) perf annotate browser: Fix behaviour of Shift-Tab with nothing focussed perf tools: Remove duplicate const qualifier perf script: Use readdir() instead of deprecated readdir_r() perf thread_map: Use readdir() instead of deprecated readdir_r() perf tools: Use readdir() instead of deprecated readdir_r() perf bench numa: Avoid possible truncation when using snprintf() perf tests: Avoid possible truncation with dirent->d_name + snprintf perf scripting perl: Fix compile error with some perl5 versions perf thread_map: Correctly size buffer used with dirent->dt_name perf intel-pt: Use __fallthrough perf top: Use __fallthrough tools strfilter: Use __fallthrough tools string: Use __fallthrough in perf_atoll() tools include: Add a __fallthrough statement mqueue: fix a use-after-free in sys_mq_notify() RDMA/uverbs: Check port number supplied by user verbs cmds KEYS: Fix an error code in request_master_key() ath10k: override CE5 config for QCA9377 x86/uaccess: Optimize copy_user_enhanced_fast_string() for short strings x86/tools: Fix gcc-7 warning in relocs.c gfs2: Fix glock rhashtable rcu bug USB: serial: qcserial: new Sierra Wireless EM7305 device ID USB: serial: option: add two Longcheer device ids pinctrl: sh-pfc: Update info pointer after SoC-specific init pinctrl: mxs: atomically switch mux and drive strength config pinctrl: sunxi: Fix SPDIF function name for A83T pinctrl: meson: meson8b: fix the NAND DQS pins pinctrl: sh-pfc: r8a7791: Fix SCIF2 pinmux data sysctl: report EINVAL if value is larger than UINT_MAX for proc_douintvec sysctl: don't print negative flag for proc_douintvec mac80211_hwsim: Replace bogus hrtimer clockid usb: Fix typo in the definition of Endpoint[out]Request usb: usbip: set buffer pointers to NULL after free Add USB quirk for HVR-950q to avoid intermittent device resets USB: serial: cp210x: add ID for CEL EM3588 USB ZigBee stick usb: dwc3: replace %p with %pK drm/virtio: don't leak bo on drm_gem_object_init failure tracing/kprobes: Allow to create probe with a module name starting with a digit mm: fix classzone_idx underflow in shrink_zones() bgmac: reset & enable Ethernet core before using it driver core: platform: fix race condition with driver_override fs: completely ignore unknown open flags fs: add a VALID_OPEN_FLAGS ANDROID: binder: add RT inheritance flag to node. ANDROID: binder: improve priority inheritance. ANDROID: binder: add min sched_policy to node. ANDROID: binder: add support for RT prio inheritance. ANDROID: binder: push new transactions to waiting threads. ANDROID: binder: remove proc waitqueue FROMLIST: binder: remove global binder lock FROMLIST: binder: fix death race conditions FROMLIST: binder: protect against stale pointers in print_binder_transaction FROMLIST: binder: protect binder_ref with outer lock FROMLIST: binder: use inner lock to protect thread accounting FROMLIST: binder: protect transaction_stack with inner lock. FROMLIST: binder: protect proc->threads with inner_lock FROMLIST: binder: protect proc->nodes with inner lock FROMLIST: binder: add spinlock to protect binder_node FROMLIST: binder: add spinlocks to protect todo lists FROMLIST: binder: use inner lock to sync work dq and node counts FROMLIST: binder: introduce locking helper functions FROMLIST: binder: use node->tmp_refs to ensure node safety FROMLIST: binder: refactor binder ref inc/dec for thread safety FROMLIST: binder: make sure accesses to proc/thread are safe FROMLIST: binder: make sure target_node has strong ref FROMLIST: binder: guarantee txn complete / errors delivered in-order FROMLIST: binder: refactor binder_pop_transaction FROMLIST: binder: use atomic for transaction_log index FROMLIST: binder: add more debug info when allocation fails. FROMLIST: binder: protect against two threads freeing buffer FROMLIST: binder: remove dead code in binder_get_ref_for_node FROMLIST: binder: don't modify thread->looper from other threads FROMLIST: binder: avoid race conditions when enqueuing txn FROMLIST: binder: refactor queue management in binder_thread_read FROMLIST: binder: add log information for binder transaction failures FROMLIST: binder: make binder_last_id an atomic FROMLIST: binder: change binder_stats to atomics FROMLIST: binder: add protection for non-perf cases FROMLIST: binder: remove binder_debug_no_lock mechanism FROMLIST: binder: move binder_alloc to separate file FROMLIST: binder: separate out binder_alloc functions FROMLIST: binder: remove unneeded cleanup code FROMLIST: binder: separate binder allocator structure from binder proc FROMLIST: binder: Use wake up hint for synchronous transactions. Revert "android: binder: move global binder state into context struct." sched: walt: fix window misalignment when HZ=300 ANDROID: android-base.cfg: remove CONFIG_CGROUP_DEBUG ANDROID: sdcardfs: use mount_nodev and fix a issue in sdcardfs_kill_sb Conflicts: drivers/android/binder.c drivers/net/wireless/ath/ath10k/pci.c Change-Id: Ic6f82c2ec9929733a16a03bb3b745187e002f4f6 Signed-off-by: Blagovest Kolenichev <bkolenichev@codeaurora.org>
2017-08-03Merge "perf/core: Fix crash in perf_event_read()"Linux Build Service Account
2017-08-03Merge "sched: avoid RT tasks contention during sched boost"Linux Build Service Account
2017-08-02perf/core: Fix crash in perf_event_read()Peter Zijlstra
Alexei had his box explode because doing read() on a package (rapl/uncore) event that isn't currently scheduled in ends up doing an out-of-bounds load. Rework the code to more explicitly deal with event->oncpu being -1. Author: Peter Zijlstra (Intel) <peterz@infradead.org> Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com> Tested-by: Alexei Starovoitov <ast@kernel.org> Tested-by: David Carrillo-Cisneros <davidcc@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: eranian@google.com Fixes: d6a2f9035bfc ("perf/core: Introduce PMU_EV_CAP_READ_ACTIVE_PKG") Signed-off-by: Ingo Molnar <mingo@kernel.org> Git-commit: 451d24d1e5f40bad000fa9abe36ddb16fc9928cb Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git [pfay@codeaurora.org: apply the event->oncpu validity check from from the patch. Other code from the patch calls routines not yet in 4.4 so omit that part of patch. This code fixes segfault crashes during reboot where the event->oncpu value is -1. Change-Id: I040f0af2030e53ac3329e4b3a1bbcd37f080cdcf Signed-off-by: Patrick Fay <pfay@codeaurora.org>
2017-07-31Revert "perf: stop deadlock if attempt to bring cpu up fails"Imran Khan
This reverts 'commit 5f71e693df3a ("perf: stop deadlock if attempt to bring cpu up fails")' as this change is not needed. Change-Id: I17e6f7c1b648a5f2559eeea786efafc9be32a9e9 Signed-off-by: Imran Khan <kimran@codeaurora.org>
2017-07-27rwsem: fix missed wakeup due to reordering of loadPrateek Sood
If a spinner is present, there is a chance that the load of rwsem_has_spinner() in rwsem_wake() can be reordered with respect to decrement of rwsem count in __up_write() leading to wakeup being missed. spinning writer up_write caller --------------- ----------------------- [S] osq_unlock() [L] osq spin_lock(wait_lock) sem->count=0xFFFFFFFF00000001 +0xFFFFFFFF00000000 count=sem->count MB sem->count=0xFFFFFFFE00000001 -0xFFFFFFFF00000001 RMB spin_trylock(wait_lock) return rwsem_try_write_lock(count) spin_unlock(wait_lock) schedule() Reordering of atomic_long_sub_return_release() in __up_write() and rwsem_has_spinner() in rwsem_wake() can cause missing of wakeup in up_write() context. In spinning writer, sem->count and local variable count is 0XFFFFFFFE00000001. It would result in rwsem_try_write_lock() failing to acquire rwsem and spinning writer going to sleep in rwsem_down_write_failed(). The smp_rmb() will make sure that the spinner state is consulted after sem->count is updated in up_write context. Change-Id: I96de9a65adedb35d1ee2c6c36dc7759c9b8f5d4d Signed-off-by: Prateek Sood <prsood@codeaurora.org>
2017-07-25hotplug cpu: ratelimit logs for thermal vetoPrateek Sood
Thermal notifier callback is not allowing CPU to come online. Rate limit logs to avoid watchdog non-secure bite as it is a valid rejection due to high temperature of SOC. Change-Id: If3f8df7370e6ffd18b50e7451431d6a26023359d Signed-off-by: Prateek Sood <prsood@codeaurora.org>
2017-07-21Merge 4.4.78 into android-4.4Greg Kroah-Hartman
Changes in 4.4.78 net_sched: fix error recovery at qdisc creation net: sched: Fix one possible panic when no destroy callback net/phy: micrel: configure intterupts after autoneg workaround ipv6: avoid unregistering inet6_dev for loopback net: dp83640: Avoid NULL pointer dereference. tcp: reset sk_rx_dst in tcp_disconnect() net: prevent sign extension in dev_get_stats() bpf: prevent leaking pointer via xadd on unpriviledged net: handle NAPI_GRO_FREE_STOLEN_HEAD case also in napi_frags_finish() ipv6: dad: don't remove dynamic addresses if link is down net: ipv6: Compare lwstate in detecting duplicate nexthops vrf: fix bug_on triggered by rx when destroying a vrf rds: tcp: use sock_create_lite() to create the accept socket brcmfmac: fix possible buffer overflow in brcmf_cfg80211_mgmt_tx() cfg80211: Define nla_policy for NL80211_ATTR_LOCAL_MESH_POWER_MODE cfg80211: Validate frequencies nested in NL80211_ATTR_SCAN_FREQUENCIES cfg80211: Check if PMKID attribute is of expected size irqchip/gic-v3: Fix out-of-bound access in gic_set_affinity parisc: Report SIGSEGV instead of SIGBUS when running out of stack parisc: use compat_sys_keyctl() parisc: DMA API: return error instead of BUG_ON for dma ops on non dma devs parisc/mm: Ensure IRQs are off in switch_mm() tools/lib/lockdep: Reduce MAX_LOCK_DEPTH to avoid overflowing lock_chain/: Depth kernel/extable.c: mark core_kernel_text notrace mm/list_lru.c: fix list_lru_count_node() to be race free fs/dcache.c: fix spin lockup issue on nlru->lock checkpatch: silence perl 5.26.0 unescaped left brace warnings binfmt_elf: use ELF_ET_DYN_BASE only for PIE arm: move ELF_ET_DYN_BASE to 4MB arm64: move ELF_ET_DYN_BASE to 4GB / 4MB powerpc: move ELF_ET_DYN_BASE to 4GB / 4MB s390: reduce ELF_ET_DYN_BASE exec: Limit arg stack to at most 75% of _STK_LIM vt: fix unchecked __put_user() in tioclinux ioctls mnt: In umount propagation reparent in a separate pass mnt: In propgate_umount handle visiting mounts in any order mnt: Make propagate_umount less slow for overlapping mount propagation trees selftests/capabilities: Fix the test_execve test tpm: Get rid of chip->pdev tpm: Provide strong locking for device removal Add "shutdown" to "struct class". tpm: Issue a TPM2_Shutdown for TPM2 devices. mm: fix overflow check in expand_upwards() crypto: talitos - Extend max key length for SHA384/512-HMAC and AEAD crypto: atmel - only treat EBUSY as transient if backlog crypto: sha1-ssse3 - Disable avx2 crypto: caam - fix signals handling sched/topology: Fix overlapping sched_group_mask sched/topology: Optimize build_group_mask() PM / wakeirq: Convert to SRCU PM / QoS: return -EINVAL for bogus strings tracing: Use SOFTIRQ_OFFSET for softirq dectection for more accurate results KVM: x86: disable MPX if host did not enable MPX XSAVE features kvm: vmx: Do not disable intercepts for BNDCFGS kvm: x86: Guest BNDCFGS requires guest MPX support kvm: vmx: Check value written to IA32_BNDCFGS kvm: vmx: allow host to access guest MSR_IA32_BNDCFGS Linux 4.4.78 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2017-07-21tracing: Use SOFTIRQ_OFFSET for softirq dectection for more accurate resultsPavankumar Kondeti
commit c59f29cb144a6a0dfac16ede9dc8eafc02dc56ca upstream. The 's' flag is supposed to indicate that a softirq is running. This can be detected by testing the preempt_count with SOFTIRQ_OFFSET. The current code tests the preempt_count with SOFTIRQ_MASK, which would be true even when softirqs are disabled but not serving a softirq. Link: http://lkml.kernel.org/r/1481300417-3564-1-git-send-email-pkondeti@codeaurora.org Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Amit Pundir <amit.pundir@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-21sched/topology: Optimize build_group_mask()Lauro Ramos Venancio
commit f32d782e31bf079f600dcec126ed117b0577e85c upstream. The group mask is always used in intersection with the group CPUs. So, when building the group mask, we don't have to care about CPUs that are not part of the group. Signed-off-by: Lauro Ramos Venancio <lvenanci@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: lwang@redhat.com Cc: riel@redhat.com Link: http://lkml.kernel.org/r/1492717903-5195-2-git-send-email-lvenanci@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-21sched/topology: Fix overlapping sched_group_maskPeter Zijlstra
commit 73bb059f9b8a00c5e1bf2f7ca83138c05d05e600 upstream. The point of sched_group_mask is to select those CPUs from sched_group_cpus that can actually arrive at this balance domain. The current code gets it wrong, as can be readily demonstrated with a topology like: node 0 1 2 3 0: 10 20 30 20 1: 20 10 20 30 2: 30 20 10 20 3: 20 30 20 10 Where (for example) domain 1 on CPU1 ends up with a mask that includes CPU0: [] CPU1 attaching sched-domain: [] domain 0: span 0-2 level NUMA [] groups: 1 (mask: 1), 2, 0 [] domain 1: span 0-3 level NUMA [] groups: 0-2 (mask: 0-2) (cpu_capacity: 3072), 0,2-3 (cpu_capacity: 3072) This causes sched_balance_cpu() to compute the wrong CPU and consequently should_we_balance() will terminate early resulting in missed load-balance opportunities. The fixed topology looks like: [] CPU1 attaching sched-domain: [] domain 0: span 0-2 level NUMA [] groups: 1 (mask: 1), 2, 0 [] domain 1: span 0-3 level NUMA [] groups: 0-2 (mask: 1) (cpu_capacity: 3072), 0,2-3 (cpu_capacity: 3072) (note: this relies on OVERLAP domains to always have children, this is true because the regular topology domains are still here -- this is before degenerate trimming) Debugged-by: Lauro Ramos Venancio <lvenanci@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Fixes: e3589f6c81e4 ("sched: Allow for overlapping sched_domain spans") Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-21kernel/extable.c: mark core_kernel_text notraceMarcin Nowakowski
commit c0d80ddab89916273cb97114889d3f337bc370ae upstream. core_kernel_text is used by MIPS in its function graph trace processing, so having this method traced leads to an infinite set of recursive calls such as: Call Trace: ftrace_return_to_handler+0x50/0x128 core_kernel_text+0x10/0x1b8 prepare_ftrace_return+0x6c/0x114 ftrace_graph_caller+0x20/0x44 return_to_handler+0x10/0x30 return_to_handler+0x0/0x30 return_to_handler+0x0/0x30 ftrace_ops_no_ops+0x114/0x1bc core_kernel_text+0x10/0x1b8 core_kernel_text+0x10/0x1b8 core_kernel_text+0x10/0x1b8 ftrace_ops_no_ops+0x114/0x1bc core_kernel_text+0x10/0x1b8 prepare_ftrace_return+0x6c/0x114 ftrace_graph_caller+0x20/0x44 (...) Mark the function notrace to avoid it being traced. Link: http://lkml.kernel.org/r/1498028607-6765-1-git-send-email-marcin.nowakowski@imgtec.com Signed-off-by: Marcin Nowakowski <marcin.nowakowski@imgtec.com> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Meyer <thomas@m3y3r.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-21bpf: prevent leaking pointer via xadd on unpriviledgedDaniel Borkmann
commit 6bdf6abc56b53103324dfd270a86580306e1a232 upstream. Leaking kernel addresses on unpriviledged is generally disallowed, for example, verifier rejects the following: 0: (b7) r0 = 0 1: (18) r2 = 0xffff897e82304400 3: (7b) *(u64 *)(r1 +48) = r2 R2 leaks addr into ctx Doing pointer arithmetic on them is also forbidden, so that they don't turn into unknown value and then get leaked out. However, there's xadd as a special case, where we don't check the src reg for being a pointer register, e.g. the following will pass: 0: (b7) r0 = 0 1: (7b) *(u64 *)(r1 +48) = r0 2: (18) r2 = 0xffff897e82304400 ; map 4: (db) lock *(u64 *)(r1 +48) += r2 5: (95) exit We could store the pointer into skb->cb, loose the type context, and then read it out from there again to leak it eventually out of a map value. Or more easily in a different variant, too: 0: (bf) r6 = r1 1: (7a) *(u64 *)(r10 -8) = 0 2: (bf) r2 = r10 3: (07) r2 += -8 4: (18) r1 = 0x0 6: (85) call bpf_map_lookup_elem#1 7: (15) if r0 == 0x0 goto pc+3 R0=map_value(ks=8,vs=8,id=0),min_value=0,max_value=0 R6=ctx R10=fp 8: (b7) r3 = 0 9: (7b) *(u64 *)(r0 +0) = r3 10: (db) lock *(u64 *)(r0 +0) += r6 11: (b7) r0 = 0 12: (95) exit from 7 to 11: R0=inv,min_value=0,max_value=0 R6=ctx R10=fp 11: (b7) r0 = 0 12: (95) exit Prevent this by checking xadd src reg for pointer types. Also add a couple of test cases related to this. Fixes: 1be7f75d1668 ("bpf: enable non-root eBPF programs") Fixes: 17a5267067f3 ("bpf: verifier (add verifier core)") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-18UPSTREAM: cpufreq: schedutil: Trace frequency only if it has changedChris Redpath
sugov_update_commit() calls trace_cpu_frequency() to record the current CPU frequency if it has not changed in the fast switch case to prevent utilities from getting confused (they may report that the CPU is idle if the frequency has not been recorded for too long, for example). However, that may cause the tracepoint to be triggered quite often for no real reason (if the frequency doesn't change, we will not modify the last update time stamp and governor computations may run again shortly when that happens), so don't do that (arguably, it is done to work around a utilities bug anyway). That allows code duplication in sugov_update_commit() to be reduced somewhat too. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> (cherry picked from commit 38d4ea229d25d30be6bf41bcd6cd663a587866ca) (conflicts with sugov_up_down_rate_limit resolved) Signed-off-by: Chris Redpath <chris.redpath@arm.com> Change-Id: Ia019dda29b8c1c4cf3553da75c88d066eb5674e9
2017-07-18UPSTREAM: cpufreq: schedutil: Avoid reducing frequency of busy CPUs prematurelyChris Redpath
The way the schedutil governor uses the PELT metric causes it to underestimate the CPU utilization in some cases. That can be easily demonstrated by running kernel compilation on a Sandy Bridge Intel processor, running turbostat in parallel with it and looking at the values written to the MSR_IA32_PERF_CTL register. Namely, the expected result would be that when all CPUs were 100% busy, all of them would be requested to run in the maximum P-state, but observation shows that this clearly isn't the case. The CPUs run in the maximum P-state for a while and then are requested to run slower and go back to the maximum P-state after a while again. That causes the actual frequency of the processor to visibly oscillate below the sustainable maximum in a jittery fashion which clearly is not desirable. That has been attributed to CPU utilization metric updates on task migration that cause the total utilization value for the CPU to be reduced by the utilization of the migrated task. If that happens, the schedutil governor may see a CPU utilization reduction and will attempt to reduce the CPU frequency accordingly right away. That may be premature, though, for example if the system is generally busy and there are other runnable tasks waiting to be run on that CPU already. This is unlikely to be an issue on systems where cpufreq policies are shared between multiple CPUs, because in those cases the policy utilization is computed as the maximum of the CPU utilization values over the whole policy and if that turns out to be low, reducing the frequency for the policy most likely is a good idea anyway. On systems with one CPU per policy, however, it may affect performance adversely and even lead to increased energy consumption in some cases. On those systems it may be addressed by taking another utilization metric into consideration, like whether or not the CPU whose frequency is about to be reduced has been idle recently, because if that's not the case, the CPU is likely to be busy in the near future and its frequency should not be reduced. To that end, use the counter of idle calls in the timekeeping code. Namely, make the schedutil governor look at that counter for the current CPU every time before its frequency is about to be reduced. If the counter has not changed since the previous iteration of the governor computations for that CPU, the CPU has been busy for all that time and its frequency should not be decreased, so if the new frequency would be lower than the one set previously, the governor will skip the frequency update. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Joel Fernandes <joelaf@google.com> (cherry picked from commit b7eaf1aab9f8bd2e49fceed77ebc66c1b5800718) (simple CPUFREQ_RT_DL vs CPUFREQ_DL usage conflicts) Signed-off-by: Chris Redpath <chris.redpath@arm.com> Change-Id: I531ec02c052944ee07a904dc2a25c59948ee762b
2017-07-18UPSTREAM: cpufreq: schedutil: Refactor sugov_next_freq_shared()Chris Redpath
The loop in sugov_next_freq_shared() contains an if block to skip the loop for the current CPU. This turns out to be an unnecessary conditional in the scheduler's hot-path for every CPU in the policy. It would be better to drop the conditional and make the loop treat all the CPUs in the same way. That would eliminate the need of calling sugov_iowait_boost() at the top of the routine. To keep the code optimized to return early if the current CPU has RT/DL flags set, move the flags check to sugov_update_shared() instead in order to avoid the function call entirely. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> (cherry picked from commit cba1dfb57b94c234728b689d9b00d4267fa1a879) (modified for SCHED_CPUFREQ_DL vs SCHED_CPUFREQ_RT) Signed-off-by: Chris Redpath <chris.redpath@arm.com> Change-Id: Ie046fdc8eda46821356750edd0fb6f7d077af363
2017-07-18UPSTREAM: cpufreq: schedutil: Fix per-CPU structure initialization in ↵Chris Redpath
sugov_start() sugov_start() only initializes struct sugov_cpu per-CPU structures for shared policies, but it should do that for single-CPU policies too. That in particular makes the IO-wait boost mechanism work in the cases when cpufreq policies correspond to individual CPUs. Fixes: 21ca6d2c52f8 (cpufreq: schedutil: Add iowait boosting) Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: 4.9+ <stable@vger.kernel.org> # 4.9+ (cherry picked from commit 4296f23ed49a15d36949458adcc66ff993dee2a8) (we use SCHED_CPUFREQ_DL instead of SCHED_CPUFREQ_RT in cpu->flags) Signed-off-by: Chris Redpath <chris.redpath@arm.com> Change-Id: I5b837a0ee4432115d85caa1a9808ea61e1e1b07f
2017-07-18UPSTREAM: cpufreq: schedutil: Pass sg_policy to get_next_freq()Viresh Kumar
get_next_freq() uses sg_cpu only to get sg_policy, which the callers of get_next_freq() already have. Pass sg_policy instead of sg_cpu to get_next_freq(), to make it more efficient. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> (cherry picked from commit 655cb1ebff4b7918fc560502c3297af2d3c7d114) Signed-off-by: Chris Redpath <chris.redpath@arm.com> Change-Id: Ia210058da32930a6cdb18258aa679cd1a44a747e
2017-07-18UPSTREAM: cpufreq: schedutil: move cached_raw_freq to struct sugov_policyChris Redpath
cached_raw_freq applies to the entire cpufreq policy and not individual CPUs. Apart from wasting per-cpu memory, it is actually wrong to keep it in struct sugov_cpu as we may end up comparing next_freq with a stale cached_raw_freq of a random CPU. Move cached_raw_freq to struct sugov_policy. Fixes: 5cbea46984d6 (cpufreq: schedutil: map raw required frequency to driver frequency) Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> (cherry-picked from 6c4f0fa643cb9e775dcc976e3db00d649468ff1d) Signed-off-by: Chris Redpath <chris.redpath@arm.com> Change-Id: Ie91420f710819b383947f9031da9be1f3bb7f636
2017-07-18UPSTREAM: cpufreq: schedutil: Rectify comment in sugov_irq_work() functionViresh Kumar
This patch rectifies a comment present in sugov_irq_work() function to follow proper grammar. Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> (cherry picked from commit d06e622d3d9206e6a2cc45a0f9a3256da8773ff4) Signed-off-by: Chris Redpath <chris.redpath@arm.com> Change-Id: Iaf996445d411725639d511432cc424086892a146
2017-07-18UPSTREAM: cpufreq: schedutil: irq-work and mutex are only used in slow pathChris Redpath
Execute the irq-work specific initialization/exit code only when the fast path isn't available. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> (cherry picked from commit 21ef57297b15a49b0c4dd4e7135c1a08e9a29a1c) Signed-off-by: Chris Redpath <chris.redpath@arm.com> Change-Id: Icfd68f455ef71846d799fcd2d8ec6aa1bf59573e
2017-07-18UPSTREAM: cpufreq: schedutil: enable fast switch earlierChris Redpath
The fast_switch_enabled flag will be used by both sugov_policy_alloc() and sugov_policy_free() with a later patch. Prepare for that by moving the calls to enable and disable it to the beginning of sugov_init() and end of sugov_exit(). Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> (cherry picked from commit 4a71ce4348bb61740d411822357061f8bf870f4c) Signed-off-by: Chris Redpath <chris.redpath@arm.com> Change-Id: Ia174f423ca02d59360657ac2e77a5098ce5cf99c
2017-07-18UPSTREAM: cpufreq: schedutil: Avoid indented labelsChris Redpath
Switch to the more common practice of writing labels. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> (cherry picked from commit 8e2ddb03643eb9d0bc4926946d7ce0d308eef0a5) Signed-off-by: Chris Redpath <chris.redpath@arm.com> Change-Id: Ida75c99cf3dff5cae24d3866454c83bcdb3385b9
2017-07-17Merge "Revert "sched: Remove synchronize rcu/sched calls from _cpu_down""Linux Build Service Account
2017-07-15Merge 4.4.77 into android-4.4Greg Kroah-Hartman
Changes in 4.4.77 fs: add a VALID_OPEN_FLAGS fs: completely ignore unknown open flags driver core: platform: fix race condition with driver_override bgmac: reset & enable Ethernet core before using it mm: fix classzone_idx underflow in shrink_zones() tracing/kprobes: Allow to create probe with a module name starting with a digit drm/virtio: don't leak bo on drm_gem_object_init failure usb: dwc3: replace %p with %pK USB: serial: cp210x: add ID for CEL EM3588 USB ZigBee stick Add USB quirk for HVR-950q to avoid intermittent device resets usb: usbip: set buffer pointers to NULL after free usb: Fix typo in the definition of Endpoint[out]Request mac80211_hwsim: Replace bogus hrtimer clockid sysctl: don't print negative flag for proc_douintvec sysctl: report EINVAL if value is larger than UINT_MAX for proc_douintvec pinctrl: sh-pfc: r8a7791: Fix SCIF2 pinmux data pinctrl: meson: meson8b: fix the NAND DQS pins pinctrl: sunxi: Fix SPDIF function name for A83T pinctrl: mxs: atomically switch mux and drive strength config pinctrl: sh-pfc: Update info pointer after SoC-specific init USB: serial: option: add two Longcheer device ids USB: serial: qcserial: new Sierra Wireless EM7305 device ID gfs2: Fix glock rhashtable rcu bug x86/tools: Fix gcc-7 warning in relocs.c x86/uaccess: Optimize copy_user_enhanced_fast_string() for short strings ath10k: override CE5 config for QCA9377 KEYS: Fix an error code in request_master_key() RDMA/uverbs: Check port number supplied by user verbs cmds mqueue: fix a use-after-free in sys_mq_notify() tools include: Add a __fallthrough statement tools string: Use __fallthrough in perf_atoll() tools strfilter: Use __fallthrough perf top: Use __fallthrough perf intel-pt: Use __fallthrough perf thread_map: Correctly size buffer used with dirent->dt_name perf scripting perl: Fix compile error with some perl5 versions perf tests: Avoid possible truncation with dirent->d_name + snprintf perf bench numa: Avoid possible truncation when using snprintf() perf tools: Use readdir() instead of deprecated readdir_r() perf thread_map: Use readdir() instead of deprecated readdir_r() perf script: Use readdir() instead of deprecated readdir_r() perf tools: Remove duplicate const qualifier perf annotate browser: Fix behaviour of Shift-Tab with nothing focussed perf pmu: Fix misleadingly indented assignment (whitespace) perf dwarf: Guard !x86_64 definitions under #ifdef else clause perf trace: Do not process PERF_RECORD_LOST twice perf tests: Remove wrong semicolon in while loop in CQM test perf tools: Use readdir() instead of deprecated readdir_r() again md: fix incorrect use of lexx_to_cpu in does_sb_need_changing md: fix super_offset endianness in super_1_rdev_size_change tcp: fix tcp_mark_head_lost to check skb len before fragmenting staging: vt6556: vnt_start Fix missing call to vnt_key_init_table. staging: comedi: fix clean-up of comedi_class in comedi_init() ext4: check return value of kstrtoull correctly in reserved_clusters_store x86/mm/pat: Don't report PAT on CPUs that don't support it saa7134: fix warm Medion 7134 EEPROM read Linux 4.4.77 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2017-07-15sysctl: report EINVAL if value is larger than UINT_MAX for proc_douintvecLiping Zhang
commit 425fffd886bae3d127a08fa6a17f2e31e24ed7ff upstream. Currently, inputting the following command will succeed but actually the value will be truncated: # echo 0x12ffffffff > /proc/sys/net/ipv4/tcp_notsent_lowat This is not friendly to the user, so instead, we should report error when the value is larger than UINT_MAX. Fixes: e7d316a02f68 ("sysctl: handle error writing UINT_MAX to u32 fields") Signed-off-by: Liping Zhang <zlpnobody@gmail.com> Cc: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-15sysctl: don't print negative flag for proc_douintvecLiping Zhang
commit 5380e5644afbba9e3d229c36771134976f05c91e upstream. I saw some very confusing sysctl output on my system: # cat /proc/sys/net/core/xfrm_aevent_rseqth -2 # cat /proc/sys/net/core/xfrm_aevent_etime -10 # cat /proc/sys/net/ipv4/tcp_notsent_lowat -4294967295 Because we forget to set the *negp flag in proc_douintvec, so it will become a garbage value. Since the value related to proc_douintvec is always an unsigned integer, so we can set *negp to false explictily to fix this issue. Fixes: e7d316a02f68 ("sysctl: handle error writing UINT_MAX to u32 fields") Signed-off-by: Liping Zhang <zlpnobody@gmail.com> Cc: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-15tracing/kprobes: Allow to create probe with a module name starting with a digitSabrina Dubroca
commit 9e52b32567126fe146f198971364f68d3bc5233f upstream. Always try to parse an address, since kstrtoul() will safely fail when given a symbol as input. If that fails (which will be the case for a symbol), try to parse a symbol instead. This allows creating a probe such as: p:probe/vlan_gro_receive 8021q:vlan_gro_receive+0 Which is necessary for this command to work: perf probe -m 8021q -a vlan_gro_receive Link: http://lkml.kernel.org/r/fd72d666f45b114e2c5b9cf7e27b91de1ec966f1.1498122881.git.sd@queasysnail.net Fixes: 413d37d1e ("tracing: Add kprobe-based event tracer") Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-14sched: avoid RT tasks contention during sched boostPavankumar Kondeti
When placement boost is active, we are currently considering only the highest capacity cluster. If all of the active CPUs in this cluster are busy with RT tasks, the waking task is placed on it's previous CPU, which may be running a RT task. This results in suboptimal performance. Fix this by expanding the search to the other clusters, when there is no eligible CPU found in the highest capacity cluster. Change-Id: Iaab2e397b994c2b219dc086c7a6fa91ca26a5128 Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2017-07-12sched: walt: fix window misalignment when HZ=300Joonwoo Park
Due to rounding error hrtimer tick interval becomes 3333333 ns when HZ=300. Consequently the tick time stamp nearest to the WALT's default window size 20ms will be also 19999998 (3333333 * 6). Change-Id: I08f9bd2dbecccbb683e4490d06d8b0da703d3ab2 Suggested-by: Joel Fernandes <joelaf@google.com> Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2017-07-12Merge "Merge android-4.4@64a73ff (v4.4.76) into msm-4.4"Linux Build Service Account
2017-07-11Merge "Merge android-4.4@8c91412 (v4.4.75) into msm-4.4"Linux Build Service Account
2017-07-10Merge android-4.4@64a73ff (v4.4.76) into msm-4.4Blagovest Kolenichev
* refs/heads/tmp-64a73ff: Linux 4.4.76 KVM: nVMX: Fix exception injection KVM: x86: zero base3 of unusable segments KVM: x86/vPMU: fix undefined shift in intel_pmu_refresh() KVM: x86: fix emulation of RSM and IRET instructions cpufreq: s3c2416: double free on driver init error path iommu/amd: Fix incorrect error handling in amd_iommu_bind_pasid() iommu: Handle default domain attach failure iommu/vt-d: Don't over-free page table directories ocfs2: o2hb: revert hb threshold to keep compatible x86/mm: Fix flush_tlb_page() on Xen x86/mpx: Correctly report do_mpx_bt_fault() failures to user-space ARM: 8685/1: ensure memblock-limit is pmd-aligned ARM64/ACPI: Fix BAD_MADT_GICC_ENTRY() macro implementation sched/loadavg: Avoid loadavg spikes caused by delayed NO_HZ accounting watchdog: bcm281xx: Fix use of uninitialized spinlock. xfrm: Oops on error in pfkey_msg2xfrm_state() xfrm: NULL dereference on allocation failure xfrm: fix stack access out of bounds with CONFIG_XFRM_SUB_POLICY jump label: fix passing kbuild_cflags when checking for asm goto support ravb: Fix use-after-free on `ifconfig eth0 down` sctp: check af before verify address in sctp_addr_id2transport net/mlx4_core: Eliminate warning messages for SRQ_LIMIT under SRIOV perf probe: Fix to show correct locations for events on modules be2net: fix status check in be_cmd_pmac_add() s390/ctl_reg: make __ctl_load a full memory barrier swiotlb: ensure that page-sized mappings are page-aligned coredump: Ensure proper size of sparse core files x86/mpx: Use compatible types in comparison to fix sparse error mac80211: initialize SMPS field in HT capabilities spi: davinci: use dma_mapping_error() scsi: lpfc: avoid double free of resource identifiers HID: i2c-hid: Add sleep between POWER ON and RESET kernel/panic.c: add missing \n ibmveth: Add a proper check for the availability of the checksum features vxlan: do not age static remote mac entries virtio_net: fix PAGE_SIZE > 64k vfio/spapr: fail tce_iommu_attach_group() when iommu_data is null drm/amdgpu: check ring being ready before using net: dsa: Check return value of phy_connect_direct() amd-xgbe: Check xgbe_init() return code platform/x86: ideapad-laptop: handle ACPI event 1 scsi: virtio_scsi: Reject commands when virtqueue is broken xen-netfront: Fix Rx stall during network stress and OOM swiotlb-xen: update dev_addr after swapping pages virtio_console: fix a crash in config_work_handler Btrfs: fix truncate down when no_holes feature is enabled gianfar: Do not reuse pages from emergency reserve powerpc/eeh: Enable IO path on permanent error net: bgmac: Remove superflous netif_carrier_on() net: bgmac: Start transmit queue in bgmac_open net: bgmac: Fix SOF bit checking bgmac: Fix reversed test of build_skb() return value. mtd: bcm47xxpart: don't fail because of bit-flips bgmac: fix a missing check for build_skb mtd: bcm47xxpart: limit scanned flash area on BCM47XX (MIPS) only MIPS: ralink: fix MT7628 wled_an pinmux gpio MIPS: ralink: fix MT7628 pinmux typos MIPS: ralink: Fix invalid assignment of SoC type MIPS: ralink: fix USB frequency scaling MIPS: ralink: MT7688 pinmux fixes net: korina: Fix NAPI versus resources freeing MIPS: ath79: fix regression in PCI window initialization net: mvneta: Fix for_each_present_cpu usage ARM: dts: BCM5301X: Correct GIC_PPI interrupt flags qla2xxx: Fix erroneous invalid handle message scsi: lpfc: Set elsiocb contexts to NULL after freeing it scsi: sd: Fix wrong DPOFUA disable in sd_read_cache_type KVM: x86: fix fixing of hypercalls mm: numa: avoid waiting on freed migrated pages block: fix module reference leak on put_disk() call for cgroups throttle sysctl: enable strict writes usb: gadget: f_fs: Fix possibe deadlock drm/vmwgfx: Free hash table allocated by cmdbuf managed res mgr ALSA: hda - set input_path bitmap to zero after moving it to new place ALSA: hda - Fix endless loop of codec configure MIPS: Fix IRQ tracing & lockdep when rescheduling MIPS: pm-cps: Drop manual cache-line alignment of ready_count MIPS: Avoid accidental raw backtrace mm, swap_cgroup: reschedule when neeed in swap_cgroup_swapoff() drm/ast: Handle configuration without P2A bridge NFSv4: fix a reference leak caused WARNING messages netfilter: synproxy: fix conntrackd interaction netfilter: xt_TCPMSS: add more sanity tests on tcph->doff rtnetlink: add IFLA_GROUP to ifla_policy ipv6: Do not leak throw route references sfc: provide dummy definitions of vswitch functions net: 8021q: Fix one possible panic caused by BUG_ON in free_netdev decnet: always not take dst->__refcnt when inserting dst into hash table net/mlx5: Wait for FW readiness before initializing command interface ipv6: fix calling in6_ifa_hold incorrectly for dad work igmp: add a missing spin_lock_init() igmp: acquire pmc lock for ip_mc_clear_src() net: caif: Fix a sleep-in-atomic bug in cfpkt_create_pfx Fix an intermittent pr_emerg warning about lo becoming free. af_unix: Add sockaddr length checks before accessing sa_family in bind and connect handlers net: Zero ifla_vf_info in rtnl_fill_vfinfo() decnet: dn_rtmsg: Improve input length sanitization in dnrmg_receive_user_skb net: don't call strlen on non-terminated string in dev_set_alias() ipv6: release dst on error in ip6_dst_lookup_tail UPSTREAM: selinux: enable genfscon labeling for tracefs Change-Id: I05ae1d6271769a99ea3817e5066f5ab6511f3254 Signed-off-by: Blagovest Kolenichev <bkolenichev@codeaurora.org>
2017-07-07Revert "sched: Remove synchronize rcu/sched calls from _cpu_down"Mohammed Khajapasha
This reverts commit 36131fdc87c8 ("sched: Remove synchronize rcu/sched calls from _cpu_down"). Removing the synchronization of rcu/sched calls from _cpu_down introduces a race where tasks may get queued on an inactive CPU and unthrottling cfs_rqs. Change-Id: Ie29f8d185eb55979f9ca4e6e1b767caba6dd7f27 Signed-off-by: Mohammed Khajapasha <mkhaja@codeaurora.org>
2017-07-06Merge "genirq: Don't allow user space to set IRQ affinity to isolated CPUs"Linux Build Service Account
2017-07-06Merge android-4.4@8c91412 (v4.4.75) into msm-4.4Blagovest Kolenichev
* refs/heads/tmp-8c91412: Linux 4.4.75 nvme: apply DELAY_BEFORE_CHK_RDY quirk at probe time too nvme/quirk: Add a delay before checking for adapter readiness net: phy: fix marvell phy status reading net: phy: Initialize mdio clock at probe function usb: gadget: f_fs: avoid out of bounds access on comp_desc powerpc/slb: Force a full SLB flush when we insert for a bad EA mtd: spi-nor: fix spansion quad enable of: Add check to of_scan_flat_dt() before accessing initial_boot_params rxrpc: Fix several cases where a padded len isn't checked in ticket decode USB: usbip: fix nonconforming hub descriptor drm/amdgpu: adjust default display clock drm/amdgpu/atom: fix ps allocation size for EnableDispPowerGating drm/radeon: add a quirk for Toshiba Satellite L20-183 drm/radeon: add a PX quirk for another K53TK variant iscsi-target: Reject immediate data underflow larger than SCSI transfer length target: Fix kref->refcount underflow in transport_cmd_finish_abort time: Fix clock->read(clock) race around clocksource changes Input: i8042 - add Fujitsu Lifebook AH544 to notimeout list powerpc/kprobes: Pause function_graph tracing during jprobes handling signal: Only reschedule timers on signals timers have sent HID: Add quirk for Dell PIXART OEM mouse CIFS: Improve readdir verbosity KVM: PPC: Book3S HV: Preserve userspace HTM state properly lib/cmdline.c: fix get_options() overflow while parsing ranges autofs: sanity check status reported with AUTOFS_DEV_IOCTL_FAIL fs/exec.c: account for argv/envp pointers UPSTREAM: drivers/perf: arm-pmu: fix RCU usage on pmu resume from low-power UPSTREAM: drivers/perf: arm_pmu: implement CPU_PM notifier ANDROID: squashfs: Fix endianness issue ANDROID: squashfs: Fix signed division issue Change-Id: Iabe0921dd7b9a582f5237235338ef0f730de7edb Signed-off-by: Blagovest Kolenichev <bkolenichev@codeaurora.org>
2017-07-05Merge 4.4.76 into android-4.4Greg Kroah-Hartman
Changes in 4.4.76 ipv6: release dst on error in ip6_dst_lookup_tail net: don't call strlen on non-terminated string in dev_set_alias() decnet: dn_rtmsg: Improve input length sanitization in dnrmg_receive_user_skb net: Zero ifla_vf_info in rtnl_fill_vfinfo() af_unix: Add sockaddr length checks before accessing sa_family in bind and connect handlers Fix an intermittent pr_emerg warning about lo becoming free. net: caif: Fix a sleep-in-atomic bug in cfpkt_create_pfx igmp: acquire pmc lock for ip_mc_clear_src() igmp: add a missing spin_lock_init() ipv6: fix calling in6_ifa_hold incorrectly for dad work net/mlx5: Wait for FW readiness before initializing command interface decnet: always not take dst->__refcnt when inserting dst into hash table net: 8021q: Fix one possible panic caused by BUG_ON in free_netdev sfc: provide dummy definitions of vswitch functions ipv6: Do not leak throw route references rtnetlink: add IFLA_GROUP to ifla_policy netfilter: xt_TCPMSS: add more sanity tests on tcph->doff netfilter: synproxy: fix conntrackd interaction NFSv4: fix a reference leak caused WARNING messages drm/ast: Handle configuration without P2A bridge mm, swap_cgroup: reschedule when neeed in swap_cgroup_swapoff() MIPS: Avoid accidental raw backtrace MIPS: pm-cps: Drop manual cache-line alignment of ready_count MIPS: Fix IRQ tracing & lockdep when rescheduling ALSA: hda - Fix endless loop of codec configure ALSA: hda - set input_path bitmap to zero after moving it to new place drm/vmwgfx: Free hash table allocated by cmdbuf managed res mgr usb: gadget: f_fs: Fix possibe deadlock sysctl: enable strict writes block: fix module reference leak on put_disk() call for cgroups throttle mm: numa: avoid waiting on freed migrated pages KVM: x86: fix fixing of hypercalls scsi: sd: Fix wrong DPOFUA disable in sd_read_cache_type scsi: lpfc: Set elsiocb contexts to NULL after freeing it qla2xxx: Fix erroneous invalid handle message ARM: dts: BCM5301X: Correct GIC_PPI interrupt flags net: mvneta: Fix for_each_present_cpu usage MIPS: ath79: fix regression in PCI window initialization net: korina: Fix NAPI versus resources freeing MIPS: ralink: MT7688 pinmux fixes MIPS: ralink: fix USB frequency scaling MIPS: ralink: Fix invalid assignment of SoC type MIPS: ralink: fix MT7628 pinmux typos MIPS: ralink: fix MT7628 wled_an pinmux gpio mtd: bcm47xxpart: limit scanned flash area on BCM47XX (MIPS) only bgmac: fix a missing check for build_skb mtd: bcm47xxpart: don't fail because of bit-flips bgmac: Fix reversed test of build_skb() return value. net: bgmac: Fix SOF bit checking net: bgmac: Start transmit queue in bgmac_open net: bgmac: Remove superflous netif_carrier_on() powerpc/eeh: Enable IO path on permanent error gianfar: Do not reuse pages from emergency reserve Btrfs: fix truncate down when no_holes feature is enabled virtio_console: fix a crash in config_work_handler swiotlb-xen: update dev_addr after swapping pages xen-netfront: Fix Rx stall during network stress and OOM scsi: virtio_scsi: Reject commands when virtqueue is broken platform/x86: ideapad-laptop: handle ACPI event 1 amd-xgbe: Check xgbe_init() return code net: dsa: Check return value of phy_connect_direct() drm/amdgpu: check ring being ready before using vfio/spapr: fail tce_iommu_attach_group() when iommu_data is null virtio_net: fix PAGE_SIZE > 64k vxlan: do not age static remote mac entries ibmveth: Add a proper check for the availability of the checksum features kernel/panic.c: add missing \n HID: i2c-hid: Add sleep between POWER ON and RESET scsi: lpfc: avoid double free of resource identifiers spi: davinci: use dma_mapping_error() mac80211: initialize SMPS field in HT capabilities x86/mpx: Use compatible types in comparison to fix sparse error coredump: Ensure proper size of sparse core files swiotlb: ensure that page-sized mappings are page-aligned s390/ctl_reg: make __ctl_load a full memory barrier be2net: fix status check in be_cmd_pmac_add() perf probe: Fix to show correct locations for events on modules net/mlx4_core: Eliminate warning messages for SRQ_LIMIT under SRIOV sctp: check af before verify address in sctp_addr_id2transport ravb: Fix use-after-free on `ifconfig eth0 down` jump label: fix passing kbuild_cflags when checking for asm goto support xfrm: fix stack access out of bounds with CONFIG_XFRM_SUB_POLICY xfrm: NULL dereference on allocation failure xfrm: Oops on error in pfkey_msg2xfrm_state() watchdog: bcm281xx: Fix use of uninitialized spinlock. sched/loadavg: Avoid loadavg spikes caused by delayed NO_HZ accounting ARM64/ACPI: Fix BAD_MADT_GICC_ENTRY() macro implementation ARM: 8685/1: ensure memblock-limit is pmd-aligned x86/mpx: Correctly report do_mpx_bt_fault() failures to user-space x86/mm: Fix flush_tlb_page() on Xen ocfs2: o2hb: revert hb threshold to keep compatible iommu/vt-d: Don't over-free page table directories iommu: Handle default domain attach failure iommu/amd: Fix incorrect error handling in amd_iommu_bind_pasid() cpufreq: s3c2416: double free on driver init error path KVM: x86: fix emulation of RSM and IRET instructions KVM: x86/vPMU: fix undefined shift in intel_pmu_refresh() KVM: x86: zero base3 of unusable segments KVM: nVMX: Fix exception injection Linux 4.4.76 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2017-07-05sched/loadavg: Avoid loadavg spikes caused by delayed NO_HZ accountingMatt Fleming
commit 6e5f32f7a43f45ee55c401c0b9585eb01f9629a8 upstream. If we crossed a sample window while in NO_HZ we will add LOAD_FREQ to the pending sample window time on exit, setting the next update not one window into the future, but two. This situation on exiting NO_HZ is described by: this_rq->calc_load_update < jiffies < calc_load_update In this scenario, what we should be doing is: this_rq->calc_load_update = calc_load_update [ next window ] But what we actually do is: this_rq->calc_load_update = calc_load_update + LOAD_FREQ [ next+1 window ] This has the effect of delaying load average updates for potentially up to ~9seconds. This can result in huge spikes in the load average values due to per-cpu uninterruptible task counts being out of sync when accumulated across all CPUs. It's safe to update the per-cpu active count if we wake between sample windows because any load that we left in 'calc_load_idle' will have been zero'd when the idle load was folded in calc_global_load(). This issue is easy to reproduce before, commit 9d89c257dfb9 ("sched/fair: Rewrite runnable load and utilization average tracking") just by forking short-lived process pipelines built from ps(1) and grep(1) in a loop. I'm unable to reproduce the spikes after that commit, but the bug still seems to be present from code review. Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Mike Galbraith <umgwanakikbuti@gmail.com> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vincent Guittot <vincent.guittot@linaro.org> Fixes: commit 5167e8d ("sched/nohz: Rewrite and fix load-avg computation -- again") Link: http://lkml.kernel.org/r/20170217120731.11868-2-matt@codeblueprint.co.uk Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05kernel/panic.c: add missing \nJiri Slaby
[ Upstream commit ff7a28a074ccbea999dadbb58c46212cf90984c6 ] When a system panics, the "Rebooting in X seconds.." message is never printed because it lacks a new line. Fix it. Link: http://lkml.kernel.org/r/20170119114751.2724-1-jslaby@suse.cz Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05sysctl: enable strict writesKees Cook
commit 41662f5cc55335807d39404371cfcbb1909304c4 upstream. SYSCTL_WRITES_WARN was added in commit f4aacea2f5d1 ("sysctl: allow for strict write position handling"), and released in v3.16 in August of 2014. Since then I can find only 1 instance of non-zero offset writing[1], and it was fixed immediately in CRIU[2]. As such, it appears safe to flip this to the strict state now. [1] https://www.google.com/search?q="when%20file%20position%20was%20not%200" [2] http://lists.openvz.org/pipermail/criu/2015-April/019819.html Signed-off-by: Kees Cook <keescook@chromium.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Sumit Semwal <sumit.semwal@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-04Merge "genirq: honour default IRQ affinity setting during migration"Linux Build Service Account
2017-07-03Merge "osq_lock: fix osq_lock queue corruption"Linux Build Service Account
2017-07-03Merge "Merge branch 'android-4.4@77ddb50' (v4.4.74) into 'msm-4.4'"Linux Build Service Account
2017-07-03Merge "osq_lock: avoid live-lock issue for RT task"Linux Build Service Account
2017-07-03Merge "cpu-hotplug: Keep atleast 1 online and un-isolated CPU"Linux Build Service Account
2017-07-03Merge "cgroup: Fix potential race between cgroup_exit and migrate path"Linux Build Service Account
2017-07-02osq_lock: fix osq_lock queue corruptionPrateek Sood
Fix ordering of link creation between node->prev and prev->next in osq_lock(). A case in which the status of optimistic spin queue is CPU6->CPU2 in which CPU6 has acquired the lock. At this point if CPU0 comes in to acquire osq_lock, it will update the tail count. After tail count update if CPU2 starts to unqueue itself from optimistic spin queue, it will find updated tail count with CPU0 and update CPU2 node->next to NULL in osq_wait_next(). If reordering of following stores happen then prev->next where prev being CPU2 would be updated to point to CPU0 node: node->prev = prev; WRITE_ONCE(prev->next, node); At this point if next instruction WRITE_ONCE(next->prev, prev); in CPU2 path is committed before the update of CPU0 node->prev = prev then CPU0 node->prev will point to CPU6 node. At this point if CPU0 path's node->prev = prev is committed resulting in change of CPU0 prev back to CPU2 node. CPU2 node->next is NULL currently, so if CPU0 gets into unqueue path of osq_lock it will keep spinning in infinite loop as condition prev->next == node will never be true. Change-Id: I48d847096daf3c228de90ae1cd2a6415b7bde65a Signed-off-by: Prateek Sood <prsood@codeaurora.org>
2017-06-30osq_lock: avoid live-lock issue for RT taskPrateek Sood
Live Lock due to task spinning while unqueue of CPU osq_node from optimistic_spin_queue. Task T1 had decremented mutex count to acquire the lock on CPU0. Before setting owner it got preempted. On CPU1 task T2 acquired osq_lock and started spinning on owner of mutex with preemption disabled. CPU1 runq has one task, so need_resched will not be set. On CPU0 task T3 tried to acquire osq_lock to spin on the same mutex. At this time following scenario causes soft lockup: After preemption of task T1, RT task T3 tried to acquire the same mutex. It will start spinning on the osq_lock until the lock is available or need_resched is set. For RT task, need_resched will not be set. Task T3 will not be able to bail out of the infinite loop. Change-Id: Ifd7506047119a22e14b15459ac6b04b410ba1c84 Signed-off-by: Prateek Sood <prsood@codeaurora.org>
2017-06-30genirq: Don't allow user space to set IRQ affinity to isolated CPUsPavankumar Kondeti
The PM_QOS_CPU_DMA_LATENCY QOS request attached to an IRQ is ignored if the IRQ is affined to an isolated CPU. As isolated CPUs enter deep sleep state, it is better not to affine IRQs to those CPUs. Change-Id: Ieab4a04eca222b91159208b21bc9e14390ecd62e Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>