summaryrefslogtreecommitdiff
path: root/arch/powerpc
AgeCommit message (Collapse)Author
2014-02-11powerpc/powernv: Add iommu DMA bypass support for IODA2Benjamin Herrenschmidt
This patch adds the support for to create a direct iommu "bypass" window on IODA2 bridges (such as Power8) allowing to bypass iommu page translation completely for 64-bit DMA capable devices, thus significantly improving DMA performances. Additionally, this adds a hook to the struct iommu_table so that the IOMMU API / VFIO can disable the bypass when external ownership is requested, since in that case, the device will be used by an environment such as userspace or a KVM guest which must not be allowed to bypass translations. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-02-11powerpc: Fix endian issues in kexec and crash dump codeAnton Blanchard
We expose a number of OF properties in the kexec and crash dump code and these need to be big endian. Cc: stable@vger.kernel.org # v3.13 Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-02-11powerpc/ppc32: Fix the bug in the init of non-base exception stack for UPKevin Hao
We would allocate one specific exception stack for each kind of non-base exceptions for every CPU. For ppc32 the CPU hard ID is used as the subscript to get the specific exception stack for one CPU. But for an UP kernel, there is only one element in the each kind of exception stack array. We would get stuck if the CPU hard ID is not equal to '0'. So in this case we should use the subscript '0' no matter what the CPU hard ID is. Signed-off-by: Kevin Hao <haokexin@gmail.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-02-11powerpc/xmon: Don't signal we've entered until we're finished printingMichael Ellerman
Currently we set our cpu's bit in cpus_in_xmon, and then we take the output lock and print the exception information. This can race with the master cpu entering the command loop and printing the backtrace. The result is that the backtrace gets garbled with another cpu's exception print out. Fix it by delaying the set of cpus_in_xmon until we are finished printing. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-02-11powerpc/xmon: Fix timeout loop in get_output_lock()Michael Ellerman
As far as I can tell, our 70s era timeout loop in get_output_lock() is generating no code. This leads to the hostile takeover happening more or less simultaneously on all cpus. The result is "interesting", some example output that is more readable than most: cpu 0x1: Vector: 100 (Scypsut e0mx bR:e setV)e catto xc0p:u[ c 00 c0:0 000t0o0V0erc0td:o5 rfc28050000]0c00 0 0 0 6t(pSrycsV1ppuot uxe 1m 2 0Rx21e3:0s0ce000c00000t00)00 60602oV2SerucSayt0y 0p 1sxs Fix it by using udelay() in the timeout loop. The wait time and check frequency are arbitrary, but seem to work OK. We already rely on udelay() working so this is not a new dependency. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-02-11powerpc/xmon: Don't loop forever in get_output_lock()Michael Ellerman
If we enter with xmon_speaker != 0 we skip the first cmpxchg(), we also skip the while loop because xmon_speaker != last_speaker (0) - meaning we skip the second cmpxchg() also. Following that code path the compiler sees no memory barriers and so is within its rights to never reload xmon_speaker. The end result is we loop forever. This manifests as all cpus being in xmon ('c' command), but they refuse to take control when you switch to them ('c x' for cpu # x). I have seen this deadlock in practice and also checked the generated code to confirm this is what's happening. The simplest fix is just to always try the cmpxchg(). Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-02-11powerpc/perf: Configure BHRB filter before enabling PMU interruptsAnshuman Khandual
Right now the config_bhrb() PMU specific call happens after write_mmcr0(), which actually enables the PMU for event counting and interrupts. So there is a small window of time where the PMU and BHRB runs without the required HW branch filter (if any) enabled in BHRB. This can cause some of the branch samples to be collected through BHRB without any filter applied and hence affects the correctness of the results. This patch moves the BHRB config function call before enabling interrupts. Here are some data points captured via trace prints which depicts how we could get PMU interrupts with BHRB filter NOT enabled with a standard perf record command line (asking for branch record information as well). $ perf record -j any_call ls Before the patch:- ls-1962 [003] d... 2065.299590: .perf_event_interrupt: MMCRA: 40000000000 ls-1962 [003] d... 2065.299603: .perf_event_interrupt: MMCRA: 40000000000 ... All the PMU interrupts before this point did not have the requested HW branch filter enabled in the MMCRA. ls-1962 [003] d... 2065.299647: .perf_event_interrupt: MMCRA: 40040000000 ls-1962 [003] d... 2065.299662: .perf_event_interrupt: MMCRA: 40040000000 After the patch:- ls-1850 [008] d... 190.311828: .perf_event_interrupt: MMCRA: 40040000000 ls-1850 [008] d... 190.311848: .perf_event_interrupt: MMCRA: 40040000000 All the PMU interrupts have the requested HW BHRB branch filter enabled in MMCRA. Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com> [mpe: Fixed up whitespace and cleaned up changelog] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-02-11powerpc/pseries: Select ARCH_RANDOM on pseriesMichael Ellerman
We have a driver for the ARCH_RANDOM hook in rng.c, so we should select ARCH_RANDOM on pseries. Without this the build breaks if you turn ARCH_RANDOM off. This hasn't broken the build because pseries_defconfig doesn't specify a value for PPC_POWERNV, which is default y, and selects ARCH_RANDOM. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-02-11powerpc/perf: Add Power8 cache & TLB eventsMichael Ellerman
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-02-11powerpc/relocate fix relocate processing in LE modeLaurent Dufour
Relocation's code is not working in little endian mode because the r_info field, which is a 64 bits value, should be read from the right offset. The current code is optimized to read the r_info field as a 32 bits value starting at the middle of the double word (offset 12). When running in LE mode, the read value is not correct since only the MSB is read. This patch removes this optimization which consist to deal with a 32 bits value instead of a 64 bits one. This way it works in big and little endian mode. Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-02-11powerpc: Fix kdump hang issue on p8 with relocation on exception enabled.Mahesh Salgaonkar
On p8 systems, with relocation on exception feature enabled we are seeing kdump kernel hang at interrupt vector 0xc*4400. The reason is, with this feature enabled, exception are raised with MMU (IR=DR=1) ON with the default offset of 0xc*4000. Since exception is raised in virtual mode it requires the vector region to be executable without which it fails to fetch and execute instruction at 0xc*4xxx. For default kernel since kernel is loaded at real 0, the htab mappings sets the entire kernel text region executable. But for relocatable kernel (e.g. kdump case) we only copy interrupt vectors down to real 0 and never marked that region as executable because in p7 and below we always get exception in real mode. This patch fixes this issue by marking htab mapping range as executable that overlaps with the interrupt vector region for relocatable kernel. Thanks to Ben who helped me to debug this issue and find the root cause. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-02-11powerpc/pseries: Disable relocation on exception while going down during crash.Mahesh Salgaonkar
Disable relocation on exception while going down even in kdump case. This is because we are about clear htab mappings while kexec-ing into kdump kernel and we may run into issues if we still have AIL ON. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-02-11powerpc/eeh: Drop taken reference to driver on eeh_rmv_deviceThadeu Lima de Souza Cascardo
Commit f5c57710dd62dd06f176934a8b4b8accbf00f9f8 ("powerpc/eeh: Use partial hotplug for EEH unaware drivers") introduces eeh_rmv_device, which may grab a reference to a driver, but not release it. That prevents a driver from being removed after it has gone through EEH recovery. This patch drops the reference if it was taken. Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com> Acked-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-02-11powerpc: Fix build failure in sysdev/mpic.c for MPIC_WEIRD=yPaul Gortmaker
Commit 446f6d06fab0b49c61887ecbe8286d6aaa796637 ("powerpc/mpic: Properly set default triggers") breaks the mpc7447_hpc_defconfig as follows: CC arch/powerpc/sysdev/mpic.o arch/powerpc/sysdev/mpic.c: In function 'mpic_set_irq_type': arch/powerpc/sysdev/mpic.c:886:9: error: case label does not reduce to an integer constant arch/powerpc/sysdev/mpic.c:890:9: error: case label does not reduce to an integer constant arch/powerpc/sysdev/mpic.c:894:9: error: case label does not reduce to an integer constant arch/powerpc/sysdev/mpic.c:898:9: error: case label does not reduce to an integer constant Looking at the cpp output (gcc 4.7.3), I see: case mpic->hw_set[MPIC_IDX_VECPRI_SENSE_EDGE] | mpic->hw_set[MPIC_IDX_VECPRI_POLARITY_POSITIVE]: The pointer into an array appears because CONFIG_MPIC_WEIRD=y is set for this platform, thus enabling the following: ------------------- #ifdef CONFIG_MPIC_WEIRD static u32 mpic_infos[][MPIC_IDX_END] = { [0] = { /* Original OpenPIC compatible MPIC */ [...] #define MPIC_INFO(name) mpic->hw_set[MPIC_IDX_##name] #else /* CONFIG_MPIC_WEIRD */ #define MPIC_INFO(name) MPIC_##name #endif /* CONFIG_MPIC_WEIRD */ ------------------- Here we convert the case section to if/else if, and also add the equivalent of a default case to warn about unknown types. Boot tested on sbc8548, build tested on all defconfigs. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-01-31Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull more KVM updates from Paolo Bonzini: "Second batch of KVM updates. Some minor x86 fixes, two s390 guest features that need some handling in the host, and all the PPC changes. The PPC changes include support for little-endian guests and enablement for new POWER8 features" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (45 commits) x86, kvm: correctly access the KVM_CPUID_FEATURES leaf at 0x40000101 x86, kvm: cache the base of the KVM cpuid leaves kvm: x86: move KVM_CAP_HYPERV_TIME outside #ifdef KVM: PPC: Book3S PR: Cope with doorbell interrupts KVM: PPC: Book3S HV: Add software abort codes for transactional memory KVM: PPC: Book3S HV: Add new state for transactional memory powerpc/Kconfig: Make TM select VSX and VMX KVM: PPC: Book3S HV: Basic little-endian guest support KVM: PPC: Book3S HV: Add support for DABRX register on POWER7 KVM: PPC: Book3S HV: Prepare for host using hypervisor doorbells KVM: PPC: Book3S HV: Handle new LPCR bits on POWER8 KVM: PPC: Book3S HV: Handle guest using doorbells for IPIs KVM: PPC: Book3S HV: Consolidate code that checks reason for wake from nap KVM: PPC: Book3S HV: Implement architecture compatibility modes for POWER8 KVM: PPC: Book3S HV: Add handler for HV facility unavailable KVM: PPC: Book3S HV: Flush the correct number of TLB sets on POWER8 KVM: PPC: Book3S HV: Context-switch new POWER8 SPRs KVM: PPC: Book3S HV: Align physical and virtual CPU thread numbers KVM: PPC: Book3S HV: Don't set DABR on POWER8 kvm/ppc: IRQ disabling cleanup ...
2014-01-30Merge branch 'next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull more powerpc bits from Ben Herrenschmidt: "Here are a few more powerpc bits for this merge window. The bulk is made of two pull requests from Scott and Anatolij that I had missed previously (they arrived while I was away). Since both their branches are in -next independently, and the content has been around for a little while, they can still go in. The rest is mostly bug and regression fixes, a small series of cleanups to our pseries cpuidle code (including moving it to the right place), and one new cpuidle bakend for the powernv platform. I also wired up the new sched_attr syscalls" * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (37 commits) powerpc: Wire up sched_setattr and sched_getattr syscalls powerpc/hugetlb: Replace __get_cpu_var with get_cpu_var powerpc: Make sure "cache" directory is removed when offlining cpu powerpc/mm: Fix mmap errno when MAP_FIXED is set and mapping exceeds the allowed address space powerpc/powernv/cpuidle: Back-end cpuidle driver for powernv platform. powerpc/pseries/cpuidle: smt-snooze-delay cleanup. powerpc/pseries/cpuidle: Remove MAX_IDLE_STATE macro. powerpc/pseries/cpuidle: Make cpuidle-pseries backend driver a non-module. powerpc/pseries/cpuidle: Use cpuidle_register() for initialisation. powerpc/pseries/cpuidle: Move processor_idle.c to drivers/cpuidle. powerpc: Fix 32-bit frames for signals delivered when transactional powerpc/iommu: Fix initialisation of DART iommu table powerpc/numa: Fix decimal permissions powerpc/mm: Fix compile error of pgtable-ppc64.h powerpc: Fix hw breakpoints on !HAVE_HW_BREAKPOINT configurations clk: corenet: Adds the clock binding powerpc/booke64: Guard e6500 tlb handler with CONFIG_PPC_FSL_BOOK3E powerpc/512x: dts: add MPC5125 clock specs powerpc/512x: clk: support MPC5121/5123/5125 SoC variants powerpc/512x: clk: enforce even SDHC divider values ...
2014-01-30Merge branch 'for-3.14/core' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull core block IO changes from Jens Axboe: "The major piece in here is the immutable bio_ve series from Kent, the rest is fairly minor. It was supposed to go in last round, but various issues pushed it to this release instead. The pull request contains: - Various smaller blk-mq fixes from different folks. Nothing major here, just minor fixes and cleanups. - Fix for a memory leak in the error path in the block ioctl code from Christian Engelmayer. - Header export fix from CaiZhiyong. - Finally the immutable biovec changes from Kent Overstreet. This enables some nice future work on making arbitrarily sized bios possible, and splitting more efficient. Related fixes to immutable bio_vecs: - dm-cache immutable fixup from Mike Snitzer. - btrfs immutable fixup from Muthu Kumar. - bio-integrity fix from Nic Bellinger, which is also going to stable" * 'for-3.14/core' of git://git.kernel.dk/linux-block: (44 commits) xtensa: fixup simdisk driver to work with immutable bio_vecs block/blk-mq-cpu.c: use hotcpu_notifier() blk-mq: for_each_* macro correctness block: Fix memory leak in rw_copy_check_uvector() handling bio-integrity: Fix bio_integrity_verify segment start bug block: remove unrelated header files and export symbol blk-mq: uses page->list incorrectly blk-mq: use __smp_call_function_single directly btrfs: fix missing increment of bi_remaining Revert "block: Warn and free bio if bi_end_io is not set" block: Warn and free bio if bi_end_io is not set blk-mq: fix initializing request's start time block: blk-mq: don't export blk_mq_free_queue() block: blk-mq: make blk_sync_queue support mq block: blk-mq: support draining mq queue dm cache: increment bi_remaining when bi_end_io is restored block: fixup for generic bio chaining block: Really silence spurious compiler warnings block: Silence spurious compiler warnings block: Kill bio_pair_split() ...
2014-01-29Merge branch 'kvm-ppc-next' of git://github.com/agraf/linux-2.6 into kvm-queuePaolo Bonzini
Conflicts: arch/powerpc/kvm/book3s_hv_rmhandlers.S arch/powerpc/kvm/booke.c
2014-01-29powerpc: Wire up sched_setattr and sched_getattr syscallsBenjamin Herrenschmidt
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-01-29powerpc/hugetlb: Replace __get_cpu_var with get_cpu_varTiejun Chen
Replace __get_cpu_var safely with get_cpu_var to avoid the following call trace: [ 7253.637591] BUG: using smp_processor_id() in preemptible [00000000 00000000] code: hugemmap01/9048 [ 7253.637601] caller is free_hugepd_range.constprop.25+0x88/0x1a8 [ 7253.637605] CPU: 1 PID: 9048 Comm: hugemmap01 Not tainted 3.10.20-rt14+ #114 [ 7253.637606] Call Trace: [ 7253.637617] [cb049d80] [c0007ea4] show_stack+0x4c/0x168 (unreliable) [ 7253.637624] [cb049dc0] [c031c674] debug_smp_processor_id+0x114/0x134 [ 7253.637628] [cb049de0] [c0016d28] free_hugepd_range.constprop.25+0x88/0x1a8 [ 7253.637632] [cb049e00] [c001711c] hugetlb_free_pgd_range+0x6c/0x168 [ 7253.637639] [cb049e40] [c0117408] free_pgtables+0x12c/0x150 [ 7253.637646] [cb049e70] [c011ce38] unmap_region+0xa0/0x11c [ 7253.637671] [cb049ef0] [c011f03c] do_munmap+0x224/0x3bc [ 7253.637676] [cb049f20] [c011f2e0] vm_munmap+0x38/0x5c [ 7253.637682] [cb049f40] [c000ef88] ret_from_syscall+0x0/0x3c [ 7253.637686] --- Exception: c01 at 0xff16004 Signed-off-by: Tiejun Chen<tiejun.chen@windriver.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-01-29powerpc: Make sure "cache" directory is removed when offlining cpuPaul Mackerras
The code in remove_cache_dir() is supposed to remove the "cache" subdirectory from the sysfs directory for a CPU when that CPU is being offlined. It tries to do this by calling kobject_put() on the kobject for the subdirectory. However, the subdirectory only gets removed once the last reference goes away, and the reference being put here may well not be the last reference. That means that the "cache" subdirectory may still exist when the offlining operation has finished. If the same CPU subsequently gets onlined, the code tries to add a new "cache" subdirectory. If the old subdirectory has not yet been removed, we get a WARN_ON in the sysfs code, with stack trace, and an error message printed on the console. Further, we ultimately end up with an online cpu with no "cache" subdirectory. This fixes it by doing an explicit kobject_del() at the point where we want the subdirectory to go away. kobject_del() removes the sysfs directory even though the object still exists in memory. The object will get freed at some point in the future. A subsequent onlining operation can create a new sysfs directory, even if the old object still exists in memory, without causing any problems. Cc: stable@vger.kernel.org # v3.0+ Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-01-29powerpc/mm: Fix mmap errno when MAP_FIXED is set and mapping exceeds the ↵jmarchan@redhat.com
allowed address space According to Posix, if MAP_FIXED is specified mmap shall set ENOMEM if the requested mapping exceeds the allowed range for address space of the process. The generic code set it right, but the specific powerpc slice_get_unmapped_area() function currently returns -EINVAL in that case. This patch corrects it. Signed-off-by: Jerome Marchand <jmarchan@redhat.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-01-29powerpc/powernv/cpuidle: Back-end cpuidle driver for powernv platform.Deepthi Dharwar
Following patch ports the cpuidle framework for powernv platform and also implements a cpuidle back-end powernv idle driver calling on to power7_nap and snooze idle states. Signed-off-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-01-29powerpc/pseries/cpuidle: smt-snooze-delay cleanup.Deepthi Dharwar
smt-snooze-delay was designed to disable NAP state or delay the entry to the NAP state prior to adoption of cpuidle framework. This is per-cpu variable. With the coming of CPUIDLE framework, states can be disabled on per-cpu basis using the cpuidle/enable sysfs entry. Also, with the coming of cpuidle driver each state's target residency is per-driver unlike earlier which was per-device. Therefore, the per-cpu sysfs smt-snooze-delay which decides the target residency of the idle state on a particular cpu causes more confusion to the user as we cannot have different smt-snooze-delay (target residency) values for each cpu. In the current code, smt-snooze-delay functionality is completely broken. It makes sense to remove smt-snooze-delay from idle driver with the coming of cpuidle framework. However, sysfs files are retained as ppc64_util currently utilises it. Once we fix ppc64_util, propose to clean up the kernel code. Signed-off-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-01-29powerpc/pseries/cpuidle: Move processor_idle.c to drivers/cpuidle.Deepthi Dharwar
Move the file from arch specific pseries/processor_idle.c to drivers/cpuidle/cpuidle-pseries.c Make the relevant Makefile and Kconfig changes. Also, introduce Kconfig.powerpc in drivers/cpuidle for all powerpc cpuidle drivers. Signed-off-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-01-29powerpc: Fix 32-bit frames for signals delivered when transactionalPaul Mackerras
Commit d31626f70b61 ("powerpc: Don't corrupt transactional state when using FP/VMX in kernel") introduced a bug where the uc_link and uc_regs fields of the ucontext_t that is created to hold the transactional values of the registers in a 32-bit signal frame didn't get set correctly. The reason is that we now clear the MSR_TS bits in the MSR in save_tm_user_regs(), before the code that sets uc_link and uc_regs. To fix this, we move the setting of uc_link and uc_regs into the same if statement that selects whether to call save_tm_user_regs() or save_user_regs(). Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-01-29powerpc/iommu: Fix initialisation of DART iommu tableAlistair Popple
Commit d084775738b746648d4102337163a04534a02982 switched the generic powerpc iommu backend code to use the it_page_shift field to determine page size. Commit 3a553170d35d69bea3877bffa508489dfa6f133d should have initiliased this field for all platforms, however the DART iommu table code was not updated. This commit initialises the it_page_shift field to 4K for the DART iommu. Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-01-29powerpc/numa: Fix decimal permissionsJoe Perches
This should have been octal. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-01-29powerpc/mm: Fix compile error of pgtable-ppc64.hLi Zhong
It seems that forward declaration couldn't work well with typedef, use struct spinlock directly to avoiding following build errors: In file included from include/linux/spinlock.h:81, from include/linux/seqlock.h:35, from include/linux/time.h:5, from include/uapi/linux/timex.h:56, from include/linux/timex.h:56, from include/linux/sched.h:17, from arch/powerpc/kernel/asm-offsets.c:17: include/linux/spinlock_types.h:76: error: redefinition of typedef 'spinlock_t' /root/linux-next/arch/powerpc/include/asm/pgtable-ppc64.h:563: note: previous declaration of 'spinlock_t' was here Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-01-29powerpc: Fix hw breakpoints on !HAVE_HW_BREAKPOINT configurationsAndreas Schwab
This fixes a logic error that caused a failure to update the hw breakpoint registers when not using the hw-breakpoint interface. Signed-off-by: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-01-29Merge remote-tracking branch 'scott/next' into nextBenjamin Herrenschmidt
<< This contains a fix for a chroma_defconfig build break that was introduced by e6500 tablewalk support, and a device tree binding patch that missed the previous pull request due to some last-minute polishing. >>
2014-01-29Merge remote-tracking branch 'agust/next' into nextBenjamin Herrenschmidt
<< Switch mpc512x to the common clock framework and adapt mpc512x drivers to use the new clock driver. Old PPC_CLOCK code is removed entirely since there are no users any more. >>
2014-01-27Merge branch 'next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull powerpc updates from Ben Herrenschmidt: "So here's my next branch for powerpc. A bit late as I was on vacation last week. It's mostly the same stuff that was in next already, I just added two patches today which are the wiring up of lockref for powerpc, which for some reason fell through the cracks last time and is trivial. The highlights are, in addition to a bunch of bug fixes: - Reworked Machine Check handling on kernels running without a hypervisor (or acting as a hypervisor). Provides hooks to handle some errors in real mode such as TLB errors, handle SLB errors, etc... - Support for retrieving memory error information from the service processor on IBM servers running without a hypervisor and routing them to the memory poison infrastructure. - _PAGE_NUMA support on server processors - 32-bit BookE relocatable kernel support - FSL e6500 hardware tablewalk support - A bunch of new/revived board support - FSL e6500 deeper idle states and altivec powerdown support You'll notice a generic mm change here, it has been acked by the relevant authorities and is a pre-req for our _PAGE_NUMA support" * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (121 commits) powerpc: Implement arch_spin_is_locked() using arch_spin_value_unlocked() powerpc: Add support for the optimised lockref implementation powerpc/powernv: Call OPAL sync before kexec'ing powerpc/eeh: Escalate error on non-existing PE powerpc/eeh: Handle multiple EEH errors powerpc: Fix transactional FP/VMX/VSX unavailable handlers powerpc: Don't corrupt transactional state when using FP/VMX in kernel powerpc: Reclaim two unused thread_info flag bits powerpc: Fix races with irq_work Move precessing of MCE queued event out from syscall exit path. pseries/cpuidle: Remove redundant call to ppc64_runlatch_off() in cpu idle routines powerpc: Make add_system_ram_resources() __init powerpc: add SATA_MV to ppc64_defconfig powerpc/powernv: Increase candidate fw image size powerpc: Add debug checks to catch invalid cpu-to-node mappings powerpc: Fix the setup of CPU-to-Node mappings during CPU online powerpc/iommu: Don't detach device without IOMMU group powerpc/eeh: Hotplug improvement powerpc/eeh: Call opal_pci_reinit() on powernv for restoring config space powerpc/eeh: Add restore_config operation ...
2014-01-27Merge branch 'merge' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull powerpc mremap fix from Ben Herrenschmidt: "This is the patch that I had sent after -rc8 and which we decided to wait before merging. It's based on a different tree than my -next branch (it needs some pre-reqs that were in -rc4 or so while my -next is based on -rc1) so I left it as a separate branch for your to pull. It's identical to the request I did 2 or 3 weeks back. This fixes crashes in mremap with THP on powerpc. The fix however requires a small change in the generic code. It moves a condition into a helper we can override from the arch which is harmless, but it *also* slightly changes the order of the set_pmd and the withdraw & deposit, which should be fine according to Kirill (who wrote that code) but I agree -rc8 is a bit late... It was acked by Kirill and Andrew told me to just merge it via powerpc" * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: powerpc/thp: Fix crash on mremap
2014-01-28powerpc: Implement arch_spin_is_locked() using arch_spin_value_unlocked()Michael Ellerman
At a glance these are just the inverse of each other. The one subtlety is that arch_spin_value_unlocked() takes the lock by value, rather than as a pointer, which is important for the lockref code. On the other hand arch_spin_is_locked() doesn't really care, so implement it in terms of arch_spin_value_unlocked(). Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-01-28powerpc: Add support for the optimised lockref implementationMichael Ellerman
This commit adds the architecture support required to enable the optimised implementation of lockrefs. That's as simple as defining arch_spin_value_unlocked() and selecting the Kconfig option. We also define cmpxchg64_relaxed(), because the lockref code does not need the cmpxchg to have barrier semantics. Using Linus' test case[1] on one system I see a 4x improvement for the basic enablement, and a further 1.3x for cmpxchg64_relaxed(), for a total of 5.3x vs the baseline. On another system I see more like 2x improvement. [1]: http://marc.info/?l=linux-fsdevel&m=137782380714721&w=4 Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-01-27KVM: PPC: Book3S PR: Cope with doorbell interruptsPaul Mackerras
When the PR host is running on a POWER8 machine in POWER8 mode, it will use doorbell interrupts for IPIs. If one of them arrives while we are in the guest, we pop out of the guest with trap number 0xA00, which isn't handled by kvmppc_handle_exit_pr, leading to the following BUG_ON: [ 331.436215] exit_nr=0xa00 | pc=0x1d2c | msr=0x800000000000d032 [ 331.437522] ------------[ cut here ]------------ [ 331.438296] kernel BUG at arch/powerpc/kvm/book3s_pr.c:982! [ 331.439063] Oops: Exception in kernel mode, sig: 5 [#2] [ 331.439819] SMP NR_CPUS=1024 NUMA pSeries [ 331.440552] Modules linked in: tun nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6t_REJECT xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw virtio_net kvm binfmt_misc ibmvscsi scsi_transport_srp scsi_tgt virtio_blk [ 331.447614] CPU: 11 PID: 1296 Comm: qemu-system-ppc Tainted: G D 3.11.7-200.2.fc19.ppc64p7 #1 [ 331.448920] task: c0000003bdc8c000 ti: c0000003bd32c000 task.ti: c0000003bd32c000 [ 331.450088] NIP: d0000000025d6b9c LR: d0000000025d6b98 CTR: c0000000004cfdd0 [ 331.451042] REGS: c0000003bd32f420 TRAP: 0700 Tainted: G D (3.11.7-200.2.fc19.ppc64p7) [ 331.452331] MSR: 800000000282b032 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI> CR: 28004824 XER: 20000000 [ 331.454616] SOFTE: 1 [ 331.455106] CFAR: c000000000848bb8 [ 331.455726] GPR00: d0000000025d6b98 c0000003bd32f6a0 d0000000026017b8 0000000000000032 GPR04: c0000000018627f8 c000000001873208 320d0a3030303030 3030303030643033 GPR08: c000000000c490a8 0000000000000000 0000000000000000 0000000000000002 GPR12: 0000000028004822 c00000000fdc6300 0000000000000000 00000100076ec310 GPR16: 000000002ae343b8 00003ffffd397398 0000000000000000 0000000000000000 GPR20: 00000100076f16f4 00000100076ebe60 0000000000000008 ffffffffffffffff GPR24: 0000000000000000 0000008001041e60 0000000000000000 0000008001040ce8 GPR28: c0000003a2d80000 0000000000000a00 0000000000000001 c0000003a2681810 [ 331.466504] NIP [d0000000025d6b9c] .kvmppc_handle_exit_pr+0x75c/0xa80 [kvm] [ 331.466999] LR [d0000000025d6b98] .kvmppc_handle_exit_pr+0x758/0xa80 [kvm] [ 331.467517] Call Trace: [ 331.467909] [c0000003bd32f6a0] [d0000000025d6b98] .kvmppc_handle_exit_pr+0x758/0xa80 [kvm] (unreliable) [ 331.468553] [c0000003bd32f750] [d0000000025d98f0] kvm_start_lightweight+0xb4/0xc4 [kvm] [ 331.469189] [c0000003bd32f920] [d0000000025d7648] .kvmppc_vcpu_run_pr+0xd8/0x270 [kvm] [ 331.469838] [c0000003bd32f9c0] [d0000000025cf748] .kvmppc_vcpu_run+0xc8/0xf0 [kvm] [ 331.470790] [c0000003bd32fa50] [d0000000025cc19c] .kvm_arch_vcpu_ioctl_run+0x5c/0x1b0 [kvm] [ 331.471401] [c0000003bd32fae0] [d0000000025c4888] .kvm_vcpu_ioctl+0x478/0x730 [kvm] [ 331.472026] [c0000003bd32fc90] [c00000000026192c] .do_vfs_ioctl+0x4dc/0x7a0 [ 331.472561] [c0000003bd32fd80] [c000000000261cc4] .SyS_ioctl+0xd4/0xf0 [ 331.473095] [c0000003bd32fe30] [c000000000009ed8] syscall_exit+0x0/0x98 [ 331.473633] Instruction dump: [ 331.473766] 4bfff9b4 2b9d0800 419efc18 60000000 60420000 3d220000 e8bf11a0 e8df12a8 [ 331.474733] 7fa4eb78 e8698660 48015165 e8410028 <0fe00000> 813f00e4 3ba00000 39290001 [ 331.475386] ---[ end trace 49fc47d994c1f8f2 ]--- [ 331.479817] This fixes the problem by making kvmppc_handle_exit_pr() recognize the interrupt. We also need to jump to the doorbell interrupt handler in book3s_segment.S to handle the interrupt on the way out of the guest. Having done that, there's nothing further to be done in kvmppc_handle_exit_pr(). Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-01-27KVM: PPC: Book3S HV: Add software abort codes for transactional memoryMichael Neuling
This adds the software abort code defines for transactional memory (TM). These values are from PAPR. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-01-27KVM: PPC: Book3S HV: Add new state for transactional memoryMichael Neuling
Add new state for transactional memory (TM) to kvm_vcpu_arch. Also add asm-offset bits that are going to be required. This also moves the existing TFHAR, TFIAR and TEXASR SPRs into a CONFIG_PPC_TRANSACTIONAL_MEM section. This requires some code changes to ensure we still compile with CONFIG_PPC_TRANSACTIONAL_MEM=N. Much of the added the added #ifdefs are removed in a later patch when the bulk of the TM code is added. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Paul Mackerras <paulus@samba.org> [agraf: fix merge conflict] Signed-off-by: Alexander Graf <agraf@suse.de>
2014-01-27powerpc/Kconfig: Make TM select VSX and VMXMichael Neuling
There are no processors in existence that have TM but no VMX or VSX. So let's makes CONFIG_PPC_TRANSACTIONAL_MEM select both CONFIG_VSX and CONFIG_ALTIVEC. This makes the code a lot simpler by removing the need for a bunch of #ifdefs. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-01-27KVM: PPC: Book3S HV: Basic little-endian guest supportAnton Blanchard
We create a guest MSR from scratch when delivering exceptions in a few places. Instead of extracting LPCR[ILE] and inserting it into MSR_LE each time, we simply create a new variable intr_msr which contains the entire MSR to use. For a little-endian guest, userspace needs to set the ILE (interrupt little-endian) bit in the LPCR for each vcpu (or at least one vcpu in each virtual core). [paulus@samba.org - removed H_SET_MODE implementation from original version of the patch, and made kvmppc_set_lpcr update vcpu->arch.intr_msr.] Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-01-27KVM: PPC: Book3S HV: Add support for DABRX register on POWER7Paul Mackerras
The DABRX (DABR extension) register on POWER7 processors provides finer control over which accesses cause a data breakpoint interrupt. It contains 3 bits which indicate whether to enable accesses in user, kernel and hypervisor modes respectively to cause data breakpoint interrupts, plus one bit that enables both real mode and virtual mode accesses to cause interrupts. Currently, KVM sets DABRX to allow both kernel and user accesses to cause interrupts while in the guest. This adds support for the guest to specify other values for DABRX. PAPR defines a H_SET_XDABR hcall to allow the guest to set both DABR and DABRX with one call. This adds a real-mode implementation of H_SET_XDABR, which shares most of its code with the existing H_SET_DABR implementation. To support this, we add a per-vcpu field to store the DABRX value plus code to get and set it via the ONE_REG interface. For Linux guests to use this new hcall, userspace needs to add "hcall-xdabr" to the set of strings in the /chosen/hypertas-functions property in the device tree. If userspace does this and then migrates the guest to a host where the kernel doesn't include this patch, then userspace will need to implement H_SET_XDABR by writing the specified DABR value to the DABR using the ONE_REG interface. In that case, the old kernel will set DABRX to DABRX_USER | DABRX_KERNEL. That should still work correctly, at least for Linux guests, since Linux guests cope with getting data breakpoint interrupts in modes that weren't requested by just ignoring the interrupt, and Linux guests never set DABRX_BTI. The other thing this does is to make H_SET_DABR and H_SET_XDABR work on POWER8, which has the DAWR and DAWRX instead of DABR/X. Guests that know about POWER8 should use H_SET_MODE rather than H_SET_[X]DABR, but guests running in POWER7 compatibility mode will still use H_SET_[X]DABR. For them, this adds the logic to convert DABR/X values into DAWR/X values on POWER8. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-01-27KVM: PPC: Book3S HV: Prepare for host using hypervisor doorbellsPaul Mackerras
POWER8 has support for hypervisor doorbell interrupts. Though the kernel doesn't use them for IPIs on the powernv platform yet, it probably will in future, so this makes KVM cope gracefully if a hypervisor doorbell interrupt arrives while in a guest. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-01-27KVM: PPC: Book3S HV: Handle new LPCR bits on POWER8Paul Mackerras
POWER8 has a bit in the LPCR to enable or disable the PURR and SPURR registers to count when in the guest. Set this bit. POWER8 has a field in the LPCR called AIL (Alternate Interrupt Location) which is used to enable relocation-on interrupts. Allow userspace to set this field. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-01-27KVM: PPC: Book3S HV: Handle guest using doorbells for IPIsPaul Mackerras
* SRR1 wake reason field for system reset interrupt on wakeup from nap is now a 4-bit field on P8, compared to 3 bits on P7. * Set PECEDP in LPCR when napping because of H_CEDE so guest doorbells will wake us up. * Waking up from nap because of a guest doorbell interrupt is not a reason to exit the guest. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-01-27KVM: PPC: Book3S HV: Consolidate code that checks reason for wake from napPaul Mackerras
Currently in book3s_hv_rmhandlers.S we have three places where we have woken up from nap mode and we check the reason field in SRR1 to see what event woke us up. This consolidates them into a new function, kvmppc_check_wake_reason. It looks at the wake reason field in SRR1, and if it indicates that an external interrupt caused the wakeup, calls kvmppc_read_intr to check what sort of interrupt it was. This also consolidates the two places where we synthesize an external interrupt (0x500 vector) for the guest. Now, if the guest exit code finds that there was an external interrupt which has been handled (i.e. it was an IPI indicating that there is now an interrupt pending for the guest), it jumps to deliver_guest_interrupt, which is in the last part of the guest entry code, where we synthesize guest external and decrementer interrupts. That code has been streamlined a little and now clears LPCR[MER] when appropriate as well as setting it. The extra clearing of any pending IPI on a secondary, offline CPU thread before going back to nap mode has been removed. It is no longer necessary now that we have code to read and acknowledge IPIs in the guest exit path. This fixes a minor bug in the H_CEDE real-mode handling - previously, if we found that other threads were already exiting the guest when we were about to go to nap mode, we would branch to the cede wakeup path and end up looking in SRR1 for a wakeup reason. Now we branch to a point after we have checked the wakeup reason. This also fixes a minor bug in kvmppc_read_intr - previously it could return 0xff rather than 1, in the case where we find that a host IPI is pending after we have cleared the IPI. Now it returns 1. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-01-27KVM: PPC: Book3S HV: Implement architecture compatibility modes for POWER8Paul Mackerras
This allows us to select architecture 2.05 (POWER6) or 2.06 (POWER7) compatibility modes on a POWER8 processor. (Note that transactional memory is disabled for usermode if either or both of the PCR_TM_DIS and PCR_ARCH_206 bits are set.) Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-01-27KVM: PPC: Book3S HV: Add handler for HV facility unavailableMichael Ellerman
At present this should never happen, since the host kernel sets HFSCR to allow access to all facilities. It's better to be prepared to handle it cleanly if it does ever happen, though. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-01-27KVM: PPC: Book3S HV: Flush the correct number of TLB sets on POWER8Paul Mackerras
POWER8 has 512 sets in the TLB, compared to 128 for POWER7, so we need to do more tlbiel instructions when flushing the TLB on POWER8. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-01-27KVM: PPC: Book3S HV: Context-switch new POWER8 SPRsMichael Neuling
This adds fields to the struct kvm_vcpu_arch to store the new guest-accessible SPRs on POWER8, adds code to the get/set_one_reg functions to allow userspace to access this state, and adds code to the guest entry and exit to context-switch these SPRs between host and guest. Note that DPDES (Directed Privileged Doorbell Exception State) is shared between threads on a core; hence we store it in struct kvmppc_vcore and have the master thread save and restore it. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>