summaryrefslogtreecommitdiff
path: root/arch/x86
AgeCommit message (Collapse)Author
2013-01-31Merge branch 'x86-efi-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 EFI fixes from Peter Anvin: "This is a collection of fixes for the EFI support. The controversial bit here is a set of patches which bumps the boot protocol version as part of fixing some serious problems with the EFI handover protocol, used when booting under EFI using a bootloader as opposed to directly from EFI. These changes should also make it a lot saner to support cross-mode 32/64-bit EFI booting in the future. Getting these changes into 3.8 means we avoid presenting an inconsistent ABI to bootloaders. Other changes are display detection and fixing efivarfs." * 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, efi: remove attribute check from setup_efi_pci x86, build: Dynamically find entry points in compressed startup code x86, efi: Fix PCI ROM handing in EFI boot stub, in 32-bit mode x86, efi: Fix 32-bit EFI handover protocol entry point x86, efi: Fix display detection in EFI boot stub x86, boot: Define the 2.12 bzImage boot protocol x86/boot: Fix minor fd leakage in tools/relocs.c x86, efi: Set runtime_version to the EFI spec revision x86, efi: fix 32-bit warnings in setup_efi_pci() efivarfs: Delete dentry from dcache in efivarfs_file_write() efivarfs: Never return ENOENT from firmware efi, x86: Pass a proper identity mapping in efi_call_phys_prelog efivarfs: Drop link count of the right inode
2013-01-31Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Peter Anvin: "This is a collection of miscellaneous fixes, the most important one is the fix for the Samsung laptop bricking issue (auto-blacklisting the samsung-laptop driver); the efi_enabled() changes you see below are prerequisites for that fix. The other issues fixed are booting on OLPC XO-1.5, an UV fix, NMI debugging, and requiring CAP_SYS_RAWIO for MSR references, just as with I/O port references." * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: samsung-laptop: Disable on EFI hardware efi: Make 'efi_enabled' a function to query EFI facilities smp: Fix SMP function call empty cpu mask race x86/msr: Add capabilities check x86/dma-debug: Bump PREALLOC_DMA_DEBUG_ENTRIES x86/olpc: Fix olpc-xo1-sci.c build errors arch/x86/platform/uv: Fix incorrect tlb flush all issue x86-64: Fix unwind annotations in recent NMI changes x86-32: Start out cr0 clean, disable paging before modifying cr3/4
2013-01-30Merge tag 'efi-for-3.8' into x86/efiH. Peter Anvin
Various urgent EFI fixes and some warning cleanups for v3.8 * EFI boot stub fix for Macbook Pro's from Maarten Lankhorst * Fix an oops in efivarfs from Lingzhu Xiang * 32-bit warning cleanups from Jan Beulich * Patch to Boot on >512GB RAM systems from Nathan Zimmer * Set efi.runtime_version correctly * efivarfs updates Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-01-30efi: Make 'efi_enabled' a function to query EFI facilitiesMatt Fleming
Originally 'efi_enabled' indicated whether a kernel was booted from EFI firmware. Over time its semantics have changed, and it now indicates whether or not we are booted on an EFI machine with bit-native firmware, e.g. 64-bit kernel with 64-bit firmware. The immediate motivation for this patch is the bug report at, https://bugs.launchpad.net/ubuntu-cdimage/+bug/1040557 which details how running a platform driver on an EFI machine that is designed to run under BIOS can cause the machine to become bricked. Also, the following report, https://bugzilla.kernel.org/show_bug.cgi?id=47121 details how running said driver can also cause Machine Check Exceptions. Drivers need a new means of detecting whether they're running on an EFI machine, as sadly the expression, if (!efi_enabled) hasn't been a sufficient condition for quite some time. Users actually want to query 'efi_enabled' for different reasons - what they really want access to is the list of available EFI facilities. For instance, the x86 reboot code needs to know whether it can invoke the ResetSystem() function provided by the EFI runtime services, while the ACPI OSL code wants to know whether the EFI config tables were mapped successfully. There are also checks in some of the platform driver code to simply see if they're running on an EFI machine (which would make it a bad idea to do BIOS-y things). This patch is a prereq for the samsung-laptop fix patch. Cc: David Airlie <airlied@linux.ie> Cc: Corentin Chary <corentincj@iksaif.net> Cc: Matthew Garrett <mjg59@srcf.ucam.org> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Olof Johansson <olof@lixom.net> Cc: Peter Jones <pjones@redhat.com> Cc: Colin Ian King <colin.king@canonical.com> Cc: Steve Langasek <steve.langasek@canonical.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Konrad Rzeszutek Wilk <konrad@kernel.org> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: <stable@vger.kernel.org> Signed-off-by: Matt Fleming <matt.fleming@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-01-29x86, efi: remove attribute check from setup_efi_pciMaarten Lankhorst
It looks like the original commit that copied the rom contents from efi always copied the rom, and the fixup in setup_efi_pci from commit 886d751a2ea99a160 ("x86, efi: correct precedence of operators in setup_efi_pci") broke that. This resulted in macbook pro's no longer finding the rom images, and thus not being able to use the radeon card any more. The solution is to just remove the check for now, and always copy the rom if available. Reported-by: Vitaly Budovski <vbudovski+news@gmail.com> Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: Seth Forshee <seth.forshee@canonical.com> Acked-by: Matthew Garrett <mjg59@srcf.ucam.org> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com> Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2013-01-27x86, build: Dynamically find entry points in compressed startup codeDavid Woodhouse
We have historically hard-coded entry points in head.S just so it's easy to build the executable/bzImage headers with references to them. Unfortunately, this leads to boot loaders abusing these "known" addresses even when they are *explicitly* told that they "should look at the ELF header to find this address, as it may change in the future". And even when the address in question *has* actually been changed in the past, without fanfare or thought to compatibility. Thus we have bootloaders doing stunningly broken things like jumping to offset 0x200 in the kernel startup code in 64-bit mode, *hoping* that startup_64 is still there (it has moved at least once before). And hoping that it's actually a 64-bit kernel despite the fact that we don't give them any indication of that fact. This patch should hopefully remove the temptation to abuse internal addresses in future, where sternly worded comments have not sufficed. Instead of having hard-coded addresses and saying "please don't abuse these", we actually pull the addresses out of the ELF payload into zoffset.h, and make build.c shove them back into the right places in the bzImage header. Rather than including zoffset.h into build.c and thus having to rebuild the tool for every kernel build, we parse it instead. The parsing code is small and simple. This patch doesn't actually move any of the interesting entry points, so any offending bootloader will still continue to "work" after this patch is applied. For some version of "work" which includes jumping into the compressed payload and crashing, if the bzImage it's given is a 32-bit kernel. No change there then. [ hpa: some of the issues in the description are addressed or retconned by the 2.12 boot protocol. This patch has been edited to only remove fixed addresses that were *not* thus retconned. ] Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Link: http://lkml.kernel.org/r/1358513837.2397.247.camel@shinybook.infradead.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Matt Fleming <matt.fleming@intel.com>
2013-01-27x86, efi: Fix PCI ROM handing in EFI boot stub, in 32-bit modeDavid Woodhouse
The 'Attributes' argument to pci->Attributes() function is 64-bit. So when invoking in 32-bit mode it takes two registers, not just one. This fixes memory corruption when booting via the 32-bit EFI boot stub. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Cc: <stable@kernel.org> Link: http://lkml.kernel.org/r/1358513837.2397.247.camel@shinybook.infradead.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Matt Fleming <matt.fleming@intel.com>
2013-01-27x86, efi: Fix 32-bit EFI handover protocol entry pointDavid Woodhouse
If the bootloader calls the EFI handover entry point as a standard function call, then it'll have a return address on the stack. We need to pop that before calling efi_main(), or the arguments will all be out of position on the stack. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Cc: <stable@kernel.org> Link: http://lkml.kernel.org/r/1358513837.2397.247.camel@shinybook.infradead.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Matt Fleming <matt.fleming@intel.com>
2013-01-27x86, efi: Fix display detection in EFI boot stubDavid Woodhouse
When booting under OVMF we have precisely one GOP device, and it implements the ConOut protocol. We break out of the loop when we look at it... and then promptly abort because 'first_gop' never gets set. We should set first_gop *before* breaking out of the loop. Yes, it doesn't really mean "first" any more, but that doesn't matter. It's only a flag to indicate that a suitable GOP was found. In fact, we'd do just as well to initialise 'width' to zero in this function, then just check *that* instead of first_gop. But I'll do the minimal fix for now (and for stable@). Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Cc: <stable@kernel.org> Link: http://lkml.kernel.org/r/1358513837.2397.247.camel@shinybook.infradead.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Matt Fleming <matt.fleming@intel.com>
2013-01-27x86, boot: Define the 2.12 bzImage boot protocolH. Peter Anvin
Define the 2.12 bzImage boot protocol: add xloadflags and additional fields to allow the command line, initramfs and struct boot_params to live above the 4 GiB mark. The xloadflags now communicates if this is a 64-bit kernel with the legacy 64-bit entry point and which of the EFI handover entry points are supported. Avoid adding new read flags to loadflags because of claimed bootloaders testing the whole byte for == 1 to determine bzImageness at least until the issue can be researched further. This is based on patches by Yinghai Lu and David Woodhouse. Originally-by: Yinghai Lu <yinghai@kernel.org> Originally-by: David Woodhouse <dwmw2@infradead.org> Acked-by: Yinghai Lu <yinghai@kernel.org> Acked-by: David Woodhouse <dwmw2@infradead.org> Acked-by: Matt Fleming <matt.fleming@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Link: http://lkml.kernel.org/r/1359058816-7615-26-git-send-email-yinghai@kernel.org Cc: Rob Landley <rob@landley.net> Cc: Gokul Caushik <caushik1@gmail.com> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Joe Millenbach <jmillenbach@gmail.com>
2013-01-27x86/boot: Fix minor fd leakage in tools/relocs.cCong Ding
The opened file should be closed. Signed-off-by: Cong Ding <dinggnu@gmail.com> Cc: Kusanagi Kouichi <slash@ac.auone-net.jp> Cc: Jarkko Sakkinen <jarkko.sakkinen@intel.com> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Matt Fleming <matt.fleming@intel.com> Link: http://lkml.kernel.org/r/1358183628-27784-1-git-send-email-dinggnu@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-01-25x86, efi: Set runtime_version to the EFI spec revisionMatt Fleming
efi.runtime_version is erroneously being set to the value of the vendor's firmware revision instead of that of the implemented EFI specification. We can't deduce which EFI functions are available based on the revision of the vendor's firmware since the version scheme is likely to be unique to each vendor. What we really need to know is the revision of the implemented EFI specification, which is available in the EFI System Table header. Cc: Seiji Aguchi <seiji.aguchi@hds.com> Cc: Matthew Garrett <mjg59@srcf.ucam.org> Cc: stable@vger.kernel.org # 3.7.x Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2013-01-25x86, efi: fix 32-bit warnings in setup_efi_pci()Jan Beulich
Fix four similar build warnings on 32-bit (casts between different size pointers and integers). Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Stefan Hasko <hasko.stevo@gmail.com> Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2013-01-24x86/msr: Add capabilities checkAlan Cox
At the moment the MSR driver only relies upon file system checks. This means that anything as root with any capability set can write to MSRs. Historically that wasn't very interesting but on modern processors the MSRs are such that writing to them provides several ways to execute arbitary code in kernel space. Sample code and documentation on doing this is circulating and MSR attacks are used on Windows 64bit rootkits already. In the Linux case you still need to be able to open the device file so the impact is fairly limited and reduces the security of some capability and security model based systems down towards that of a generic "root owns the box" setup. Therefore they should require CAP_SYS_RAWIO to prevent an elevation of capabilities. The impact of this is fairly minimal on most setups because they don't have heavy use of capabilities. Those using SELinux, SMACK or AppArmor rules might want to consider if their rulesets on the MSR driver could be tighter. Signed-off-by: Alan Cox <alan@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Horses <stable@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-01-24x86/dma-debug: Bump PREALLOC_DMA_DEBUG_ENTRIESMaarten Lankhorst
I ran out of free entries when I had CONFIG_DMA_API_DEBUG enabled. Some other archs seem to default to 65536, so increase this limit for x86 too. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Link: http://lkml.kernel.org/r/50A612AA.7040206@canonical.com Signed-off-by: Ingo Molnar <mingo@kernel.org> ----
2013-01-24x86/olpc: Fix olpc-xo1-sci.c build errorsRandy Dunlap
Fix build errors when CONFIG_INPUT=m. This is not pretty, but all of the OLPC kconfig options are bool instead of tristate. arch/x86/built-in.o: In function `send_lid_state': olpc-xo1-sci.c:(.text+0x1d323): undefined reference to `input_event' olpc-xo1-sci.c:(.text+0x1d338): undefined reference to `input_event' ... In the long run, fixing this driver kconfig to be tristate instead of bool would be a very good change. Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Cc: Andres Salomon <dilinger@queued.net> Cc: Chris Ball <cjb@laptop.org> Cc: Jon Nettleton <jon.nettleton@gmail.com> Cc: Daniel Drake <dsd@laptop.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-01-24arch/x86/platform/uv: Fix incorrect tlb flush all issueAlex Shi
The flush tlb optimization code has logical issue on UV platform. It doesn't flush the full range at all, since it simply ignores its 'end' parameter (and hence also the "all" indicator) in uv_flush_tlb_others() function. Cliff's notes: | I tested the patch on a UV. It has the effect of either | clearing 1 or all TLBs in a cpu. I added some debugging to | test for the cases when clearing all TLBs is overkill, and in | practice it happens very seldom. Reported-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Alex Shi <alex.shi@intel.com> Signed-off-by: Cliff Wickman <cpw@sgi.com> Tested-by: Cliff Wickman <cpw@sgi.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-01-24x86-64: Fix unwind annotations in recent NMI changesJan Beulich
While in one case a plain annotation is necessary, in the other case the stack adjustment can simply be folded into the immediately preceding RESTORE_ALL, thus getting the correct annotation for free. Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Alexander van Heukelum <heukelum@mailshack.com> Link: http://lkml.kernel.org/r/51010C9302000078000B9045@nat28.tlf.novell.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-01-22Merge tag 'perf-urgent-for-mingo' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull perf/urgent fixes from Arnaldo Carvalho de Melo: . revert 20b279 - require exclude_guest to use PEBS - kernel side, now older binaries will continue working for things like cycles:pp without needing to pass extra modifiers, from David Ahern. . Fix building from 'make perf-*-src-pkg' tarballs, broken by UAPI, from Sebastian Andrzej Siewior [ Pulling directly, Ingo would normally pull but has been unresponsive ] * tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: perf tools: Fix building from 'make perf-*-src-pkg' tarballs perf x86: revert 20b279 - require exclude_guest to use PEBS - kernel side
2013-01-22ptrace: ensure arch_ptrace/ptrace_request can never race with SIGKILLOleg Nesterov
putreg() assumes that the tracee is not running and pt_regs_access() can safely play with its stack. However a killed tracee can return from ptrace_stop() to the low-level asm code and do RESTORE_REST, this means that debugger can actually read/modify the kernel stack until the tracee does SAVE_REST again. set_task_blockstep() can race with SIGKILL too and in some sense this race is even worse, the very fact the tracee can be woken up breaks the logic. As Linus suggested we can clear TASK_WAKEKILL around the arch_ptrace() call, this ensures that nobody can ever wakeup the tracee while the debugger looks at it. Not only this fixes the mentioned problems, we can do some cleanups/simplifications in arch_ptrace() paths. Probably ptrace_unfreeze_traced() needs more callers, for example it makes sense to make the tracee killable for oom-killer before access_process_vm(). While at it, add the comment into may_ptrace_stop() to explain why ptrace_stop() still can't rely on SIGKILL and signal_pending_state(). Reported-by: Salman Qazi <sqazi@google.com> Reported-by: Suleiman Souhlal <suleiman@google.com> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-01-19x86-32: Start out cr0 clean, disable paging before modifying cr3/4H. Peter Anvin
Patch 5a5a51db78e x86-32: Start out eflags and cr4 clean ... made x86-32 match x86-64 in that we initialize %eflags and %cr4 from scratch. This broke OLPC XO-1.5, because the XO enters the kernel with paging enabled, which the kernel doesn't expect. Since we no longer support 386 (the source of most of the variability in %cr0 configuration), we can simply match further x86-64 and initialize %cr0 to a fixed value -- the one variable part remaining in %cr0 is for FPU control, but all that is handled later on in initialization; in particular, configuring %cr0 as if the FPU is present until proven otherwise is correct and necessary for the probe to work. To deal with the XO case sanely, explicitly disable paging in %cr0 before we muck with %cr3, %cr4 or EFER -- those operations are inherently unsafe with paging enabled. NOTE: There is still a lot of 386-related junk in head_32.S which we can and should get rid of, however, this is intended as a minimal fix whereas the cleanup can be deferred to the next merge window. Reported-by: Andres Salomon <dilinger@queued.net> Tested-by: Daniel Drake <dsd@laptop.org> Link: http://lkml.kernel.org/r/50FA0661.2060400@linux.intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-01-18Merge tag 'stable/for-linus-3.8-rc3-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen Pull Xen fixes from Konrad Rzeszutek Wilk: - CVE-2013-0190/XSA-40 (or stack corruption for 32-bit PV kernels) - Fix racy vma access spotted by Al Viro - Fix mmap batch ioctl potentially resulting in large O(n) page allcations. - Fix vcpu online/offline BUG:scheduling while atomic.. - Fix unbound buffer scanning for more than 32 vCPUs. - Fix grant table being incorrectly initialized - Fix incorrect check in pciback - Allow privcmd in backend domains. Fix up whitespace conflict due to ugly merge resolution in Xen tree in arch/arm/xen/enlighten.c * tag 'stable/for-linus-3.8-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen: Fix stack corruption in xen_failsafe_callback for 32bit PVOPS guests. Revert "xen/smp: Fix CPU online/offline bug triggering a BUG: scheduling while atomic." xen/gntdev: remove erronous use of copy_to_user xen/gntdev: correctly unmap unlinked maps in mmu notifier xen/gntdev: fix unsafe vma access xen/privcmd: Fix mmap batch ioctl. Xen: properly bound buffer access when parsing cpu/*/availability xen/grant-table: correctly initialize grant table version 1 x86/xen : Fix the wrong check in pciback xen/privcmd: Relax access control in privcmd_ioctl_mmap
2013-01-18efi, x86: Pass a proper identity mapping in efi_call_phys_prelogNathan Zimmer
Update efi_call_phys_prelog to install an identity mapping of all available memory. This corrects a bug on very large systems with more then 512 GB in which bios would not be able to access addresses above not in the mapping. The result is a crash that looks much like this. BUG: unable to handle kernel paging request at 000000effd870020 IP: [<0000000078bce331>] 0x78bce330 PGD 0 Oops: 0000 [#1] SMP Modules linked in: CPU 0 Pid: 0, comm: swapper/0 Tainted: G W 3.8.0-rc1-next-20121224-medusa_ntz+ #2 Intel Corp. Stoutland Platform RIP: 0010:[<0000000078bce331>] [<0000000078bce331>] 0x78bce330 RSP: 0000:ffffffff81601d28 EFLAGS: 00010006 RAX: 0000000078b80e18 RBX: 0000000000000004 RCX: 0000000000000004 RDX: 0000000078bcf958 RSI: 0000000000002400 RDI: 8000000000000000 RBP: 0000000078bcf760 R08: 000000effd870000 R09: 0000000000000000 R10: 0000000000000000 R11: 00000000000000c3 R12: 0000000000000030 R13: 000000effd870000 R14: 0000000000000000 R15: ffff88effd870000 FS: 0000000000000000(0000) GS:ffff88effe400000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000effd870020 CR3: 000000000160c000 CR4: 00000000000006b0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process swapper/0 (pid: 0, threadinfo ffffffff81600000, task ffffffff81614400) Stack: 0000000078b80d18 0000000000000004 0000000078bced7b ffff880078b81fff 0000000000000000 0000000000000082 0000000078bce3a8 0000000000002400 0000000060000202 0000000078b80da0 0000000078bce45d ffffffff8107cb5a Call Trace: [<ffffffff8107cb5a>] ? on_each_cpu+0x77/0x83 [<ffffffff8102f4eb>] ? change_page_attr_set_clr+0x32f/0x3ed [<ffffffff81035946>] ? efi_call4+0x46/0x80 [<ffffffff816c5abb>] ? efi_enter_virtual_mode+0x1f5/0x305 [<ffffffff816aeb24>] ? start_kernel+0x34a/0x3d2 [<ffffffff816ae5ed>] ? repair_env_string+0x60/0x60 [<ffffffff816ae2be>] ? x86_64_start_reservations+0xba/0xc1 [<ffffffff816ae120>] ? early_idt_handlers+0x120/0x120 [<ffffffff816ae419>] ? x86_64_start_kernel+0x154/0x163 Code: Bad RIP value. RIP [<0000000078bce331>] 0x78bce330 RSP <ffffffff81601d28> CR2: 000000effd870020 ---[ end trace ead828934fef5eab ]--- Cc: stable@vger.kernel.org Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Nathan Zimmer <nzimmer@sgi.com> Signed-off-by: Robin Holt <holt@sgi.com> Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2013-01-16xen: Fix stack corruption in xen_failsafe_callback for 32bit PVOPS guests.Andrew Cooper
This fixes CVE-2013-0190 / XSA-40 There has been an error on the xen_failsafe_callback path for failed iret, which causes the stack pointer to be wrong when entering the iret_exc error path. This can result in the kernel crashing. In the classic kernel case, the relevant code looked a little like: popl %eax # Error code from hypervisor jz 5f addl $16,%esp jmp iret_exc # Hypervisor said iret fault 5: addl $16,%esp # Hypervisor said segment selector fault Here, there are two identical addls on either option of a branch which appears to have been optimised by hoisting it above the jz, and converting it to an lea, which leaves the flags register unaffected. In the PVOPS case, the code looks like: popl_cfi %eax # Error from the hypervisor lea 16(%esp),%esp # Add $16 before choosing fault path CFI_ADJUST_CFA_OFFSET -16 jz 5f addl $16,%esp # Incorrectly adjust %esp again jmp iret_exc It is possible unprivileged userspace applications to cause this behaviour, for example by loading an LDT code selector, then changing the code selector to be not-present. At this point, there is a race condition where it is possible for the hypervisor to return back to userspace from an interrupt, fault on its own iret, and inject a failsafe_callback into the kernel. This bug has been present since the introduction of Xen PVOPS support in commit 5ead97c84 (xen: Core Xen implementation), in 2.6.23. Signed-off-by: Frediano Ziglio <frediano.ziglio@citrix.com> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Cc: stable@vger.kernel.org Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-01-16Merge branch 'x86/urgent' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Peter Anvin: "This is mainly a workaround for a bug in Sandy Bridge graphics which causes corruption of certain memory pages." * 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/Sandy Bridge: Sandy Bridge workaround depends on CONFIG_PCI x86/Sandy Bridge: mark arrays in __init functions as __initconst x86/Sandy Bridge: reserve pages when integrated graphics is present x86, efi: correct precedence of operators in setup_efi_pci
2013-01-15Revert "xen/smp: Fix CPU online/offline bug triggering a BUG: scheduling ↵Konrad Rzeszutek Wilk
while atomic." This reverts commit 41bd956de3dfdc3a43708fe2e0c8096c69064a1e. The fix is incorrect and not appropiate for the latest kernels. In fact it _causes_ the BUG: scheduling while atomic while doing vCPU hotplug. Suggested-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-01-15Merge tag 'v3.7' into stable/for-linus-3.8Konrad Rzeszutek Wilk
Linux 3.7 * tag 'v3.7': (833 commits) Linux 3.7 Input: matrix-keymap - provide proper module license Revert "revert "Revert "mm: remove __GFP_NO_KSWAPD""" and associated damage ipv4: ip_check_defrag must not modify skb before unsharing Revert "mm: avoid waking kswapd for THP allocations when compaction is deferred or contended" inet_diag: validate port comparison byte code to prevent unsafe reads inet_diag: avoid unsafe and nonsensical prefix matches in inet_diag_bc_run() inet_diag: validate byte code to prevent oops in inet_diag_bc_run() inet_diag: fix oops for IPv4 AF_INET6 TCP SYN-RECV state mm: vmscan: fix inappropriate zone congestion clearing vfs: fix O_DIRECT read past end of block device net: gro: fix possible panic in skb_gro_receive() tcp: bug fix Fast Open client retransmission tmpfs: fix shared mempolicy leak mm: vmscan: do not keep kswapd looping forever due to individual uncompactable zones mm: compaction: validate pfn range passed to isolate_freepages_block mmc: sh-mmcif: avoid oops on spurious interrupts (second try) Revert misapplied "mmc: sh-mmcif: avoid oops on spurious interrupts" mmc: sdhci-s3c: fix missing clock for gpio card-detect lib/Makefile: Fix oid_registry build dependency ... Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Conflicts: arch/arm/xen/enlighten.c drivers/xen/Makefile [We need to have the v3.7 base as the 'for-3.8' was based off v3.7-rc3 and there are some patches in v3.7-rc6 that we to have in our branch]
2013-01-13x86/Sandy Bridge: Sandy Bridge workaround depends on CONFIG_PCIH. Peter Anvin
early_pci_allowed() and read_pci_config_16() are only available if CONFIG_PCI is defined. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
2013-01-13x86/Sandy Bridge: mark arrays in __init functions as __initconstH. Peter Anvin
Mark static arrays as __initconst so they get removed when the init sections are flushed. Reported-by: Mathias Krause <minipli@googlemail.com> Link: http://lkml.kernel.org/r/75F4BEE6-CB0E-4426-B40B-697451677738@googlemail.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-01-11x86/Sandy Bridge: reserve pages when integrated graphics is presentJesse Barnes
SNB graphics devices have a bug that prevent them from accessing certain memory ranges, namely anything below 1M and in the pages listed in the table. So reserve those at boot if set detect a SNB gfx device on the CPU to avoid GPU hangs. Stephane Marchesin had a similar patch to the page allocator awhile back, but rather than reserving pages up front, it leaked them at allocation time. [ hpa: made a number of stylistic changes, marked arrays as static const, and made less verbose; use "memblock=debug" for full verbosity. ] Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-01-10Merge git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM bugfixes from Marcelo Tosatti. * git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: use dynamic percpu allocations for shared msrs area KVM: PPC: Book3S HV: Fix compilation without CONFIG_PPC_POWERNV powerpc: Corrected include header path in kvm_para.h Add rcu user eqs exception hooks for async page fault
2013-01-10perf x86: revert 20b279 - require exclude_guest to use PEBS - kernel sideDavid Ahern
This patch is brought to you by the letter 'H'. Commit 20b279 breaks compatiblity with older perf binaries when run with precise modifier (:p or :pp) by requiring the exclude_guest attribute to be set. Older binaries default exclude_guest to 0 (ie., wanting guest-based samples) unless host only profiling is requested (:H modifier). The workaround for older binaries is to add H to the modifier list (e.g., -e cycles:ppH - toggles exclude_guest to 1). This was deemed unacceptable by Linus: https://lkml.org/lkml/2012/12/12/570 Between family in town and the fresh snow in Breckenridge there is no time left to be working on the proper fix for this over the holidays. In the New Year I have more pressing problems to resolve -- like some memory leaks in perf which are proving to be elusive -- although the aforementioned snow is probably why they are proving to be elusive. Either way I do not have any spare time to work on this and from the time I have managed to spend on it the solution is more difficult than just moving to a new exclude_guest flag (does not work) or flipping the logic to include_guest (which is not as trivial as one would think). So, two options: silently force exclude_guest on as suggested by Gleb which means no impact to older perf binaries or revert the original patch which caused the breakage. This patch does the latter -- reverts the original patch that introduced the regression. The problem can be revisited in the future as time allows. Signed-off-by: David Ahern <dsahern@gmail.com> Cc: Avi Kivity <avi@redhat.com> Cc: Gleb Natapov <gleb@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert Richter <robert.richter@amd.com> Link: http://lkml.kernel.org/r/1356749767-17322-1-git-send-email-dsahern@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-01-08KVM: x86: use dynamic percpu allocations for shared msrs areaMarcelo Tosatti
Use dynamic percpu allocations for the shared msrs structure, to avoid using the limited reserved percpu space. Reviewed-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2013-01-03X86: drivers: remove __dev* attributes.Greg Kroah-Hartman
CONFIG_HOTPLUG is going away as an option. As a result, the __dev* markings need to be removed. This change removes the use of __devinit, __devexit_p, __devinitconst, and __devexit from these drivers. Based on patches originally written by Bill Pemberton, but redone by me in order to handle some of the coding style issues better, by hand. Cc: Bill Pemberton <wfp5p@virginia.edu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Daniel Drake <dsd@laptop.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-12-26PCI: Work around Stratus ftServer broken PCIe hierarchy (fix DMI check)Myron Stowe
Commit 284f5f9 was intended to disable the "only_one_child()" optimization on Stratus ftServer systems, but its DMI check is wrong. It looks for DMI_SYS_VENDOR that contains "ftServer", when it should look for DMI_SYS_VENDOR containing "Stratus" and DMI_PRODUCT_NAME containing "ftServer". Tested on Stratus ftServer 6400. Reported-by: Fadeeva Marina <astarta@rat.ru> Reference: https://bugzilla.kernel.org/show_bug.cgi?id=51331 Signed-off-by: Myron Stowe <myron.stowe@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> CC: stable@vger.kernel.org # v3.5+
2012-12-20Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal Pull signal handling cleanups from Al Viro: "sigaltstack infrastructure + conversion for x86, alpha and um, COMPAT_SYSCALL_DEFINE infrastructure. Note that there are several conflicts between "unify SS_ONSTACK/SS_DISABLE definitions" and UAPI patches in mainline; resolution is trivial - just remove definitions of SS_ONSTACK and SS_DISABLED from arch/*/uapi/asm/signal.h; they are all identical and include/uapi/linux/signal.h contains the unified variant." Fixed up conflicts as per Al. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal: alpha: switch to generic sigaltstack new helpers: __save_altstack/__compat_save_altstack, switch x86 and um to those generic compat_sys_sigaltstack() introduce generic sys_sigaltstack(), switch x86 and um to it new helper: compat_user_stack_pointer() new helper: restore_altstack() unify SS_ONSTACK/SS_DISABLE definitions new helper: current_user_stack_pointer() missing user_stack_pointer() instances Bury the conditionals from kernel_thread/kernel_execve series COMPAT_SYSCALL_DEFINE: infrastructure
2012-12-20x86, efi: correct precedence of operators in setup_efi_pciSasha Levin
With the current code, the condition in the if() doesn't make much sense due to precedence of operators. Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Link: http://lkml.kernel.org/r/1356030701-16284-25-git-send-email-sasha.levin@oracle.com Cc: Matt Fleming <matt.fleming@intel.com> Cc: Matthew Garrett <mjg@redhat.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-12-20Merge tag 'iommu-updates-v3.8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU updates from Joerg Roedel: "A few new features this merge-window. The most important one is probably, that dma-debug now warns if a dma-handle is not checked with dma_mapping_error by the device driver. This requires minor changes to some architectures which make use of dma-debug. Most of these changes have the respective Acks by the Arch-Maintainers. Besides that there are updates to the AMD IOMMU driver for refactor the IOMMU-Groups support and to make sure it does not trigger a hardware erratum. The OMAP changes (for which I pulled in a branch from Tony Lindgren's tree) have a conflict in linux-next with the arm-soc tree. The conflict is in the file arch/arm/mach-omap2/clock44xx_data.c which is deleted in the arm-soc tree. It is safe to delete the file too so solve the conflict. Similar changes are done in the arm-soc tree in the common clock framework migration. A missing hunk from the patch in the IOMMU tree will be submitted as a seperate patch when the merge-window is closed." * tag 'iommu-updates-v3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (29 commits) ARM: dma-mapping: support debug_dma_mapping_error ARM: OMAP4: hwmod data: ipu and dsp to use parent clocks instead of leaf clocks iommu/omap: Adapt to runtime pm iommu/omap: Migrate to hwmod framework iommu/omap: Keep mmu enabled when requested iommu/omap: Remove redundant clock handling on ISR iommu/amd: Remove obsolete comment iommu/amd: Don't use 512GB pages iommu/tegra: smmu: Move bus_set_iommu after probe for multi arch iommu/tegra: gart: Move bus_set_iommu after probe for multi arch iommu/tegra: smmu: Remove unnecessary PTC/TLB flush all tile: dma_debug: add debug_dma_mapping_error support sh: dma_debug: add debug_dma_mapping_error support powerpc: dma_debug: add debug_dma_mapping_error support mips: dma_debug: add debug_dma_mapping_error support microblaze: dma-mapping: support debug_dma_mapping_error ia64: dma_debug: add debug_dma_mapping_error support c6x: dma_debug: add debug_dma_mapping_error support ARM64: dma_debug: add debug_dma_mapping_error support intel-iommu: Prevent devices with RMRRs from being placed into SI Domain ...
2012-12-19new helpers: __save_altstack/__compat_save_altstack, switch x86 and um to thoseAl Viro
note that they are relying on access_ok() already checked by caller. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-19generic compat_sys_sigaltstack()Al Viro
Again, conditional on CONFIG_GENERIC_SIGALTSTACK Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-19introduce generic sys_sigaltstack(), switch x86 and um to itAl Viro
Conditional on CONFIG_GENERIC_SIGALTSTACK; architectures that do not select it are completely unaffected Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-19new helper: compat_user_stack_pointer()Al Viro
Compat counterpart of current_user_stack_pointer(); for most of the biarch architectures those two are identical, but e.g. arm64 and arm use different registers for stack pointer... Note that amd64 variants of current_user_stack_pointer/compat_user_stack_pointer do *not* rely on pt_regs having been through FIXUP_TOP_OF_STACK. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-19unify SS_ONSTACK/SS_DISABLE definitionsAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-19missing user_stack_pointer() instancesAl Viro
for the architectures that have usp in pt_regs and do not have user_stack_pointer() already defined. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-19Bury the conditionals from kernel_thread/kernel_execve seriesAl Viro
All architectures have CONFIG_GENERIC_KERNEL_THREAD CONFIG_GENERIC_KERNEL_EXECVE __ARCH_WANT_SYS_EXECVE None of them have __ARCH_WANT_KERNEL_EXECVE and there are only two callers of kernel_execve() (which is a trivial wrapper for do_execve() now) left. Kill the conditionals and make both callers use do_execve(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-19Merge branch 'x86/nuke386' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull one final 386 removal patch from Peter Anvin. IRQ 13 FPU error handling is gone. That was not one of the proudest moments in PC history. * 'x86/nuke386' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, 386 removal: Remove support for IRQ 13 FPU error reporting
2012-12-19Merge tag 'modules-next-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull module update from Rusty Russell: "Nothing all that exciting; a new module-from-fd syscall for those who want to verify the source of the module (ChromeOS) and/or use standard IMA on it or other security hooks." * tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: MODSIGN: Fix kbuild output when using default extra_certificates MODSIGN: Avoid using .incbin in C source modules: don't hand 0 to vmalloc. module: Remove a extra null character at the top of module->strtab. ASN.1: Use the ASN1_LONG_TAG and ASN1_INDEFINITE_LENGTH constants ASN.1: Define indefinite length marker constant moduleparam: use __UNIQUE_ID() __UNIQUE_ID() MODSIGN: Add modules_sign make target powerpc: add finit_module syscall. ima: support new kernel module syscall add finit_module syscall to asm-generic ARM: add finit_module syscall to ARM security: introduce kernel_module_from_file hook module: add flags arg to sys_finit_module() module: add syscall to load module from fd
2012-12-18Merge branch 'akpm' (more patches from Andrew)Linus Torvalds
Merge patches from Andrew Morton: "Most of the rest of MM, plus a few dribs and drabs. I still have quite a few irritating patches left around: ones with dubious testing results, lack of review, ones which should have gone via maintainer trees but the maintainers are slack, etc. I need to be more activist in getting these things wrapped up outside the merge window, but they're such a PITA." * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (48 commits) mm/vmscan.c: avoid possible deadlock caused by too_many_isolated() vmscan: comment too_many_isolated() mm/kmemleak.c: remove obsolete simple_strtoul mm/memory_hotplug.c: improve comments mm/hugetlb: create hugetlb cgroup file in hugetlb_init mm/mprotect.c: coding-style cleanups Documentation: ABI: /sys/devices/system/node/ slub: drop mutex before deleting sysfs entry memcg: add comments clarifying aspects of cache attribute propagation kmem: add slab-specific documentation about the kmem controller slub: slub-specific propagation changes slab: propagate tunable values memcg: aggregate memcg cache values in slabinfo memcg/sl[au]b: shrink dead caches memcg/sl[au]b: track all the memcg children of a kmem_cache memcg: destroy memcg caches sl[au]b: allocate objects from memcg cache sl[au]b: always get the cache from its page in kmem_cache_free() memcg: skip memcg kmem allocations in specified code regions memcg: infrastructure to match an allocation to the right cache ...
2012-12-18arch/x86/platform/iris/iris.c: register a platform device and a platform driverShérab
This makes the iris driver use the platform API, so it is properly exposed in /sys. [akpm@linux-foundation.org: remove commented-out code, add missing space to printk, clean up code layout] Signed-off-by: Shérab <Sebastien.Hinderer@ens-lyon.org> Cc: Len Brown <lenb@kernel.org> Cc: Matthew Garrett <mjg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-18Merge branch 'release' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux Pull powertool update from Len Brown: "This updates the tree w/ the latest version of turbostat, which reports temperature and - on SNB and later - Watts." Fix up semantic merge conflict as per Len. * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: tools: Allow tools to be installed in a user specified location tools/power: turbostat: make Makefile a bit more capable tools/power x86_energy_perf_policy: close /proc/stat in for_every_cpu() tools/power turbostat: v3.0: monitor Watts and Temperature tools/power turbostat: fix output buffering issue tools/power turbostat: prevent infinite loop on migration error path x86 power: define RAPL MSRs tools/power/x86/turbostat: share kernel MSR #defines