summaryrefslogtreecommitdiffstats
path: root/arch/x86 (follow)
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds2014-09-276-47/+61
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "This has: - EFI revert to fix a boot regression - early_ioremap() fix for boot failure - KASLR fix for possible boot failures - EFI fix for corrupted string printing - remove a misleading EFI bootup 'failed!' error message Unfortunately it's all rather close to the merge window" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/efi: Truncate 64-bit values when calling 32-bit OutputString() x86/efi: Delete misleading efi_printk() error message Revert "efi/x86: efistub: Move shared dependencies to <asm/efi.h>" x86/kaslr: Avoid the setup_data area when picking location x86 early_ioremap: Increase FIX_BTMAPS_SLOTS to 8
| * Merge tag 'efi-urgent' of ↵Ingo Molnar2014-09-254-44/+43
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mfleming/efi into x86/urgent Pull EFI fixes from Matt Fleming: * Revert the static library changes from the merge window since they're causing issues for Macbooks and Fedora + Grub2 (Matt Fleming) * Delete the misleading "setup_efi_pci() failed!" message which some people are seeing when booting EFI (Matt Fleming) * Fix printing strings from the 32-bit EFI boot stub by only passing 32-bit addresses to the firmware (Matt Fleming) Signed-off-by: Ingo Molnar <mingo@kernel.org>
| | * x86/efi: Truncate 64-bit values when calling 32-bit OutputString()Matt Fleming2014-09-241-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If we're executing the 32-bit efi_char16_printk() code path (i.e. running on top of 32-bit firmware) we know that efi_early->text_output will be a 32-bit value, even though ->text_output has type u64. Unfortunately, we currently pass ->text_output directly to efi_early->call() so for CONFIG_X86_32 the compiler will push a 64-bit value onto the stack, causing the other parameters to be misaligned. The way we handle this in the rest of the EFI boot stub is to pass pointers as arguments to efi_early->call(), which automatically do the right thing (pointers are 32-bit on CONFIG_X86_32, and we simply ignore the upper 32-bits of the argument register if running in 64-bit mode with 32-bit firmware). This fixes a corruption bug when printing strings from the 32-bit EFI boot stub. Link: https://bugzilla.kernel.org/show_bug.cgi?id=84241 Signed-off-by: Matt Fleming <matt.fleming@intel.com>
| | * x86/efi: Delete misleading efi_printk() error messageMatt Fleming2014-09-241-15/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A number of people are reporting seeing the "setup_efi_pci() failed!" error message in what used to be a quiet boot, https://bugzilla.kernel.org/show_bug.cgi?id=81891 The message isn't all that helpful because setup_efi_pci() can return a non-success error code for a variety of reasons, not all of them fatal. Let's drop the return code from setup_efi_pci*() altogether, since there's no way to process it in any meaningful way outside of the inner __setup_efi_pci*() functions. Reported-by: Darren Hart <dvhart@linux.intel.com> Reported-by: Josh Boyer <jwboyer@fedoraproject.org> Cc: Ulf Winkelvos <ulf@winkelvos.de> Cc: Andre Müller <andre.muller@web.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Matt Fleming <matt.fleming@intel.com>
| | * Revert "efi/x86: efistub: Move shared dependencies to <asm/efi.h>"Matt Fleming2014-09-234-27/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit f23cf8bd5c1f ("efi/x86: efistub: Move shared dependencies to <asm/efi.h>") as well as the x86 parts of commit f4f75ad5741f ("efi: efistub: Convert into static library"). The road leading to these two reverts is long and winding. The above two commits were merged during the v3.17 merge window and turned the common EFI boot stub code into a static library. This necessitated making some symbols global in the x86 boot stub which introduced new entries into the early boot GOT. The problem was that we weren't fixing up the newly created GOT entries before invoking the EFI boot stub, which sometimes resulted in hangs or resets. This failure was reported by Maarten on his Macbook pro. The proposed fix was commit 9cb0e394234d ("x86/efi: Fixup GOT in all boot code paths"). However, that caused issues for Linus when booting his Sony Vaio Pro 11. It was subsequently reverted in commit f3670394c29f. So that leaves us back with Maarten's Macbook pro not booting. At this stage in the release cycle the least risky option is to revert the x86 EFI boot stub to the pre-merge window code structure where we explicitly #include efi-stub-helper.c instead of linking with the static library. The arm64 code remains unaffected. We can take another swing at the x86 parts for v3.18. Conflicts: arch/x86/include/asm/efi.h Tested-by: Josh Boyer <jwboyer@fedoraproject.org> Tested-by: Maarten Lankhorst <maarten.lankhorst@canonical.com> Tested-by: Leif Lindholm <leif.lindholm@linaro.org> [arm64] Tested-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>, Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Matt Fleming <matt.fleming@intel.com>
| * | x86/kaslr: Avoid the setup_data area when picking locationKees Cook2014-09-191-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The KASLR location-choosing logic needs to avoid the setup_data list memory areas as well. Without this, it would be possible to have the ASLR position stomp on the memory, ultimately causing the boot to fail. Signed-off-by: Kees Cook <keescook@chromium.org> Tested-by: Baoquan He <bhe@redhat.com> Cc: stable@vger.kernel.org Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Cc: Pavel Machek <pavel@ucw.cz> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20140911161931.GA12001@www.outflux.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | Merge tag 'efi-urgent' of ↵Ingo Molnar2014-09-191-3/+3
| |\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mfleming/efi into x86/urgent Pull EFI fix from Matt Fleming: * Increase the number of early_ioremap() slots to fix a regression with earlyprintk=efi after recent changes to the ACPI code (Dave Young) Signed-off-by: Ingo Molnar <mingo@kernel.org>
| | * x86 early_ioremap: Increase FIX_BTMAPS_SLOTS to 8Dave Young2014-09-141-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 3.16 kernel boot fail with earlyprintk=efi, it keeps scrolling at the bottom line of screen. Bisected, the first bad commit is below: commit 86dfc6f339886559d80ee0d4bd20fe5ee90450f0 Author: Lv Zheng <lv.zheng@intel.com> Date: Fri Apr 4 12:38:57 2014 +0800 ACPICA: Tables: Fix table checksums verification before installation. I did some debugging by enabling both serial and efi earlyprintk, below is some debug dmesg, seems early_ioremap fails in scroll up function due to no free slot, see below dmesg output: WARNING: CPU: 0 PID: 0 at mm/early_ioremap.c:116 __early_ioremap+0x90/0x1c4() __early_ioremap(ed00c800, 00000c80) not found slot Modules linked in: CPU: 0 PID: 0 Comm: swapper Not tainted 3.17.0-rc1+ #204 Hardware name: Hewlett-Packard HP Z420 Workstation/1589, BIOS J61 v03.15 05/09/2013 Call Trace: dump_stack+0x4e/0x7a warn_slowpath_common+0x75/0x8e ? __early_ioremap+0x90/0x1c4 warn_slowpath_fmt+0x47/0x49 __early_ioremap+0x90/0x1c4 ? sprintf+0x46/0x48 early_ioremap+0x13/0x15 early_efi_map+0x24/0x26 early_efi_scroll_up+0x6d/0xc0 early_efi_write+0x1b0/0x214 call_console_drivers.constprop.21+0x73/0x7e console_unlock+0x151/0x3b2 ? vprintk_emit+0x49f/0x532 vprintk_emit+0x521/0x532 ? console_unlock+0x383/0x3b2 printk+0x4f/0x51 acpi_os_vprintf+0x2b/0x2d acpi_os_printf+0x43/0x45 acpi_info+0x5c/0x63 ? __acpi_map_table+0x13/0x18 ? acpi_os_map_iomem+0x21/0x147 acpi_tb_print_table_header+0x177/0x186 acpi_tb_install_table_with_override+0x4b/0x62 acpi_tb_install_standard_table+0xd9/0x215 ? early_ioremap+0x13/0x15 ? __acpi_map_table+0x13/0x18 acpi_tb_parse_root_table+0x16e/0x1b4 acpi_initialize_tables+0x57/0x59 acpi_table_init+0x50/0xce acpi_boot_table_init+0x1e/0x85 setup_arch+0x9b7/0xcc4 start_kernel+0x94/0x42d ? early_idt_handlers+0x120/0x120 x86_64_start_reservations+0x2a/0x2c x86_64_start_kernel+0xf3/0x100 Quote reply from Lv.zheng about the early ioremap slot usage in this case: """ In early_efi_scroll_up(), 2 mapping entries will be used for the src/dst screen buffer. In drivers/acpi/acpica/tbutils.c, we've improved the early table loading code in acpi_tb_parse_root_table(). We now need 2 mapping entries: 1. One mapping entry is used for RSDT table mapping. Each RSDT entry contains an address for another ACPI table. 2. For each entry in RSDP, we need another mapping entry to map the table to perform necessary check/override before installing it. When acpi_tb_parse_root_table() prints something through EFI earlyprintk console, we'll have 4 mapping entries used. The current 4 slots setting of early_ioremap() seems to be too small for such a use case. """ Thus increase the slot to 8 in this patch to fix this issue. boot-time mappings become 512 page with this patch. Signed-off-by: Dave Young <dyoung@redhat.com> Cc: <stable@vger.kernel.org> # v3.16 Signed-off-by: Matt Fleming <matt.fleming@intel.com>
* | | Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds2014-09-261-0/+3
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: "A CONFIG_STACK_GROWSUP=y fix, and a hotplug llc CPU mask fix" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched: Fix unreleased llc_shared_mask bit during CPU hotplug sched: Fix end_of_stack() and location of stack canary for architectures using CONFIG_STACK_GROWSUP
| * | | sched: Fix unreleased llc_shared_mask bit during CPU hotplugWanpeng Li2014-09-241-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The following bug can be triggered by hot adding and removing a large number of xen domain0's vcpus repeatedly: BUG: unable to handle kernel NULL pointer dereference at 0000000000000004 IP: [..] find_busiest_group PGD 5a9d5067 PUD 13067 PMD 0 Oops: 0000 [#3] SMP [...] Call Trace: load_balance ? _raw_spin_unlock_irqrestore idle_balance __schedule schedule schedule_timeout ? lock_timer_base schedule_timeout_uninterruptible msleep lock_device_hotplug_sysfs online_store dev_attr_store sysfs_write_file vfs_write SyS_write system_call_fastpath Last level cache shared mask is built during CPU up and the build_sched_domain() routine takes advantage of it to setup the sched domain CPU topology. However, llc_shared_mask is not released during CPU disable, which leads to an invalid sched domainCPU topology. This patch fix it by releasing the llc_shared_mask correctly during CPU disable. Yasuaki also reported that this can happen on real hardware: https://lkml.org/lkml/2014/7/22/1018 His case is here: == Here is an example on my system. My system has 4 sockets and each socket has 15 cores and HT is enabled. In this case, each core of sockes is numbered as follows: | CPU# Socket#0 | 0-14 , 60-74 Socket#1 | 15-29, 75-89 Socket#2 | 30-44, 90-104 Socket#3 | 45-59, 105-119 Then llc_shared_mask of CPU#30 has 0x3fff80000001fffc0000000. It means that last level cache of Socket#2 is shared with CPU#30-44 and 90-104. When hot-removing socket#2 and #3, each core of sockets is numbered as follows: | CPU# Socket#0 | 0-14 , 60-74 Socket#1 | 15-29, 75-89 But llc_shared_mask is not cleared. So llc_shared_mask of CPU#30 remains having 0x3fff80000001fffc0000000. After that, when hot-adding socket#2 and #3, each core of sockets is numbered as follows: | CPU# Socket#0 | 0-14 , 60-74 Socket#1 | 15-29, 75-89 Socket#2 | 30-59 Socket#3 | 90-119 Then llc_shared_mask of CPU#30 becomes 0x3fff8000fffffffc0000000. It means that last level cache of Socket#2 is shared with CPU#30-59 and 90-104. So the mask has the wrong value. Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com> Tested-by: Linn Crosetto <linn@hp.com> Reviewed-by: Borislav Petkov <bp@suse.de> Reviewed-by: Toshi Kani <toshi.kani@hp.com> Reviewed-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: <stable@vger.kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Steven Rostedt <srostedt@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1411547885-48165-1-git-send-email-wanpeng.li@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6Linus Torvalds2014-09-241-2/+2
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull crypto fixes from Herbert Xu: "This fixes three issues: - if ccp is loaded on a machine without ccp, it will incorrectly activate causing all requests to fail. Fixed by preventing ccp from loading if hardware isn't available. - not all IRQs were enabled for the qat driver, leading to potential stalls when it is used - disabled buggy AVX CTR implementation in aesni" * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: aesni - disable "by8" AVX CTR optimization crypto: ccp - Check for CCP before registering crypto algs crypto: qat - Enable all 32 IRQs
| * | | | crypto: aesni - disable "by8" AVX CTR optimizationMathias Krause2014-09-241-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The "by8" implementation introduced in commit 22cddcc7df8f ("crypto: aes - AES CTR x86_64 "by8" AVX optimization") is failing crypto tests as it handles counter block overflows differently. It only accounts the right most 32 bit as a counter -- not the whole block as all other implementations do. This makes it fail the cryptomgr test #4 that specifically tests this corner case. As we're quite late in the release cycle, just disable the "by8" variant for now. Reported-by: Romain Francoise <romain@orebokech.com> Signed-off-by: Mathias Krause <minipli@googlemail.com> Cc: Chandramouli Narayanan <mouli@linux.intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* | | | | Revert "x86/efi: Fixup GOT in all boot code paths"Linus Torvalds2014-09-232-81/+29
| |/ / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit 9cb0e394234d244fe5a97e743ec9dd7ddff7e64b. It causes my Sony Vaio Pro 11 to immediately reboot at startup. Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Anvin <hpa@zytor.com> Cc: Maarten Lankhorst <maarten.lankhorst@canonical.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Matt Fleming <matt.fleming@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | Merge tag 'pci-v3.17-fixes-2' of ↵Linus Torvalds2014-09-191-23/+1
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fixes from Bjorn Helgaas: "These fix: - Boot video device detection on dual-GPU Apple systems - Hotplug fiascos on VGA switcheroo with radeon & nouveau drivers - Boot hang on Freescale i.MX6 systems - Excessive "no hotplug settings from platform" warnings In particular: Enumeration - Don't default exclusively to first video device (Bruno Prémont) PCI device hotplug - Remove "no hotplug settings from platform" warning (Bjorn Helgaas) - Add pci_ignore_hotplug() for VGA switcheroo (Bjorn Helgaas) Freescale i.MX6 - Put LTSSM in "Detect" state before disabling (Lucas Stach)" * tag 'pci-v3.17-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: vgaarb: Drop obsolete #ifndef vgaarb: Don't default exclusively to first video device with mem+io ACPIPHP / radeon / nouveau: Remove acpi_bus_no_hotplug() PCI: Remove "no hotplug settings from platform" warning PCI: Add pci_ignore_hotplug() to ignore hotplug events for a device PCI: imx6: Put LTSSM in "Detect" state before disabling it MAINTAINERS: Add Lucas Stach as co-maintainer for i.MX6 PCI driver
| * | | | vgaarb: Don't default exclusively to first video device with mem+ioBruno Prémont2014-09-161-23/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 20cde694027e ("x86, ia64: Move EFI_FB vga_default_device() initialization to pci_vga_fixup()") moved boot video device detection from efifb to x86 and ia64 pci/fixup.c. For dual-GPU Apple computers above change represents a regression as code in efifb did forcefully override vga_default_device while the merge did not (vgaarb happens prior to PCI fixup). To improve on initial device selection by vgaarb (it cannot know if PCI device not behind bridges see/decode legacy VGA I/O or not), move the screen_info based check from pci_video_fixup() to vgaarb's init function and use it to refine/override decision taken while adding the individual PCI VGA devices. This way PCI fixup has no reason to adjust vga_default_device anymore but can depend on its value for flagging shadowed VBIOS. This has the nice benefit of removing duplicated code but does introduce a #if defined() block in vgaarb. Not all architectures have screen_info and would cause compile to fail without it. Link: https://bugzilla.kernel.org/show_bug.cgi?id=84461 Reported-and-Tested-By: Andreas Noever <andreas.noever@gmail.com> Signed-off-by: Bruno Prémont <bonbons@linux-vserver.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> CC: Matthew Garrett <matthew.garrett@nebula.com> CC: stable@vger.kernel.org # v3.5+
* | | | | Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds2014-09-191-1/+3
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Two kernel side fixes: a kprobes fix and a perf_remove_from_context() fix (which does not yet fix the migration bug which is WIP)" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf: Fix a race condition in perf_remove_from_context() kprobes/x86: Free 'optinsn' cache when range check fails
| * | | | | kprobes/x86: Free 'optinsn' cache when range check failsWang Nan2014-08-271-1/+3
| |/ / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch frees the 'optinsn' slot when we get a range check error, to prevent memory leaks. Before this patch, cache entry in kprobe_insn_cache() won't be freed if kprobe optimizing fails due to range check failure. Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Pei Feiyue <peifeiyue@huawei.com> Link: http://lkml.kernel.org/r/1406550019-70935-1-git-send-email-wangnan0@huawei.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | | | | Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds2014-09-196-37/+98
|\ \ \ \ \ | | |_|/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Misc fixes: EFI fixes, a build fix, a page table dumping (debug) fix and a clang build fix" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: efi/arm64: Fix fdt-related memory reservation x86/mm: Apply the section attribute to the variable, not its type x86/efi: Fixup GOT in all boot code paths x86/efi: Only load initrd above 4g on second try x86-64, ptdump: Mark espfix area only if existent x86, irq: Fix build error caused by 9eabc99a635a77cbf09
| * | | | Merge tag 'efi-urgent' of ↵Ingo Molnar2014-09-093-36/+92
| |\ \ \ \ | | | |_|/ | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mfleming/efi into x86/urgent Pull EFI fixes from Matt Fleming: * Fix early boot regression affecting x86 EFI boot stub when loading initrds above 4GB - Yinghai Lu * Relocate GOT entries in the x86 EFI boot stub now that we have symbols with global visibility - Matt Fleming * fdt memory reservation fix for arm64 - Mark Salter Signed-off-by: Ingo Molnar <mingo@kernel.org>
| | * | | x86/efi: Fixup GOT in all boot code pathsMatt Fleming2014-09-082-29/+81
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Maarten reported that his Macbook pro 8.2 stopped booting after commit f23cf8bd5c1f49 ("efi/x86: efistub: Move shared dependencies to <asm/efi.h>"), the main feature of which is changing the visibility of symbol 'efi_early' from local to global. By making 'efi_early' global we end up requiring an entry in the Global Offset Table. Unfortunately, while we do include code to fixup GOT entries in the early boot code, it's only called after we've executed the EFI boot stub. What this amounts to is that references to 'efi_early' in the EFI boot stub don't point to the correct place. Since we've got multiple boot entry points we need to be prepared to fixup the GOT in multiple places, while ensuring that we never do it more than once, otherwise the GOT entries will still point to the wrong place. Reported-by: Maarten Lankhorst <maarten.lankhorst@canonical.com> Tested-by: Maarten Lankhorst <maarten.lankhorst@canonical.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Matt Fleming <matt.fleming@intel.com>
| | * | | x86/efi: Only load initrd above 4g on second tryYinghai Lu2014-09-081-7/+11
| | |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Mantas found that after commit 4bf7111f5016 ("x86/efi: Support initrd loaded above 4G"), the kernel freezes at the earliest possible moment when trying to boot via UEFI on Asus laptop. Revert to old way to load initrd under 4G on first try, second try will use above 4G buffer when initrd is too big and does not fit under 4G. [ The cause of the freeze appears to be a firmware bug when reading file data into buffers above 4GB, though the exact reason is unknown. Mantas reports that the hang can be avoid if the file size is a multiple of 512 bytes, but I've seen some ASUS firmware simply corrupting the file data rather than freezing. Laszlo fixed an issue in the upstream EDK2 DiskIO code in Aug 2013 which may possibly be related, commit 4e39b75e ("MdeModulePkg/DiskIoDxe: fix source/destination pointer of overrun transfer"). Whatever the cause, it's unlikely that a fix will be forthcoming from the vendor, hence the workaround - Matt ] Cc: Laszlo Ersek <lersek@redhat.com> Reported-by: Mantas Mikulėnas <grawity@gmail.com> Reported-by: Harald Hoyer <harald@redhat.com> Tested-by: Anders Darander <anders@chargestorm.se> Tested-by: Calvin Walton <calvin.walton@kepstin.ca> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Matt Fleming <matt.fleming@intel.com>
| * | | x86/mm: Apply the section attribute to the variable, not its typeJan-Simon Möller2014-09-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes a compilation error in clang in that a linker section attribute can't be added to a type: arch/x86/mm/mmap.c:34:8: error: '__section__' attribute only applies to functions and global variables struct __read_mostly ... By moving the section attribute to the variable declaration, the desired effect is achieved. Signed-off-by: Jan-Simon Möller <dl9pf@gmx.de> Signed-off-by: Behan Webster <behanw@converseincode.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/1409959005-11479-1-git-send-email-behanw@converseincode.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | x86-64, ptdump: Mark espfix area only if existentMathias Krause2014-09-081-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We should classify the espfix area as such only if we actually have enabled the corresponding option. Otherwise the page table dump might look confusing. Signed-off-by: Mathias Krause <minipli@googlemail.com> Link: http://lkml.kernel.org/r/1410114629-24523-1-git-send-email-minipli@googlemail.com Cc: Arjan van de Ven <arjan.van.de.ven@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| * | | x86, irq: Fix build error caused by 9eabc99a635a77cbf09Jiang Liu2014-09-011-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 9eabc99a635a77cbf09 causes following build error when IOAPIC is disabled. arch/x86/pci/irq.c: In function 'pirq_disable_irq': >> arch/x86/pci/irq.c:1259:2: error: implicit declaration of function 'mp_should_keep_irq' [-Werror=implicit-function-declaration] if (io_apic_assign_pci_irqs && !mp_should_keep_irq(&dev->dev) && ^ cc1: some warnings being treated as errors Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Grant Likely <grant.likely@linaro.org> Link: http://lkml.kernel.org/r/1409382916-10649-1-git-send-email-jiang.liu@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* | | | Make ARCH_HAS_FAST_MULTIPLIER a real config variableLinus Torvalds2014-09-132-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It used to be an ad-hoc hack defined by the x86 version of <asm/bitops.h> that enabled a couple of library routines to know whether an integer multiply is faster than repeated shifts and additions. This just makes it use the real Kconfig system instead, and makes x86 (which was the only architecture that did this) select the option. NOTE! Even for x86, this really is kind of wrong. If we cared, we would probably not enable this for builds optimized for netburst (P4), where shifts-and-adds are generally faster than multiplies. This patch does *not* change that kind of logic, though, it is purely a syntactic change with no code changes. This was triggered by the fact that we have other places that really want to know "do I want to expand multiples by constants by hand or not", particularly the hash generation code. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | Merge tag 'stable/for-linus-3.17-b-rc4-tag' of ↵Linus Torvalds2014-09-122-15/+13
|\ \ \ \ | |/ / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull Xen bug fixes from David Vrabel: - fix for PVHVM suspend/resume and migration - don't pointlessly retry certain ballooning ops - fix gntalloc when grefs have run out. - fix PV boot if KSALR is enable or very large modules are used. * tag 'stable/for-linus-3.17-b-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: x86/xen: don't copy bogus duplicate entries into kernel page tables xen/gntalloc: safely delete grefs in add_grefs() undo path xen/gntalloc: fix oops after runnning out of grant refs xen/balloon: cancel ballooning if adding new memory failed xen/manage: Always freeze/thaw processes when suspend/resuming
| * | | x86/xen: don't copy bogus duplicate entries into kernel page tablesStefan Bader2014-09-102-15/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When RANDOMIZE_BASE (KASLR) is enabled; or the sum of all loaded modules exceeds 512 MiB, then loading modules fails with a warning (and hence a vmalloc allocation failure) because the PTEs for the newly-allocated vmalloc address space are not zero. WARNING: CPU: 0 PID: 494 at linux/mm/vmalloc.c:128 vmap_page_range_noflush+0x2a1/0x360() This is caused by xen_setup_kernel_pagetables() copying level2_kernel_pgt into level2_fixmap_pgt, overwriting many non-present entries. Without KASLR, the normal kernel image size only covers the first half of level2_kernel_pgt and module space starts after that. L4[511]->level3_kernel_pgt[510]->level2_kernel_pgt[ 0..255]->kernel [256..511]->module [511]->level2_fixmap_pgt[ 0..505]->module This allows 512 MiB of of module vmalloc space to be used before having to use the corrupted level2_fixmap_pgt entries. With KASLR enabled, the kernel image uses the full PUD range of 1G and module space starts in the level2_fixmap_pgt. So basically: L4[511]->level3_kernel_pgt[510]->level2_kernel_pgt[0..511]->kernel [511]->level2_fixmap_pgt[0..505]->module And now no module vmalloc space can be used without using the corrupt level2_fixmap_pgt entries. Fix this by properly converting the level2_fixmap_pgt entries to MFNs, and setting level1_fixmap_pgt as read-only. A number of comments were also using the the wrong L3 offset for level2_kernel_pgt. These have been corrected. Signed-off-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: stable@vger.kernel.org
* | | | Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds2014-08-306-4/+33
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Peter Anvin: "One patch to avoid assigning interrupts we don't actually have on non-PC platforms, and two patches that addresses bugs in the new IOAPIC assignment code" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, irq, PCI: Keep IRQ assignment for runtime power management x86: irq: Fix bug in setting IOAPIC pin attributes x86: Fix non-PC platform kernel crash on boot due to NULL dereference
| * | | | x86, irq, PCI: Keep IRQ assignment for runtime power managementJiang Liu2014-08-294-2/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now IOAPIC driver dynamically allocates IRQ numbers for IOAPIC pins. We need to keep IRQ assignment for PCI devices during runtime power management, otherwise it may cause failure of device wakeups. Commit 3eec595235c17a7 "x86, irq, PCI: Keep IRQ assignment for PCI devices during suspend/hibernation" has fixed the issue for suspend/ hibernation, we also need the same fix for runtime device sleep too. Fix: https://bugzilla.kernel.org/show_bug.cgi?id=83271 Reported-and-Tested-by: EmanueL Czirai <amanual@openmailbox.org> Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: EmanueL Czirai <amanual@openmailbox.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Grant Likely <grant.likely@linaro.org> Link: http://lkml.kernel.org/r/1409304383-18806-1-git-send-email-jiang.liu@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | | x86: irq: Fix bug in setting IOAPIC pin attributesJiang Liu2014-08-271-1/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 15a3c7cc9154321fc3 "x86, irq: Introduce two helper functions to support irqdomain map operation" breaks LPSS ACPI enumerated devices. On startup, IOAPIC driver preallocates IRQ descriptors and programs IOAPIC pins with default level and polarity attributes for all legacy IRQs. Later legacy IRQ users may fail to set IOAPIC pin attributes if the requested attributes conflicts with the default IOAPIC pin attributes. So change mp_irqdomain_map() to allow the first legacy IRQ user to reprogram IOAPIC pin with different attributes. Reported-and-tested-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Grant Likely <grant.likely@linaro.org> Cc: Prarit Bhargava <prarit@redhat.com> Link: http://lkml.kernel.org/r/1409118795-17046-1-git-send-email-jiang.liu@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | | x86: Fix non-PC platform kernel crash on boot due to NULL dereferenceAndy Shevchenko2014-08-252-1/+3
| | |/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Upstream commit: 95d76acc7518d5 ("x86, irq: Count legacy IRQs by legacy_pic->nr_legacy_irqs instead of NR_IRQS_LEGACY") removed reserved interrupts for the platforms that do not have a legacy IOAPIC. Which breaks the boot on Intel MID platforms such as Medfield: BUG: unable to handle kernel NULL pointer dereference at 0000003a IP: [<c107079a>] setup_irq+0xf/0x4d [ 0.000000] *pdpt = 0000000000000000 *pde = 9bbf32453167e510 The culprit is an uncoditional setting of IRQ2 which is used as cascade IRQ on legacy platforms. It seems we have to check if we have enough legacy IRQs reserved before we can call setup_irq(). The fix adds such check in native_init_IRQ() and in setup_default_timer_irq(). Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Jiang Liu <jiang.liu@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: David Cohen <david.a.cohen@linux.intel.com> Link: http://lkml.kernel.org/r/1405931920-12871-1-git-send-email-andriy.shevchenko@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | | | kexec: purgatory: add clean-up for purgatory directoryMichael Welling2014-08-301-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Without this patch the kexec-purgatory.c and purgatory.ro files are not removed after make mrproper. Signed-off-by: Michael Welling <mwelling@ieee.org> Acked-by: Vivek Goyal <vgoyal@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | x86/purgatory: use approprate -m64/-32 build flag for arch/x86/purgatoryVivek Goyal2014-08-301-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Thomas reported that build of x86_64 kernel was failing for him. He is using 32bit tool chain. Problem is that while compiling purgatory, I have not specified -m64 flag. And 32bit tool chain must be assuming -m32 by default. Following is error message. (mini) [~/work/linux-2.6] make scripts/kconfig/conf --silentoldconfig Kconfig CHK include/config/kernel.release UPD include/config/kernel.release CHK include/generated/uapi/linux/version.h CHK include/generated/utsrelease.h UPD include/generated/utsrelease.h CC arch/x86/purgatory/purgatory.o arch/x86/purgatory/purgatory.c:1:0: error: code model 'large' not supported in the 32 bit mode Fix it by explicitly passing appropriate -m64/-m32 build flag for purgatory. Reported-by: Thomas Glanzmann <thomas@glanzmann.de> Tested-by: Thomas Glanzmann <thomas@glanzmann.de> Suggested-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | kexec: create a new config option CONFIG_KEXEC_FILE for new syscallVivek Goyal2014-08-307-20/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently new system call kexec_file_load() and all the associated code compiles if CONFIG_KEXEC=y. But new syscall also compiles purgatory code which currently uses gcc option -mcmodel=large. This option seems to be available only gcc 4.4 onwards. Hiding new functionality behind a new config option will not break existing users of old gcc. Those who wish to enable new functionality will require new gcc. Having said that, I am trying to figure out how can I move away from using -mcmodel=large but that can take a while. I think there are other advantages of introducing this new config option. As this option will be enabled only on x86_64, other arches don't have to compile generic kexec code which will never be used. This new code selects CRYPTO=y and CRYPTO_SHA256=y. And all other arches had to do this for CONFIG_KEXEC. Now with introduction of new config option, we can remove crypto dependency from other arches. Now CONFIG_KEXEC_FILE is available only on x86_64. So whereever I had CONFIG_X86_64 defined, I got rid of that. For CONFIG_KEXEC_FILE, instead of doing select CRYPTO=y, I changed it to "depends on CRYPTO=y". This should be safer as "select" is not recursive. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: H. Peter Anvin <hpa@zytor.com> Tested-by: Shaun Ruffell <sruffell@digium.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | x86,mm: fix pte_special versus pte_numaHugh Dickins2014-08-301-2/+7
|/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Sasha Levin has shown oopses on ffffea0003480048 and ffffea0003480008 at mm/memory.c:1132, running Trinity on different 3.16-rc-next kernels: where zap_pte_range() checks page->mapping to see if PageAnon(page). Those addresses fit struct pages for pfns d2001 and d2000, and in each dump a register or a stack slot showed d2001730 or d2000730: pte flags 0x730 are PCD ACCESSED PROTNONE SPECIAL IOMAP; and Sasha's e820 map has a hole between cfffffff and 100000000, which would need special access. Commit c46a7c817e66 ("x86: define _PAGE_NUMA by reusing software bits on the PMD and PTE levels") has broken vm_normal_page(): a PROTNONE SPECIAL pte no longer passes the pte_special() test, so zap_pte_range() goes on to try to access a non-existent struct page. Fix this by refining pte_special() (SPECIAL with PRESENT or PROTNONE) to complement pte_numa() (SPECIAL with neither PRESENT nor PROTNONE). A hint that this was a problem was that c46a7c817e66 added pte_numa() test to vm_normal_page(), and moved its is_zero_pfn() test from slow to fast path: This was papering over a pte_special() snag when the zero page was encountered during zap. This patch reverts vm_normal_page() to how it was before, relying on pte_special(). It still appears that this patch may be incomplete: aren't there other places which need to be handling PROTNONE along with PRESENT? For example, pte_mknuma() clears _PAGE_PRESENT and sets _PAGE_NUMA, but on a PROT_NONE area, that would make it pte_special(). This is side-stepped by the fact that NUMA hinting faults skipped PROT_NONE VMAs and there are no grounds where a NUMA hinting fault on a PROT_NONE VMA would be interesting. Fixes: c46a7c817e66 ("x86: define _PAGE_NUMA by reusing software bits on the PMD and PTE levels") Reported-by: Sasha Levin <sasha.levin@oracle.com> Tested-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Mel Gorman <mgorman@suse.de> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Cc: Matthew Wilcox <matthew.r.wilcox@intel.com> Cc: <stable@vger.kernel.org> [3.16] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds2014-08-252-3/+9
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "A couple of EFI fixes, plus misc fixes all around the map" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: efi/arm64: Store Runtime Services revision firmware: Do not use WARN_ON(!spin_is_locked()) x86_32, entry: Clean up sysenter_badsys declaration x86/doc: Fix the 'tlb_single_page_flush_ceiling' sysconfig path x86/mm: Fix sparse 'tlb_single_page_flush_ceiling' warning and make the variable read-mostly x86/mm: Fix RCU splat from new TLB tracepoints
| * \ \ Merge tag 'efi-urgent' of ↵Ingo Molnar2014-08-2294-1400/+3547
| |\ \ \ | | | |/ | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mfleming/efi into x86/urgent Pull EFI fixes from Matt Fleming: * WARN_ON(!spin_is_locked()) always triggers on non-SMP machines. Swap it for the more canonical lockdep_assert_held() which always does the right thing - Guenter Roeck * Assign the correct value to efi.runtime_version on arm64 so that all the runtime services can be invoked - Semen Protsenko Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | x86_32, entry: Clean up sysenter_badsys declarationStefan Bader2014-08-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 554086d85e "x86_32, entry: Do syscall exit work on badsys (CVE-2014-4508)" introduced a new jump label (sysenter_badsys) but somehow the END statements seem to have gone wrong (at least it feels that way to me). This does not seem to be a fatal problem, but just for the sake of symmetry, change the second syscall_badsys to sysenter_badsys. Signed-off-by: Stefan Bader <stefan.bader@canonical.com> Link: http://lkml.kernel.org/r/1408093066-31021-1-git-send-email-stefan.bader@canonical.com Acked-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| * | | x86/mm: Fix sparse 'tlb_single_page_flush_ceiling' warning and make the ↵Jeremiah Mahler2014-08-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | variable read-mostly A sparse warning is generated about 'tlb_single_page_flush_ceiling' not being declared. arch/x86/mm/tlb.c:177:15: warning: symbol 'tlb_single_page_flush_ceiling' was not declared. Should it be static? Since it isn't used anywhere outside this file, fix the warning by making it static. Also, optimize the use of this variable by adding the __read_mostly directive, as suggested by David Rientjes. Suggested-by: David Rientjes <rientjes@google.com> Signed-off-by: Jeremiah Mahler <jmmahler@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Rik van Riel <riel@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Link: http://lkml.kernel.org/r/1407569913-4035-1-git-send-email-jmmahler@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | x86/mm: Fix RCU splat from new TLB tracepointsDave Hansen2014-08-081-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Dave Jones reported seeing a bug from one of my TLB tracepoints: http://lkml.kernel.org/r/20140806181801.GA4605@redhat.com According to Paul McKenney, the right way to fix this is adding an _rcuidle suffix to the tracepoint. http://lkml.kernel.org/r/20140807065055.GA5821@linux.vnet.ibm.com This patch does just that. Reported-by: Dave Jones <davej@redhat.com>, Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Dave Hansen <dave@sr71.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20140807175841.5C92D878@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | | | Revert "KVM: x86: Increase the number of fixed MTRR regs to 10"Paolo Bonzini2014-08-191-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit 682367c494869008eb89ef733f196e99415ae862, which causes 32-bit SMP Windows 7 guests to panic. SeaBIOS has a limit on the number of MTRRs that it can handle, and this patch exceeded the limit. Better revert it. Thanks to Nadav Amit for debugging the cause. Cc: stable@nongnu.org Reported-by: Wanpeng Li <wanpeng.li@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | | | KVM: x86: do not check CS.DPL against RPL during task switchPaolo Bonzini2014-08-191-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts the check added by commit 5045b468037d (KVM: x86: check CS.DPL against RPL during task switch, 2014-05-15). Although the CS.DPL=CS.RPL check is mentioned in table 7-1 of the SDM as causing a #TSS exception, it is not mentioned in table 6-6 that lists "invalid TSS conditions" which cause #TSS exceptions. In fact it causes some tests to fail, which pass on bare-metal. Keep the rest of the commit, since we will find new uses for it in 3.18. Reported-by: Nadav Amit <namit@cs.technion.ac.il> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | | | KVM: x86: Avoid emulating instructions on #UD mistakenlyNadav Amit2014-08-191-4/+4
| |/ / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit d40a6898e5 mistakenly caused instructions which are not marked as EmulateOnUD to be emulated upon #UD exception. The commit caused the check of whether the instruction flags include EmulateOnUD to never be evaluated. As a result instructions whose emulation is broken may be emulated. This fix moves the evaluation of EmulateOnUD so it would be evaluated. Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> [Tweak operand order in &&, remove EmulateOnUD where it's now superfluous. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | | Merge branch 'release' of ↵Linus Torvalds2014-08-161-0/+3
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux Pull idle update from Len Brown: "Two Intel-platform-specific updates to intel_idle, and a cosmetic tweak to the turbostat utility" * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: tools/power turbostat: tweak whitespace in output format intel_idle: Broadwell support intel_idle: Disable Baytrail Core and Module C6 auto-demotion
| * | | intel_idle: Disable Baytrail Core and Module C6 auto-demotionLen Brown2014-08-151-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Power efficiency improves on Baytrail (Intel Atom Processor E3000) when Linux disables C6 auto-demotion. Based on work by Srinidhi Kasagar <srinidhi.kasagar@intel.com>. Signed-off-by: Len Brown <len.brown@intel.com> Cc: x86@kernel.org
* | | | Merge tag 'pci-v3.17-changes-2' of ↵Linus Torvalds2014-08-152-6/+6
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull DEFINE_PCI_DEVICE_TABLE removal from Bjorn Helgaas: "Part two of the PCI changes for v3.17: - Remove DEFINE_PCI_DEVICE_TABLE macro use (Benoit Taine) It's a mechanical change that removes uses of the DEFINE_PCI_DEVICE_TABLE macro. I waited until later in the merge window to reduce conflicts, but it's possible you'll still see a few" * tag 'pci-v3.17-changes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: Remove DEFINE_PCI_DEVICE_TABLE macro use
| * | | | PCI: Remove DEFINE_PCI_DEVICE_TABLE macro useBenoit Taine2014-08-122-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We should prefer `struct pci_device_id` over `DEFINE_PCI_DEVICE_TABLE` to meet kernel coding style guidelines. This issue was reported by checkpatch. A simplified version of the semantic patch that makes this change is as follows (http://coccinelle.lip6.fr/): // <smpl> @@ identifier i; declarer name DEFINE_PCI_DEVICE_TABLE; initializer z; @@ - DEFINE_PCI_DEVICE_TABLE(i) + const struct pci_device_id i[] = z; // </smpl> [bhelgaas: add semantic patch] Signed-off-by: Benoit Taine <benoit.taine@lip6.fr> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
* | | | | Merge tag 'stable/for-linus-3.17-b-rc0-tag' of ↵Linus Torvalds2014-08-142-6/+6
|\ \ \ \ \ | | |_|_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull Xen bugfixes from David Vrabel: - fix ARM build - fix boot crash with PVH guests - improve reliability of resume/migration * tag 'stable/for-linus-3.17-b-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: x86/xen: use vmap() to map grant table pages in PVH guests x86/xen: resume timer irqs early arm/xen: remove duplicate arch_gnttab_init() function
| * | | | x86/xen: use vmap() to map grant table pages in PVH guestsDavid Vrabel2014-08-111-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit b7dd0e350e0b (x86/xen: safely map and unmap grant frames when in atomic context) causes PVH guests to crash in arch_gnttab_map_shared() when they attempted to map the pages for the grant table. This use of a PV-specific function during the PVH grant table setup is non-obvious and not needed. The standard vmap() function does the right thing. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reported-by: Mukesh Rathor <mukesh.rathor@oracle.com> Tested-by: Mukesh Rathor <mukesh.rathor@oracle.com> Cc: stable@vger.kernel.org
| * | | | x86/xen: resume timer irqs earlyDavid Vrabel2014-08-111-1/+1
| |/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the timer irqs are resumed during device resume it is possible in certain circumstances for the resume to hang early on, before device interrupts are resumed. For an Ubuntu 14.04 PVHVM guest this would occur in ~0.5% of resume attempts. It is not entirely clear what is occuring the point of the hang but I think a task necessary for the resume calls schedule_timeout(), waiting for a timer interrupt (which never arrives). This failure may require specific tasks to be running on the other VCPUs to trigger (processes are not frozen during a suspend/resume if PREEMPT is disabled). Add IRQF_EARLY_RESUME to the timer interrupts so they are resumed in syscore_resume(). Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: stable@vger.kernel.org