summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* net: socket: move check for forbid_cmsg_compat to __sys_...msg()Dominik Brodowski2018-04-023-22/+37
| | | | | | | | | | | | | | | | | | The non-compat codepaths for sys_...msg() verify that MSG_CMSG_COMPAT is not set. By moving this check to the __sys_...msg() functions (and making it dependent on a static flag passed to this function), we can call the __sys...msg() functions instead of the syscall functions in all cases. __sys_recvmmsg() does not need this trickery, as the check is handled within the do_sys_recvmmsg() function internal to net/socket.c. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
* net: socket: add do_sys_recvmmsg() helper; remove in-kernel call to syscallDominik Brodowski2018-04-021-5/+12
| | | | | | | | | | | | | Using the net-internal helper do_sys_recvmmsg() allows us to avoid the internal calls to the sys_getsockopt() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
* net: socket: add __sys_getsockopt() helper; remove in-kernel call to syscallDominik Brodowski2018-04-021-4/+10
| | | | | | | | | | | | | Using the net-internal helper __sys_getsockopt() allows us to avoid the internal calls to the sys_getsockopt() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
* net: socket: add __sys_setsockopt() helper; remove in-kernel call to syscallDominik Brodowski2018-04-022-3/+11
| | | | | | | | | | | | | Using the net-internal helper __sys_setsockopt() allows us to avoid the internal calls to the sys_setsockopt() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
* net: socket: add __sys_shutdown() helper; remove in-kernel call to syscallDominik Brodowski2018-04-023-3/+9
| | | | | | | | | | | | | Using the net-internal helper __sys_shutdown() allows us to avoid the internal calls to the sys_shutdown() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
* net: socket: add __sys_socketpair() helper; remove in-kernel call to syscallDominik Brodowski2018-04-023-4/+11
| | | | | | | | | | | | | Using the net-internal helper __sys_socketpair() allows us to avoid the internal calls to the sys_socketpair() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
* net: socket: add __sys_getpeername() helper; remove in-kernel call to syscallDominik Brodowski2018-04-023-5/+13
| | | | | | | | | | | | | Using the net-internal helper __sys_getpeername() allows us to avoid the internal calls to the sys_getpeername() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
* net: socket: add __sys_getsockname() helper; remove in-kernel call to syscallDominik Brodowski2018-04-023-5/+13
| | | | | | | | | | | | | Using the net-internal helper __sys_getsockname() allows us to avoid the internal calls to the sys_getsockname() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
* net: socket: add __sys_listen() helper; remove in-kernel call to syscallDominik Brodowski2018-04-023-3/+9
| | | | | | | | | | | | | Using the net-internal helper __sys_listen() allows us to avoid the internal calls to the sys_listen() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
* net: socket: add __sys_connect() helper; remove in-kernel call to syscallDominik Brodowski2018-04-023-4/+11
| | | | | | | | | | | | | Using the net-internal helper __sys_connect() allows us to avoid the internal calls to the sys_connect() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
* net: socket: add __sys_bind() helper; remove in-kernel call to syscallDominik Brodowski2018-04-023-3/+9
| | | | | | | | | | | | | Using the net-internal helper __sys_bind() allows us to avoid the internal calls to the sys_bind() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
* net: socket: add __sys_socket() helper; remove in-kernel call to syscallDominik Brodowski2018-04-023-3/+9
| | | | | | | | | | | | | Using the net-internal helper __sys_socket() allows us to avoid the internal calls to the sys_socket() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
* net: socket: add __sys_accept4() helper; remove in-kernel call to syscallDominik Brodowski2018-04-023-9/+17
| | | | | | | | | | | | | Using the net-internal helper __sys_accept4() allows us to avoid the internal calls to the sys_accept4() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
* net: socket: add __sys_sendto() helper; remove in-kernel call to syscallDominik Brodowski2018-04-023-8/+17
| | | | | | | | | | | | | Using the net-internal helper __sys_sendto() allows us to avoid the internal calls to the sys_sendto() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
* net: socket: add __sys_recvfrom() helper; remove in-kernel call to syscallDominik Brodowski2018-04-023-9/+21
| | | | | | | | | | | | | Using the net-internal helper __sys_recvfrom() allows us to avoid the internal calls to the sys_recvfrom() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
* x86: remove compat_sys_x86_waitpid()Dominik Brodowski2018-04-023-10/+1
| | | | | | | | | | | | compat_sys_x86_waitpid() is not needed, as it takes the same parameters (int, *int, int) as the native syscall. Suggested-by: Al Viro <viro@ZenIV.linux.org.uk> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Slaby <jslaby@suse.com> Cc: x86@kernel.org Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
* x86: use _do_fork() in compat_sys_x86_clone()Dominik Brodowski2018-04-021-1/+2
| | | | | | | | | | | | | | | It is trivial to directly call _do_fork() instead of the sys_clone() syscall in compat_sys_x86_clone(). This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Slaby <jslaby@suse.com> Cc: x86@kernel.org Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
* mm: use do_futex() instead of sys_futex() in mm_release()Dominik Brodowski2018-04-022-5/+12
| | | | | | | | | | | | | | | | | | | | | | | sys_futex() is a wrapper to do_futex() which does not modify any values here: - uaddr, val and val3 are kept the same - op is masked with FUTEX_CMD_MASK, but is always set to FUTEX_WAKE. Therefore, val2 is always 0. - as utime is set to NULL, *timeout is NULL This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Darren Hart <dvhart@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
* kexec: call do_kexec_load() in compat syscall directlyDominik Brodowski2018-04-021-13/+39
| | | | | | | | | | | | | | | | | do_kexec_load() can be called directly by compat_sys_kexec() as long as the same parameters checks are completed which are currently handled (also) by sys_kexec(). Therefore, move those to kexec_load_check(), call that newly introduced helper function from both sys_kexec() and compat_sys_kexec(), and duplicate the remaining code from sys_kexec() in compat_sys_kexec(). This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: Eric Biederman <ebiederm@xmission.com> Cc: kexec@lists.infradead.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
* kernel: open-code sys_rt_sigpending() in sys_sigpending()Dominik Brodowski2018-04-022-4/+13
| | | | | | | | | | | | | | A similar but not fully equivalent code path is already open-coded three times (in sys_rt_sigpending and in the two compat stubs), so do it a fourth time here. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
* kernel: use kernel_wait4() instead of sys_wait4()Dominik Brodowski2018-04-023-6/+6
| | | | | | | | | | | | | | | All call sites of sys_wait4() set *rusage to NULL. Therefore, there is no need for the copy_to_user() handling of *rusage, and we can use kernel_wait4() directly. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Acked-by: Luis R. Rodriguez <mcgrof@kernel.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
* syscalls: define and explain goal to not call syscalls in the kernelDominik Brodowski2018-03-252-0/+39
| | | | | | | | | | | The syscall entry points to the kernel defined by SYSCALL_DEFINEx() and COMPAT_SYSCALL_DEFINEx() should only be called from userspace through kernel entry points, but not from the kernel itself. This will allow cleanups and optimizations to the entry paths *and* to the parts of the kernel code which currently need to pretend to be userspace in order to make use of syscalls. Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
* Linux 4.16-rc5v4.16-rc5Linus Torvalds2018-03-121-1/+1
|
* Merge branch 'x86-pti-for-linus' of ↵Linus Torvalds2018-03-1117-182/+291
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/pti updates from Thomas Gleixner: "Yet another pile of melted spectrum related updates: - Drop native vsyscall support finally as it causes more trouble than benefit. - Make microcode loading more robust. There were a few issues especially related to late loading which are now surfacing because late loading of the IB* microcodes addressing spectre issues has become more widely used. - Simplify and robustify the syscall handling in the entry code - Prevent kprobes on the entry trampoline code which lead to kernel crashes when the probe hits before CR3 is updated - Don't check microcode versions when running on hypervisors as they are considered as lying anyway. - Fix the 32bit objtool build and a coment typo" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/kprobes: Fix kernel crash when probing .entry_trampoline code x86/pti: Fix a comment typo x86/microcode: Synchronize late microcode loading x86/microcode: Request microcode on the BSP x86/microcode/intel: Look into the patch cache first x86/microcode: Do not upload microcode if CPUs are offline x86/microcode/intel: Writeback and invalidate caches before updating microcode x86/microcode/intel: Check microcode revision before updating sibling threads x86/microcode: Get rid of struct apply_microcode_ctx x86/spectre_v2: Don't check microcode versions when running under hypervisors x86/vsyscall/64: Drop "native" vsyscalls x86/entry/64/compat: Save one instruction in entry_INT80_compat() x86/entry: Do not special-case clone(2) in compat entry x86/syscalls: Use COMPAT_SYSCALL_DEFINEx() macros for x86-only compat syscalls x86/syscalls: Use proper syscall definition for sys_ioperm() x86/entry: Remove stale syscall prototype x86/syscalls/32: Simplify $entry == $compat entries objtool: Fix 32-bit build
| * x86/kprobes: Fix kernel crash when probing .entry_trampoline codeFrancis Deslauriers2018-03-093-1/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Disable the kprobe probing of the entry trampoline: .entry_trampoline is a code area that is used to ensure page table isolation between userspace and kernelspace. At the beginning of the execution of the trampoline, we load the kernel's CR3 register. This has the effect of enabling the translation of the kernel virtual addresses to physical addresses. Before this happens most kernel addresses can not be translated because the running process' CR3 is still used. If a kprobe is placed on the trampoline code before that change of the CR3 register happens the kernel crashes because int3 handling pages are not accessible. To fix this, add the .entry_trampoline section to the kprobe blacklist to prohibit the probing of code before all the kernel pages are accessible. Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: mathieu.desnoyers@efficios.com Cc: mhiramat@kernel.org Link: http://lkml.kernel.org/r/1520565492-4637-2-git-send-email-francis.deslauriers@efficios.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * x86/pti: Fix a comment typoSeunghun Han2018-03-081-1/+1
| | | | | | | | | | | | | | | | | | s/visinble/visible/ Signed-off-by: Seunghun Han <kkamagui@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/1520397135-132809-1-git-send-email-kkamagui@gmail.com
| * x86/microcode: Synchronize late microcode loadingAshok Raj2018-03-081-26/+92
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Original idea by Ashok, completely rewritten by Borislav. Before you read any further: the early loading method is still the preferred one and you should always do that. The following patch is improving the late loading mechanism for long running jobs and cloud use cases. Gather all cores and serialize the microcode update on them by doing it one-by-one to make the late update process as reliable as possible and avoid potential issues caused by the microcode update. [ Borislav: Rewrite completely. ] Co-developed-by: Borislav Petkov <bp@suse.de> Signed-off-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Tom Lendacky <thomas.lendacky@amd.com> Tested-by: Ashok Raj <ashok.raj@intel.com> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com> Link: https://lkml.kernel.org/r/20180228102846.13447-8-bp@alien8.de
| * x86/microcode: Request microcode on the BSPBorislav Petkov2018-03-081-6/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ... so that any newer version can land in the cache and can later be fished out by the application functions. Do that before grabbing the hotplug lock. Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Tom Lendacky <thomas.lendacky@amd.com> Tested-by: Ashok Raj <ashok.raj@intel.com> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com> Link: https://lkml.kernel.org/r/20180228102846.13447-7-bp@alien8.de
| * x86/microcode/intel: Look into the patch cache firstBorislav Petkov2018-03-081-6/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The cache might contain a newer patch - look in there first. A follow-on change will make sure newest patches are loaded into the cache of microcode patches. Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Tom Lendacky <thomas.lendacky@amd.com> Tested-by: Ashok Raj <ashok.raj@intel.com> Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com> Link: https://lkml.kernel.org/r/20180228102846.13447-6-bp@alien8.de
| * x86/microcode: Do not upload microcode if CPUs are offlineAshok Raj2018-03-081-0/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Avoid loading microcode if any of the CPUs are offline, and issue a warning. Having different microcode revisions on the system at any time is outright dangerous. [ Borislav: Massage changelog. ] Signed-off-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Tom Lendacky <thomas.lendacky@amd.com> Tested-by: Ashok Raj <ashok.raj@intel.com> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com> Link: http://lkml.kernel.org/r/1519352533-15992-4-git-send-email-ashok.raj@intel.com Link: https://lkml.kernel.org/r/20180228102846.13447-5-bp@alien8.de
| * x86/microcode/intel: Writeback and invalidate caches before updating microcodeAshok Raj2018-03-081-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Updating microcode is less error prone when caches have been flushed and depending on what exactly the microcode is updating. For example, some of the issues around certain Broadwell parts can be addressed by doing a full cache flush. [ Borislav: Massage it and use native_wbinvd() in both cases. ] Signed-off-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Tom Lendacky <thomas.lendacky@amd.com> Tested-by: Ashok Raj <ashok.raj@intel.com> Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com> Link: http://lkml.kernel.org/r/1519352533-15992-3-git-send-email-ashok.raj@intel.com Link: https://lkml.kernel.org/r/20180228102846.13447-4-bp@alien8.de
| * x86/microcode/intel: Check microcode revision before updating sibling threadsAshok Raj2018-03-081-3/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After updating microcode on one of the threads of a core, the other thread sibling automatically gets the update since the microcode resources on a hyperthreaded core are shared between the two threads. Check the microcode revision on the CPU before performing a microcode update and thus save us the WRMSR 0x79 because it is a particularly expensive operation. [ Borislav: Massage changelog and coding style. ] Signed-off-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Tom Lendacky <thomas.lendacky@amd.com> Tested-by: Ashok Raj <ashok.raj@intel.com> Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com> Link: http://lkml.kernel.org/r/1519352533-15992-2-git-send-email-ashok.raj@intel.com Link: https://lkml.kernel.org/r/20180228102846.13447-3-bp@alien8.de
| * x86/microcode: Get rid of struct apply_microcode_ctxBorislav Petkov2018-03-081-11/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is a useless remnant from earlier times. Use the ucode_state enum directly. No functional change. Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Tom Lendacky <thomas.lendacky@amd.com> Tested-by: Ashok Raj <ashok.raj@intel.com> Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com> Link: https://lkml.kernel.org/r/20180228102846.13447-2-bp@alien8.de
| * x86/spectre_v2: Don't check microcode versions when running under hypervisorsKonrad Rzeszutek Wilk2018-03-081-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As: 1) It's known that hypervisors lie about the environment anyhow (host mismatch) 2) Even if the hypervisor (Xen, KVM, VMWare, etc) provided a valid "correct" value, it all gets to be very murky when migration happens (do you provide the "new" microcode of the machine?). And in reality the cloud vendors are the ones that should make sure that the microcode that is running is correct and we should just sing lalalala and trust them. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Cc: Wanpeng Li <kernellwp@gmail.com> Cc: kvm <kvm@vger.kernel.org> Cc: Krčmář <rkrcmar@redhat.com> Cc: Borislav Petkov <bp@alien8.de> CC: "H. Peter Anvin" <hpa@zytor.com> CC: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180226213019.GE9497@char.us.oracle.com
| * x86/vsyscall/64: Drop "native" vsyscallsAndy Lutomirski2018-03-084-30/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since Linux v3.2, vsyscalls have been deprecated and slow. From v3.2 on, Linux had three vsyscall modes: "native", "emulate", and "none". "emulate" is the default. All known user programs work correctly in emulate mode, but vsyscalls turn into page faults and are emulated. This is very slow. In "native" mode, the vsyscall page is easily usable as an exploit gadget, but vsyscalls are a bit faster -- they turn into normal syscalls. (This is in contrast to vDSO functions, which can be much faster than syscalls.) In "none" mode, there are no vsyscalls. For all practical purposes, "native" was really just a chicken bit in case something went wrong with the emulation. It's been over six years, and nothing has gone wrong. Delete it. Signed-off-by: Andy Lutomirski <luto@kernel.org> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Cc: Kernel Hardening <kernel-hardening@lists.openwall.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/519fee5268faea09ae550776ce969fa6e88668b0.1520449896.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * x86/entry/64/compat: Save one instruction in entry_INT80_compat()Dominik Brodowski2018-03-071-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As %rdi is never user except in the following push, there is no need to restore %rdi to the original value. Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: luto@amacapital.net Cc: viro@zeniv.linux.org.uk Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * x86/entry: Do not special-case clone(2) in compat entryDominik Brodowski2018-03-074-13/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With the CPU renaming registers on its own, and all the overhead of the syscall entry/exit, it is doubtful whether the compiled output of mov %r8, %rax mov %rcx, %r8 mov %rax, %rcx jmpq sys_clone is measurably slower than the hand-crafted version of xchg %r8, %rcx So get rid of this special case. Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: luto@amacapital.net Cc: viro@zeniv.linux.org.uk Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * x86/syscalls: Use COMPAT_SYSCALL_DEFINEx() macros for x86-only compat syscallsDominik Brodowski2018-03-073-62/+76
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While at it, convert declarations of type "unsigned" to "unsigned int". Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: luto@amacapital.net Cc: viro@zeniv.linux.org.uk Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * x86/syscalls: Use proper syscall definition for sys_ioperm()Dominik Brodowski2018-03-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Using SYSCALL_DEFINEx() is recommended, so use it also here. Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: luto@amacapital.net Cc: viro@zeniv.linux.org.uk Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * x86/entry: Remove stale syscall prototypeDominik Brodowski2018-03-071-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | sys32_vm86_warning() is long gone. Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: luto@amacapital.net Cc: viro@zeniv.linux.org.uk Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * x86/syscalls/32: Simplify $entry == $compat entriesDominik Brodowski2018-03-071-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the compat entry point is equivalent to the native entry point, it does not need to be specified explicitly. Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: luto@amacapital.net Cc: viro@zeniv.linux.org.uk Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * objtool: Fix 32-bit buildJosh Poimboeuf2018-03-071-20/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix the objtool build when cross-compiling a 64-bit kernel on a 32-bit host. This also simplifies read_retpoline_hints() a bit and makes its implementation similar to most of the other annotation reading functions. Reported-by: Sven Joachim <svenjoac@gmx.de> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: b5bc2231b8ad ("objtool: Add retpoline validation") Link: http://lkml.kernel.org/r/2ca46c636c23aa9c9d57d53c75de4ee3ddf7a7df.1520380691.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds2018-03-111-0/+1
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fix from Thomas Gleixner: "Just a single fix which adds a missing Kconfig dependency to avoid unmet dependency warnings" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: clocksource/atmel-st: Add 'depends on HAS_IOMEM' to fix unmet dependency
| * | clocksource/atmel-st: Add 'depends on HAS_IOMEM' to fix unmet dependencyMasahiro Yamada2018-03-091-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The ATMEL_ST config selects MFD_SYSCON, but does not depend on HAS_IOMEM. Compile testing on architecture without HAS_IOMEM causes "unmet direct dependencies" in Kconfig phase. Detected by "make ARCH=score allyesconfig". Add the proper dependency to the ATMEL_ST config. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Arnd Bergmann <arnd@arndb.de> Link: https://lkml.kernel.org/r/1520335233-11277-1-git-send-email-yamada.masahiro@socionext.com
* | | Merge branch 'ras-urgent-for-linus' of ↵Linus Torvalds2018-03-112-2/+25
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RAS fixes from Thomas Gleixner: "Two small fixes for RAS/MCE: - Serialize sysfs changes to avoid concurrent modificaiton of underlying data - Add microcode revision to Machine Check records. This should have been there forever, but now with the broken microcode versions in the wild it has become important" * 'ras-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/MCE: Serialize sysfs changes x86/MCE: Save microcode revision in machine check records
| * | | x86/MCE: Serialize sysfs changesSeunghun Han2018-03-081-1/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The check_interval file in /sys/devices/system/machinecheck/machinecheck<cpu number> directory is a global timer value for MCE polling. If it is changed by one CPU, mce_restart() broadcasts the event to other CPUs to delete and restart the MCE polling timer and __mcheck_cpu_init_timer() reinitializes the mce_timer variable. If more than one CPU writes a specific value to the check_interval file concurrently, mce_timer is not protected from such concurrent accesses and all kinds of explosions happen. Since only root can write to those sysfs variables, the issue is not a big deal security-wise. However, concurrent writes to these configuration variables is void of reason so the proper thing to do is to serialize the access with a mutex. Boris: - Make store_int_with_restart() use device_store_ulong() to filter out negative intervals - Limit min interval to 1 second - Correct locking - Massage commit message Signed-off-by: Seunghun Han <kkamagui@gmail.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20180302202706.9434-1-kkamagui@gmail.com
| * | | x86/MCE: Save microcode revision in machine check recordsTony Luck2018-03-082-1/+4
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Updating microcode used to be relatively rare. Now that it has become more common we should save the microcode version in a machine check record to make sure that those people looking at the error have this important information bundled with the rest of the logged information. [ Borislav: Simplify a bit. ] Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Yazen Ghannam <yazen.ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20180301233449.24311-1-tony.luck@intel.com
* | | Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds2018-03-1113-17/+65
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Thomas Gleixner: "Another set of perf updates: - Fix a Skylake Uncore event format declaration - Prevent perf pipe mode from crahsing which was caused by a missing buffer allocation - Make the perf top popup message which tells the user that it uses fallback mode on older kernels a debug message. - Make perf context rescheduling work correcctly - Robustify the jump error drawing in perf browser mode so it does not try to create references to NULL initialized offset entries - Make trigger_on() robust so it does not enable the trigger before everything is set up correctly to handle it - Make perf auxtrace respect the --no-itrace option so it does not try to queue AUX data for decoding. - Prevent having different number of field separators in CVS output lines when a counter is not supported. - Make the perf kallsyms man page usage behave like it does for all other perf commands. - Synchronize the kernel headers" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/core: Fix ctx_event_type in ctx_resched() perf tools: Fix trigger class trigger_on() perf auxtrace: Prevent decoding when --no-itrace perf stat: Fix CVS output format for non-supported counters tools headers: Sync x86's cpufeatures.h tools headers: Sync copy of kvm UAPI headers perf record: Fix crash in pipe mode perf annotate browser: Be more robust when drawing jump arrows perf top: Fix annoying fallback message on older kernels perf kallsyms: Fix the usage on the man page perf/x86/intel/uncore: Fix Skylake UPI event format
| * | | perf/core: Fix ctx_event_type in ctx_resched()Song Liu2018-03-091-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In ctx_resched(), EVENT_FLEXIBLE should be sched_out when EVENT_PINNED is added. However, ctx_resched() calculates ctx_event_type before checking this condition. As a result, pinned events will NOT get higher priority than flexible events. The following shows this issue on an Intel CPU (where ref-cycles can only use one hardware counter). 1. First start: perf stat -C 0 -e ref-cycles -I 1000 2. Then, in the second console, run: perf stat -C 0 -e ref-cycles:D -I 1000 The second perf uses pinned events, which is expected to have higher priority. However, because it failed in ctx_resched(). It is never run. This patch fixes this by calculating ctx_event_type after re-evaluating event_type. Reported-by: Ephraim Park <ephiepark@fb.com> Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <jolsa@redhat.com> Cc: <kernel-team@fb.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Fixes: 487f05e18aa4 ("perf/core: Optimize event rescheduling on active contexts") Link: http://lkml.kernel.org/r/20180306055504.3283731-1-songliubraving@fb.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | Merge tag 'perf-urgent-for-mingo-4.16-20180306' of ↵Ingo Molnar2018-03-0711-15/+61
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/urgent fixes from Arnaldo Carvalho de Melo: - Be more robust when drawing arrows in the annotation TUI, avoiding a segfault when jump instructions have as a target addresses in functions other that the one currently being annotated. The full fix will come in the following days, when jumping to other functions will work as call instructions (Arnaldo Carvalho de Melo) - Prevent auxtrace_queues__process_index() from queuing AUX area data for decoding when the --no-itrace option has been used (Adrian Hunter) - Sync copy of kvm UAPI headers and x86's cpufeatures.h (Arnaldo Carvalho de Melo) - Fix 'perf stat' CSV output format for non-supported counters (Ilya Pronin) - Fix crash in 'perf record|perf report' pipe mode (Jiri Olsa) - Fix annoying 'perf top' overwrite fallback message on older kernels (Kan Liang) - Fix the usage on the 'perf kallsyms' man page (Sangwon Hong) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>