summaryrefslogtreecommitdiffstats
path: root/tools (follow)
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'docs-5.9' of git://git.lwn.net/linuxLinus Torvalds2020-08-051-1/+1
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull documentation updates from Jonathan Corbet: "It's been a busy cycle for documentation - hopefully the busiest for a while to come. Changes include: - Some new Chinese translations - Progress on the battle against double words words and non-HTTPS URLs - Some block-mq documentation - More RST conversions from Mauro. At this point, that task is essentially complete, so we shouldn't see this kind of churn again for a while. Unless we decide to switch to asciidoc or something...:) - Lots of typo fixes, warning fixes, and more" * tag 'docs-5.9' of git://git.lwn.net/linux: (195 commits) scripts/kernel-doc: optionally treat warnings as errors docs: ia64: correct typo mailmap: add entry for <alobakin@marvell.com> doc/zh_CN: add cpu-load Chinese version Documentation/admin-guide: tainted-kernels: fix spelling mistake MAINTAINERS: adjust kprobes.rst entry to new location devices.txt: document rfkill allocation PCI: correct flag name docs: filesystems: vfs: correct flag name docs: filesystems: vfs: correct sync_mode flag names docs: path-lookup: markup fixes for emphasis docs: path-lookup: more markup fixes docs: path-lookup: fix HTML entity mojibake CREDITS: Replace HTTP links with HTTPS ones docs: process: Add an example for creating a fixes tag doc/zh_CN: add Chinese translation prefer section doc/zh_CN: add clearing-warn-once Chinese version doc/zh_CN: add admin-guide index doc:it_IT: process: coding-style.rst: Correct __maybe_unused compiler label futex: MAINTAINERS: Re-add selftests directory ...
| * selftests/vm/keys: fix a broken reference at protection_keys.cMauro Carvalho Chehab2020-06-261-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Changeset 1eecbcdca2bd ("docs: move protection-keys.rst to the core-api book") from Jun 7, 2019 converted protection-keys.txt file to ReST. A recent change at protection_keys.c partially reverted such changeset, causing it to point to a non-existing file: - * Tests x86 Memory Protection Keys (see Documentation/core-api/protection-keys.rst) + * Tests Memory Protection Keys (see Documentation/vm/protection-keys.txt) It sounds to me that the changeset that introduced such change 4645e3563673 ("selftests/vm/pkeys: rename all references to pkru to a generic name") could also have other side effects, as it sounds that it was not generated against uptream code, but, instead, against a version older than Jun 7, 2019. Fixes: 4645e3563673 ("selftests/vm/pkeys: rename all references to pkru to a generic name") Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Link: https://lore.kernel.org/r/cf65aa052669f55b9dc976a5c8026aef5840741d.1592895969.git.mchehab+huawei@kernel.org Signed-off-by: Jonathan Corbet <corbet@lwn.net>
* | Merge tag 'x86-fsgsbase-2020-08-04' of ↵Linus Torvalds2020-08-054-6/+295
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fsgsbase from Thomas Gleixner: "Support for FSGSBASE. Almost 5 years after the first RFC to support it, this has been brought into a shape which is maintainable and actually works. This final version was done by Sasha Levin who took it up after Intel dropped the ball. Sasha discovered that the SGX (sic!) offerings out there ship rogue kernel modules enabling FSGSBASE behind the kernels back which opens an instantanious unpriviledged root hole. The FSGSBASE instructions provide a considerable speedup of the context switch path and enable user space to write GSBASE without kernel interaction. This enablement requires careful handling of the exception entries which go through the paranoid entry path as they can no longer rely on the assumption that user GSBASE is positive (as enforced via prctl() on non FSGSBASE enabled systemn). All other entries (syscalls, interrupts and exceptions) can still just utilize SWAPGS unconditionally when the entry comes from user space. Converting these entries to use FSGSBASE has no benefit as SWAPGS is only marginally slower than WRGSBASE and locating and retrieving the kernel GSBASE value is not a free operation either. The real benefit of RD/WRGSBASE is the avoidance of the MSR reads and writes. The changes come with appropriate selftests and have held up in field testing against the (sanitized) Graphene-SGX driver" * tag 'x86-fsgsbase-2020-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits) x86/fsgsbase: Fix Xen PV support x86/ptrace: Fix 32-bit PTRACE_SETREGS vs fsbase and gsbase selftests/x86/fsgsbase: Add a missing memory constraint selftests/x86/fsgsbase: Fix a comment in the ptrace_write_gsbase test selftests/x86: Add a syscall_arg_fault_64 test for negative GSBASE selftests/x86/fsgsbase: Test ptracer-induced GS base write with FSGSBASE selftests/x86/fsgsbase: Test GS selector on ptracer-induced GS base write Documentation/x86/64: Add documentation for GS/FS addressing mode x86/elf: Enumerate kernel FSGSBASE capability in AT_HWCAP2 x86/cpu: Enable FSGSBASE on 64bit by default and add a chicken bit x86/entry/64: Handle FSGSBASE enabled paranoid entry/exit x86/entry/64: Introduce the FIND_PERCPU_BASE macro x86/entry/64: Switch CR3 before SWAPGS in paranoid entry x86/speculation/swapgs: Check FSGSBASE in enabling SWAPGS mitigation x86/process/64: Use FSGSBASE instructions on thread copy and ptrace x86/process/64: Use FSBSBASE in switch_to() if available x86/process/64: Make save_fsgs_for_kvm() ready for FSGSBASE x86/fsgsbase/64: Enable FSGSBASE instructions in helper functions x86/fsgsbase/64: Add intrinsics for FSGSBASE instructions x86/cpu: Add 'unsafe_fsgsbase' to enable CR4.FSGSBASE ...
| * | x86/ptrace: Fix 32-bit PTRACE_SETREGS vs fsbase and gsbaseAndy Lutomirski2020-07-012-1/+246
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Debuggers expect that doing PTRACE_GETREGS, then poking at a tracee and maybe letting it run for a while, then doing PTRACE_SETREGS will put the tracee back where it was. In the specific case of a 32-bit tracer and tracee, the PTRACE_GETREGS/SETREGS data structure doesn't have fs_base or gs_base fields, so FSBASE and GSBASE fields are never stored anywhere. Everything used to still work because nonzero FS or GS would result full reloads of the segment registers when the tracee resumes, and the bases associated with FS==0 or GS==0 are irrelevant to 32-bit code. Adding FSGSBASE support broke this: when FSGSBASE is enabled, FSBASE and GSBASE are now restored independently of FS and GS for all tasks when context-switched in. This means that, if a 32-bit tracer restores a previous state using PTRACE_SETREGS but the tracee's pre-restore and post-restore bases don't match, then the tracee is resumed with the wrong base. Fix it by explicitly loading the base when a 32-bit tracer pokes FS or GS on a 64-bit kernel. Also add a test case. Fixes: 673903495c85 ("x86/process/64: Use FSBSBASE in switch_to() if available") Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/229cc6a50ecbb701abd50fe4ddaf0eda888898cd.1593192140.git.luto@kernel.org
| * | selftests/x86/fsgsbase: Add a missing memory constraintAndy Lutomirski2020-07-011-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The manual call to set_thread_area() via int $0x80 was missing any indication that the descriptor was a pointer, causing gcc to occasionally generate wrong code. Add the missing constraint. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/432968af67259ca92d68b774a731aff468eae610.1593192140.git.luto@kernel.org
| * | selftests/x86/fsgsbase: Fix a comment in the ptrace_write_gsbase testAndy Lutomirski2020-07-011-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A comment was unclear. Fix it. Fixes: 5e7ec8578fa3 ("selftests/x86/fsgsbase: Test ptracer-induced GS base write with FSGSBASE") Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/901034a91a40169ec84f1f699ea86704dff762e4.1593192140.git.luto@kernel.org
| * | selftests/x86: Add a syscall_arg_fault_64 test for negative GSBASEAndy Lutomirski2020-06-221-0/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the kernel erroneously allows WRGSBASE and user code writes a negative value, paranoid_entry will get confused. Check for this by writing a negative value to GSBASE and doing SYSENTER with TF set. A successful run looks like: [RUN] SYSENTER with TF, invalid state, and GSBASE < 0 [SKIP] Illegal instruction A failed run causes a kernel hang, and I believe it's because we double-fault and then get a never ending series of page faults and, when we exhaust the double fault stack we double fault again, starting the process over. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/f4f71efc91b9eae5e3dae21c9aee1c70cf5f370e.1590620529.git.luto@kernel.org
| * | selftests/x86/fsgsbase: Test ptracer-induced GS base write with FSGSBASEChang S. Bae2020-06-181-2/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This validates that GS selector and base are independently preserved in ptrace commands. Suggested-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20200528201402.1708239-17-sashal@kernel.org
| * | selftests/x86/fsgsbase: Test GS selector on ptracer-induced GS base writeChang S. Bae2020-06-181-6/+15
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | The test validates that the selector is not changed when a ptracer writes the ptracee's GS base. Originally-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20200528201402.1708239-16-sashal@kernel.org
* | Merge tag 'close-range-v5.9' of ↵Linus Torvalds2020-08-054-0/+236
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull close_range() implementation from Christian Brauner: "This adds the close_range() syscall. It allows to efficiently close a range of file descriptors up to all file descriptors of a calling task. This is coordinated with the FreeBSD folks which have copied our version of this syscall and in the meantime have already merged it in April 2019: https://reviews.freebsd.org/D21627 https://svnweb.freebsd.org/base?view=revision&revision=359836 The syscall originally came up in a discussion around the new mount API and making new file descriptor types cloexec by default. During this discussion, Al suggested the close_range() syscall. First, it helps to close all file descriptors of an exec()ing task. This can be done safely via (quoting Al's example from [1] verbatim): /* that exec is sensitive */ unshare(CLONE_FILES); /* we don't want anything past stderr here */ close_range(3, ~0U); execve(....); The code snippet above is one way of working around the problem that file descriptors are not cloexec by default. This is aggravated by the fact that we can't just switch them over without massively regressing userspace. For a whole class of programs having an in-kernel method of closing all file descriptors is very helpful (e.g. demons, service managers, programming language standard libraries, container managers etc.). Second, it allows userspace to avoid implementing closing all file descriptors by parsing through /proc/<pid>/fd/* and calling close() on each file descriptor and other hacks. From looking at various large(ish) userspace code bases this or similar patterns are very common in service managers, container runtimes, and programming language runtimes/standard libraries such as Python or Rust. In addition, the syscall will also work for tasks that do not have procfs mounted and on kernels that do not have procfs support compiled in. In such situations the only way to make sure that all file descriptors are closed is to call close() on each file descriptor up to UINT_MAX or RLIMIT_NOFILE, OPEN_MAX trickery. Based on Linus' suggestion close_range() also comes with a new flag CLOSE_RANGE_UNSHARE to more elegantly handle file descriptor dropping right before exec. This would usually be expressed in the sequence: unshare(CLONE_FILES); close_range(3, ~0U); as pointed out by Linus it might be desirable to have this be a part of close_range() itself under a new flag CLOSE_RANGE_UNSHARE which gets especially handy when we're closing all file descriptors above a certain threshold. Test-suite as always included" * tag 'close-range-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: tests: add CLOSE_RANGE_UNSHARE tests close_range: add CLOSE_RANGE_UNSHARE tests: add close_range() tests arch: wire-up close_range() open: add close_range()
| * | tests: add CLOSE_RANGE_UNSHARE testsChristian Brauner2020-06-171-0/+137
| | | | | | | | | | | | Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
| * | tests: add close_range() testsChristian Brauner2020-06-174-0/+99
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds basic tests for the new close_range() syscall. - test that no invalid flags can be passed - test that a range of file descriptors is correctly closed - test that a range of file descriptors is correctly closed if there there are already closed file descriptors in the range - test that max_fd is correctly capped to the current fdtable maximum Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Jann Horn <jannh@google.com> Cc: David Howells <dhowells@redhat.com> Cc: Dmitry V. Levin <ldv@altlinux.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Florian Weimer <fweimer@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Cc: linux-api@vger.kernel.org Cc: linux-kselftest@vger.kernel.org
* | Merge tag 'cap-checkpoint-restore-v5.9' of ↵Linus Torvalds2020-08-053-1/+186
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull checkpoint-restore updates from Christian Brauner: "This enables unprivileged checkpoint/restore of processes. Given that this work has been going on for quite some time the first sentence in this summary is hopefully more exciting than the actual final code changes required. Unprivileged checkpoint/restore has seen a frequent increase in interest over the last two years and has thus been one of the main topics for the combined containers & checkpoint/restore microconference since at least 2018 (cf. [1]). Here are just the three most frequent use-cases that were brought forward: - The JVM developers are integrating checkpoint/restore into a Java VM to significantly decrease the startup time. - In high-performance computing environment a resource manager will typically be distributing jobs where users are always running as non-root. Long-running and "large" processes with significant startup times are supposed to be checkpointed and restored with CRIU. - Container migration as a non-root user. In all of these scenarios it is either desirable or required to run without CAP_SYS_ADMIN. The userspace implementation of checkpoint/restore CRIU already has the pull request for supporting unprivileged checkpoint/restore up (cf. [2]). To enable unprivileged checkpoint/restore a new dedicated capability CAP_CHECKPOINT_RESTORE is introduced. This solution has last been discussed in 2019 in a talk by Google at Linux Plumbers (cf. [1] "Update on Task Migration at Google Using CRIU") with Adrian and Nicolas providing the implementation now over the last months. In essence, this allows the CRIU binary to be installed with the CAP_CHECKPOINT_RESTORE vfs capability set thereby enabling unprivileged users to restore processes. To make this possible the following permissions are altered: - Selecting a specific PID via clone3() set_tid relaxed from userns CAP_SYS_ADMIN to CAP_CHECKPOINT_RESTORE. - Selecting a specific PID via /proc/sys/kernel/ns_last_pid relaxed from userns CAP_SYS_ADMIN to CAP_CHECKPOINT_RESTORE. - Accessing /proc/pid/map_files relaxed from init userns CAP_SYS_ADMIN to init userns CAP_CHECKPOINT_RESTORE. - Changing /proc/self/exe from userns CAP_SYS_ADMIN to userns CAP_CHECKPOINT_RESTORE. Of these four changes the /proc/self/exe change deserves a few words because the reasoning behind even restricting /proc/self/exe changes in the first place is just full of historical quirks and tracking this down was a questionable version of fun that I'd like to spare others. In short, it is trivial to change /proc/self/exe as an unprivileged user, i.e. without userns CAP_SYS_ADMIN right now. Either via ptrace() or by simply intercepting the elf loader in userspace during exec. Nicolas was nice enough to even provide a POC for the latter (cf. [3]) to illustrate this fact. The original patchset which introduced PR_SET_MM_MAP had no permissions around changing the exe link. They too argued that it is trivial to spoof the exe link already which is true. The argument brought up against this was that the Tomoyo LSM uses the exe link in tomoyo_manager() to detect whether the calling process is a policy manager. This caused changing the exe links to be guarded by userns CAP_SYS_ADMIN. All in all this rather seems like a "better guard it with something rather than nothing" argument which imho doesn't qualify as a great security policy. Again, because spoofing the exe link is possible for the calling process so even if this were security relevant it was broken back then and would be broken today. So technically, dropping all permissions around changing the exe link would probably be possible and would send a clearer message to any userspace that relies on /proc/self/exe for security reasons that they should stop doing this but for now we're only relaxing the exe link permissions from userns CAP_SYS_ADMIN to userns CAP_CHECKPOINT_RESTORE. There's a final uapi change in here. Changing the exe link used to accidently return EINVAL when the caller lacked the necessary permissions instead of the more correct EPERM. This pr contains a commit fixing this. I assume that userspace won't notice or care and if they do I will revert this commit. But since we are changing the permissions anyway it seems like a good opportunity to try this fix. With these changes merged unprivileged checkpoint/restore will be possible and has already been tested by various users" [1] LPC 2018 1. "Task Migration at Google Using CRIU" https://www.youtube.com/watch?v=yI_1cuhoDgA&t=12095 2. "Securely Migrating Untrusted Workloads with CRIU" https://www.youtube.com/watch?v=yI_1cuhoDgA&t=14400 LPC 2019 1. "CRIU and the PID dance" https://www.youtube.com/watch?v=LN2CUgp8deo&list=PLVsQ_xZBEyN30ZA3Pc9MZMFzdjwyz26dO&index=9&t=2m48s 2. "Update on Task Migration at Google Using CRIU" https://www.youtube.com/watch?v=LN2CUgp8deo&list=PLVsQ_xZBEyN30ZA3Pc9MZMFzdjwyz26dO&index=9&t=1h2m8s [2] https://github.com/checkpoint-restore/criu/pull/1155 [3] https://github.com/nviennot/run_as_exe * tag 'cap-checkpoint-restore-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: selftests: add clone3() CAP_CHECKPOINT_RESTORE test prctl: exe link permission error changed from -EINVAL to -EPERM prctl: Allow local CAP_CHECKPOINT_RESTORE to change /proc/self/exe proc: allow access in init userns for map_files with CAP_CHECKPOINT_RESTORE pid_namespace: use checkpoint_restore_ns_capable() for ns_last_pid pid: use checkpoint_restore_ns_capable() for set_tid capabilities: Introduce CAP_CHECKPOINT_RESTORE
| * | selftests: add clone3() CAP_CHECKPOINT_RESTORE testAdrian Reber2020-07-203-1/+186
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds a test that changes its UID, uses capabilities to get CAP_CHECKPOINT_RESTORE and uses clone3() with set_tid to create a process with a given PID as non-root. Signed-off-by: Adrian Reber <areber@redhat.com> Link: https://lore.kernel.org/r/20200719100418.2112740-8-areber@redhat.com [christian.brauner@ubuntu.com: use TH_LOG() instead of ksft_print_msg()] Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
* | | Merge tag 'threads-v5.9' of ↵Linus Torvalds2020-08-042-0/+80
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull thread updates from Christian Brauner: "This contains the changes to add the missing support for attaching to time namespaces via pidfds. Last cycle setns() was changed to support attaching to multiple namespaces atomically. This requires all namespaces to have a point of no return where they can't fail anymore. Specifically, <namespace-type>_install() is allowed to perform permission checks and install the namespace into the new struct nsset that it has been given but it is not allowed to make visible changes to the affected task. Once <namespace-type>_install() returns, anything that the given namespace type additionally requires to be setup needs to ideally be done in a function that can't fail or if it fails the failure must be non-fatal. For time namespaces the relevant functions that fell into this category were timens_set_vvar_page() and vdso_join_timens(). The latter could still fail although it didn't need to. This function is only implemented for vdso_join_timens() in current mainline. As discussed on-list (cf. [1]), in order to make setns() support time namespaces when attaching to multiple namespaces at once properly we changed vdso_join_timens() to always succeed. So vdso_join_timens() replaces the mmap_write_lock_killable() with mmap_read_lock(). Please note that arm is about to grow vdso support for time namespaces (possibly this merge window). We've synced on this change and arm64 also uses mmap_read_lock(), i.e. makes vdso_join_timens() a function that can't fail. Once the changes here and the arm64 changes have landed, vdso_join_timens() should be turned into a void function so it's obvious to callers and implementers on other architectures that the expectation is that it can't fail. We didn't do this right away because it would've introduced unnecessary merge conflicts between the two trees for no major gain. As always, tests included" [1]: https://lore.kernel.org/lkml/20200611110221.pgd3r5qkjrjmfqa2@wittgenstein * tag 'threads-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: tests: add CLONE_NEWTIME setns tests nsproxy: support CLONE_NEWTIME with setns() timens: add timens_commit() helper timens: make vdso_join_timens() always succeed
| * | | tests: add CLONE_NEWTIME setns testsChristian Brauner2020-07-082-0/+80
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that pidfds support CLONE_NEWTIME as well enable testing them in the setns() testuite. Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Cc: Serge Hallyn <serge@hallyn.com> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Dmitry Safonov <dima@arista.com> Cc: Andrei Vagin <avagin@gmail.com> Link: https://lore.kernel.org/r/20200706154912.3248030-5-christian.brauner@ubuntu.com
* | | | Merge tag 'seccomp-v5.9-rc1' of ↵Linus Torvalds2020-08-045-232/+573
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull seccomp updates from Kees Cook: "There are a bunch of clean ups and selftest improvements along with two major updates to the SECCOMP_RET_USER_NOTIF filter return: EPOLLHUP support to more easily detect the death of a monitored process, and being able to inject fds when intercepting syscalls that expect an fd-opening side-effect (needed by both container folks and Chrome). The latter continued the refactoring of __scm_install_fd() started by Christoph, and in the process found and fixed a handful of bugs in various callers. - Improved selftest coverage, timeouts, and reporting - Add EPOLLHUP support for SECCOMP_RET_USER_NOTIF (Christian Brauner) - Refactor __scm_install_fd() into __receive_fd() and fix buggy callers - Introduce 'addfd' command for SECCOMP_RET_USER_NOTIF (Sargun Dhillon)" * tag 'seccomp-v5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (30 commits) selftests/seccomp: Test SECCOMP_IOCTL_NOTIF_ADDFD seccomp: Introduce addfd ioctl to seccomp user notifier fs: Expand __receive_fd() to accept existing fd pidfd: Replace open-coded receive_fd() fs: Add receive_fd() wrapper for __receive_fd() fs: Move __scm_install_fd() to __receive_fd() net/scm: Regularize compat handling of scm_detach_fds() pidfd: Add missing sock updates for pidfd_getfd() net/compat: Add missing sock updates for SCM_RIGHTS selftests/seccomp: Check ENOSYS under tracing selftests/seccomp: Refactor to use fixture variants selftests/harness: Clean up kern-doc for fixtures seccomp: Use -1 marker for end of mode 1 syscall list seccomp: Fix ioctl number for SECCOMP_IOCTL_NOTIF_ID_VALID selftests/seccomp: Rename user_trap_syscall() to user_notif_syscall() selftests/seccomp: Make kcmp() less required seccomp: Use pr_fmt selftests/seccomp: Improve calibration loop selftests/seccomp: use 90s as timeout selftests/seccomp: Expand benchmark to per-filter measurements ...
| * | | | selftests/seccomp: Test SECCOMP_IOCTL_NOTIF_ADDFDSargun Dhillon2020-07-151-0/+229
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Test whether we can add file descriptors in response to notifications. This injects the file descriptors via notifications, and then uses kcmp to determine whether or not it has been successful. It also includes some basic sanity checking for arguments. Signed-off-by: Sargun Dhillon <sargun@sargun.me> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Chris Palmer <palmer@google.com> Cc: Christian Brauner <christian.brauner@ubuntu.com> Cc: Jann Horn <jannh@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: Robert Sesek <rsesek@google.com> Cc: Tycho Andersen <tycho@tycho.ws> Cc: Matt Denton <mpdenton@google.com> Cc: linux-fsdevel@vger.kernel.org Cc: linux-kernel@vger.kernel.org Link: https://lore.kernel.org/r/20200603011044.7972-5-sargun@sargun.me Co-developed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Kees Cook <keescook@chromium.org>
| * | | | selftests/seccomp: Check ENOSYS under tracingKees Cook2020-07-111-0/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There should be no difference between -1 and other negative syscalls while tracing. Cc: Keno Fischer <keno@juliacomputing.com> Tested-by: Will Deacon <will@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org>
| * | | | selftests/seccomp: Refactor to use fixture variantsKees Cook2020-07-111-157/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that the selftest harness has variants, use them to eliminate a bunch of copy/paste duplication. Reviewed-by: Jakub Kicinski <kuba@kernel.org> Tested-by: Will Deacon <will@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org>
| * | | | selftests/harness: Clean up kern-doc for fixturesKees Cook2020-07-111-7/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The FIXTURE*() macro kern-doc examples had the wrong names for the C code examples associated with them. Fix those and clarify that FIXTURE_DATA() usage should be avoided. Cc: Shuah Khan <shuah@kernel.org> Fixes: 74bc7c97fa88 ("kselftest: add fixture variants") Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org>
| * | | | seccomp: Fix ioctl number for SECCOMP_IOCTL_NOTIF_ID_VALIDKees Cook2020-07-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When SECCOMP_IOCTL_NOTIF_ID_VALID was first introduced it had the wrong direction flag set. While this isn't a big deal as nothing currently enforces these bits in the kernel, it should be defined correctly. Fix the define and provide support for the old command until it is no longer needed for backward compatibility. Fixes: 6a21cc50f0c7 ("seccomp: add a return code to trap to userspace") Signed-off-by: Kees Cook <keescook@chromium.org>
| * | | | selftests/seccomp: Rename user_trap_syscall() to user_notif_syscall()Kees Cook2020-07-111-23/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The user_trap_syscall() helper creates a filter with SECCOMP_RET_USER_NOTIF. To avoid confusion with SECCOMP_RET_TRAP, rename the helper to user_notif_syscall(). Cc: Andy Lutomirski <luto@amacapital.net> Cc: Will Drewry <wad@chromium.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Song Liu <songliubraving@fb.com> Cc: Yonghong Song <yhs@fb.com> Cc: Andrii Nakryiko <andriin@fb.com> Cc: John Fastabend <john.fastabend@gmail.com> Cc: KP Singh <kpsingh@chromium.org> Cc: linux-kselftest@vger.kernel.org Cc: netdev@vger.kernel.org Cc: bpf@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org>
| * | | | selftests/seccomp: Make kcmp() less requiredKees Cook2020-07-111-20/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The seccomp tests are a bit noisy without CONFIG_CHECKPOINT_RESTORE (due to missing the kcmp() syscall). The seccomp tests are more accurate with kcmp(), but it's not strictly required. Refactor the tests to use alternatives (comparing fd numbers), and provide a central test for kcmp() so there is a single SKIP instead of many. Continue to produce warnings for the other tests, though. Additionally adds some more bad flag EINVAL tests to the addfd selftest. Cc: Andy Lutomirski <luto@amacapital.net> Cc: Will Drewry <wad@chromium.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Song Liu <songliubraving@fb.com> Cc: Yonghong Song <yhs@fb.com> Cc: Andrii Nakryiko <andriin@fb.com> Cc: John Fastabend <john.fastabend@gmail.com> Cc: KP Singh <kpsingh@chromium.org> Cc: linux-kselftest@vger.kernel.org Cc: netdev@vger.kernel.org Cc: bpf@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org>
| * | | | selftests/seccomp: Improve calibration loopKees Cook2020-07-111-18/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The seccomp benchmark calibration loop did not need to take so long. Instead, use a simple 1 second timeout and multiply up to target. It does not need to be accurate. Signed-off-by: Kees Cook <keescook@chromium.org>
| * | | | selftests/seccomp: use 90s as timeoutThadeu Lima de Souza Cascardo2020-07-111-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As seccomp_benchmark tries to calibrate how many samples will take more than 5 seconds to execute, it may end up picking up a number of samples that take 10 (but up to 12) seconds. As the calibration will take double that time, it takes around 20 seconds. Then, it executes the whole thing again, and then once more, with some added overhead. So, the thing might take more than 40 seconds, which is too close to the 45s timeout. That is very dependent on the system where it's executed, so may not be observed always, but it has been observed on x86 VMs. Using a 90s timeout seems safe enough. Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Link: https://lore.kernel.org/r/20200601123202.1183526-1-cascardo@canonical.com Signed-off-by: Kees Cook <keescook@chromium.org>
| * | | | selftests/seccomp: Expand benchmark to per-filter measurementsKees Cook2020-07-112-9/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It's useful to see how much (at a minimum) each filter adds to the syscall overhead. Add additional calculations. Signed-off-by: Kees Cook <keescook@chromium.org>
| * | | | selftests/seccomp: Check for EPOLLHUP for user_notifChristian Brauner2020-07-111-0/+136
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This verifies we're correctly notified when a seccomp filter becomes unused when a notifier is in use. Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Link: https://lore.kernel.org/r/20200531115031.391515-4-christian.brauner@ubuntu.com Signed-off-by: Kees Cook <keescook@chromium.org>
| * | | | selftests/seccomp: Set NNP for TSYNC ESRCH flag testKees Cook2020-07-111-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The TSYNC ESRCH flag test will fail for regular users because NNP was not set yet. Add NNP setting. Fixes: 51891498f2da ("seccomp: allow TSYNC and USER_NOTIF together") Cc: stable@vger.kernel.org Reviewed-by: Tycho Andersen <tycho@tycho.ws> Signed-off-by: Kees Cook <keescook@chromium.org>
| * | | | selftests/seccomp: Add SKIPs for failed unshare()Kees Cook2020-07-112-2/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Running the seccomp tests as a regular user shouldn't just fail tests that require CAP_SYS_ADMIN (for getting a PID namespace). Instead, detect those cases and SKIP them. Additionally, gracefully SKIP missing CONFIG_USER_NS (and add to "config" since we'd prefer to actually test this case). Signed-off-by: Kees Cook <keescook@chromium.org>
| * | | | selftests/seccomp: Rename XFAIL to SKIPKees Cook2020-07-111-4/+9
| | |_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | The kselftests will be renaming XFAIL to SKIP in the test harness, and to avoid painful conflicts, rename XFAIL to SKIP now in a future-proofed way. Signed-off-by: Kees Cook <keescook@chromium.org>
* | | | Merge tag 'uninit-macro-v5.9-rc1' of ↵Linus Torvalds2020-08-042-4/+0
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull uninitialized_var() macro removal from Kees Cook: "This is long overdue, and has hidden too many bugs over the years. The series has several "by hand" fixes, and then a trivial treewide replacement. - Clean up non-trivial uses of uninitialized_var() - Update documentation and checkpatch for uninitialized_var() removal - Treewide removal of uninitialized_var()" * tag 'uninit-macro-v5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: compiler: Remove uninitialized_var() macro treewide: Remove uninitialized_var() usage checkpatch: Remove awareness of uninitialized_var() macro mm/debug_vm_pgtable: Remove uninitialized_var() usage f2fs: Eliminate usage of uninitialized_var() macro media: sur40: Remove uninitialized_var() usage KVM: PPC: Book3S PR: Remove uninitialized_var() usage clk: spear: Remove uninitialized_var() usage clk: st: Remove uninitialized_var() usage spi: davinci: Remove uninitialized_var() usage ide: Remove uninitialized_var() usage rtlwifi: rtl8192cu: Remove uninitialized_var() usage b43: Remove uninitialized_var() usage drbd: Remove uninitialized_var() usage x86/mm/numa: Remove uninitialized_var() usage docs: deprecated.rst: Add uninitialized_var()
| * | | | compiler: Remove uninitialized_var() macroKees Cook2020-07-162-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Using uninitialized_var() is dangerous as it papers over real bugs[1] (or can in the future), and suppresses unrelated compiler warnings (e.g. "unused variable"). If the compiler thinks it is uninitialized, either simply initialize the variable or make compiler changes. As recommended[2] by[3] Linus[4], remove the macro. With the recent change to disable -Wmaybe-uninitialized in v5.7 in commit 78a5255ffb6a ("Stop the ad-hoc games with -Wno-maybe-initialized"), this is likely the best time to make this treewide change. [1] https://lore.kernel.org/lkml/20200603174714.192027-1-glider@google.com/ [2] https://lore.kernel.org/lkml/CA+55aFw+Vbj0i=1TGqCR5vQkCzWJ0QxK6CernOU6eedsudAixw@mail.gmail.com/ [3] https://lore.kernel.org/lkml/CA+55aFwgbgqhbp1fkxvRKEpzyR5J8n1vKT1VZdz9knmPuXhOeg@mail.gmail.com/ [4] https://lore.kernel.org/lkml/CA+55aFz2500WfbKXAx8s67wrm9=yVJu65TpLgN_ybYNv0VEOKA@mail.gmail.com/ Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Reviewed-by: Bart van Assche <bvanassche@acm.org> Reviewed-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> Tested-by: Nathan Chancellor <natechancellor@gmail.com> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org>
* | | | | Merge tag 'acpi-5.9-rc1' of ↵Linus Torvalds2020-08-041-1/+1
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI updates from Rafael Wysocki: "These eliminate significant AML processing overhead related to using operation regions in system memory, update the ACPICA code in the kernel to upstream revision 20200717 (including a fix to prevent operation region reference counts from overflowing in some cases), remove the last bits of the (long deprecated) ACPI procfs interface and do some assorted cleanups. Specifics: - Eliminate significant AML processing overhead related to using operation regions in system memory by reworking the management of memory mappings in the ACPI code to defer unmap operations (to do them outside of the ACPICA locks, among other things) and making the memory operation reagion handler avoid releasing memory mappings created by it too early (Rafael Wysocki). - Update the ACPICA code in the kernel to upstream revision 20200717: * Prevent operation region reference counts from overflowing in some cases (Erik Kaneda). * Replace one-element array with flexible-array (Gustavo A. R. Silva). - Fix ACPI PCI hotplug reference counting (Rafael Wysocki). - Drop last bits of the ACPI procfs interface (Thomas Renninger). - Drop some redundant checks from the code parsing ACPI tables related to NUMA (Hanjun Guo). - Avoid redundant object evaluation in the ACPI device properties handling code (Heikki Krogerus). - Avoid unecessary memory overhead related to storing the signatures of the ACPI tables recognized by the kernel (Ard Biesheuvel). - Add missing newline characters when printing module parameter values in some places (Xiongfeng Wang). - Update the link to the ACPI specifications in some places (Tiezhu Yang). - Use the fallthrough pseudo-keyword in the ACPI code (Gustavo A. R. Silva). - Drop redundant variable initialization from the APEI code (Colin Ian King). - Drop uninitialized_var() from the ACPI PAD driver (Jason Yan). - Replace HTTP links with HTTPS ones in the ACPI code (Alexander A. Klimov)" * tag 'acpi-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (22 commits) ACPI: APEI: remove redundant assignment to variable rc ACPI: NUMA: Remove the useless 'node >= MAX_NUMNODES' check ACPI: NUMA: Remove the useless sub table pointer check ACPI: tables: Remove the duplicated checks for acpi_parse_entries_array() ACPICA: Update version to 20200717 ACPICA: Do not increment operation_region reference counts for field units ACPICA: Replace one-element array with flexible-array ACPI: Replace HTTP links with HTTPS ones ACPI: Use valid link to the ACPI specification ACPI: OSL: Clean up the removal of unused memory mappings ACPI: OSL: Use deferred unmapping in acpi_os_unmap_iomem() ACPI: OSL: Use deferred unmapping in acpi_os_unmap_generic_address() ACPICA: Preserve memory opregion mappings ACPI: OSL: Implement deferred unmapping of ACPI memory ACPI: Use fallthrough pseudo-keyword PCI: hotplug: ACPI: Fix context refcounting in acpiphp_grab_context() ACPI: tables: avoid relocations for table signature array ACPI: PAD: Eliminate usage of uninitialized_var() macro ACPI: sysfs: add newlines when printing module parameters ACPI: EC: add newline when printing 'ec_event_clearing' module parameter ...
| | \ \ \ \
| | \ \ \ \
| | \ \ \ \
| | \ \ \ \
| *---. \ \ \ \ Merge branches 'acpi-mm', 'acpi-tables', 'acpi-apei' and 'acpi-misc'Rafael J. Wysocki2020-08-031-1/+1
| |\ \ \ \ \ \ \ | | | |_|/ / / / | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * acpi-mm: ACPI: OSL: Clean up the removal of unused memory mappings ACPI: OSL: Use deferred unmapping in acpi_os_unmap_iomem() ACPI: OSL: Use deferred unmapping in acpi_os_unmap_generic_address() ACPICA: Preserve memory opregion mappings ACPI: OSL: Implement deferred unmapping of ACPI memory * acpi-tables: ACPI: NUMA: Remove the useless 'node >= MAX_NUMNODES' check ACPI: NUMA: Remove the useless sub table pointer check ACPI: tables: Remove the duplicated checks for acpi_parse_entries_array() ACPI: tables: avoid relocations for table signature array * acpi-apei: ACPI: APEI: remove redundant assignment to variable rc * acpi-misc: ACPI: Replace HTTP links with HTTPS ones ACPI: Use valid link to the ACPI specification ACPI: Use fallthrough pseudo-keyword
| | | | * | | | ACPI: Use valid link to the ACPI specificationTiezhu Yang2020-07-271-1/+1
| | | | | |/ / | | | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, acpi.info is an invalid link to access ACPI specification, the new valid link is https://uefi.org/specifications. Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* | | | | | | Merge tag 'pm-5.9-rc1' of ↵Linus Torvalds2020-08-045-112/+159
|\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "The most significant change here is the extension of the Energy Model to cover non-CPU devices (as well as CPUs) from Lukasz Luba. There is also some new hardware support (Ice Lake server idle states table for intel_idle, Sapphire Rapids and Power Limit 4 support in the RAPL driver), some new functionality in the existing drivers (eg. a new switch to disable/enable CPU energy-efficiency optimizations in intel_pstate, delayed timers in devfreq), some assorted fixes (cpufreq core, intel_pstate, intel_idle) and cleanups (eg. cpuidle-psci, devfreq), including the elimination of W=1 build warnings from cpufreq done by Lee Jones. Specifics: - Make the Energy Model cover non-CPU devices (Lukasz Luba). - Add Ice Lake server idle states table to the intel_idle driver and eliminate a redundant static variable from it (Chen Yu, Rafael Wysocki). - Eliminate all W=1 build warnings from cpufreq (Lee Jones). - Add support for Sapphire Rapids and for Power Limit 4 to the Intel RAPL power capping driver (Sumeet Pawnikar, Zhang Rui). - Fix function name in kerneldoc comments in the idle_inject power capping driver (Yangtao Li). - Fix locking issues with cpufreq governors and drop a redundant "weak" function definition from cpufreq (Viresh Kumar). - Rearrange cpufreq to register non-modular governors at the core_initcall level and allow the default cpufreq governor to be specified in the kernel command line (Quentin Perret). - Extend, fix and clean up the intel_pstate driver (Srinivas Pandruvada, Rafael Wysocki): * Add a new sysfs attribute for disabling/enabling CPU energy-efficiency optimizations in the processor. * Make the driver avoid enabling HWP if EPP is not supported. * Allow the driver to handle numeric EPP values in the sysfs interface and fix the setting of EPP via sysfs in the active mode. * Eliminate a static checker warning and clean up a kerneldoc comment. - Clean up some variable declarations in the powernv cpufreq driver (Wei Yongjun). - Fix up the ->enter_s2idle callback definition to cover the case when it points to the same function as ->idle correctly (Neal Liu). - Rearrange and clean up the PSCI cpuidle driver (Ulf Hansson). - Make the PM core emit "changed" uevent when adding/removing the "wakeup" sysfs attribute of devices (Abhishek Pandit-Subedi). - Add a helper macro for declaring PM callbacks and use it in the MMC jz4740 driver (Paul Cercueil). - Fix white space in some places in the hibernate code and make the system-wide PM code use "const char *" where appropriate (Xiang Chen, Alexey Dobriyan). - Add one more "unsafe" helper macro to the freezer to cover the NFS use case (He Zhe). - Change the language in the generic PM domains framework to use parent/child terminology and clean up a typo and some comment fromatting in that code (Kees Cook, Geert Uytterhoeven). - Update the operating performance points OPP framework (Lukasz Luba, Andrew-sh.Cheng, Valdis Kletnieks): * Refactor dev_pm_opp_of_register_em() and update related drivers. * Add a missing function export. * Allow disabled OPPs in dev_pm_opp_get_freq(). - Update devfreq core and drivers (Chanwoo Choi, Lukasz Luba, Enric Balletbo i Serra, Dmitry Osipenko, Kieran Bingham, Marc Zyngier): * Add support for delayed timers to the devfreq core and make the Samsung exynos5422-dmc driver use it. * Unify sysfs interface to use "df-" as a prefix in instance names consistently. * Fix devfreq_summary debugfs node indentation. * Add the rockchip,pmu phandle to the rk3399_dmc driver DT bindings. * List Dmitry Osipenko as the Tegra devfreq driver maintainer. * Fix typos in the core devfreq code. - Update the pm-graph utility to version 5.7 including a number of fixes related to suspend-to-idle (Todd Brandt). - Fix coccicheck errors and warnings in the cpupower utility (Shuah Khan). - Replace HTTP links with HTTPs ones in multiple places (Alexander A. Klimov)" * tag 'pm-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (71 commits) cpuidle: ACPI: fix 'return' with no value build warning cpufreq: intel_pstate: Fix EPP setting via sysfs in active mode cpufreq: intel_pstate: Rearrange the storing of new EPP values intel_idle: Customize IceLake server support PM / devfreq: Fix the wrong end with semicolon PM / devfreq: Fix indentaion of devfreq_summary debugfs node PM / devfreq: Clean up the devfreq instance name in sysfs attr memory: samsung: exynos5422-dmc: Add module param to control IRQ mode memory: samsung: exynos5422-dmc: Adjust polling interval and uptreshold memory: samsung: exynos5422-dmc: Use delayed timer as default PM / devfreq: Add support delayed timer for polling mode dt-bindings: devfreq: rk3399_dmc: Add rockchip,pmu phandle PM / devfreq: tegra: Add Dmitry as a maintainer PM / devfreq: event: Fix trivial spelling PM / devfreq: rk3399_dmc: Fix kernel oops when rockchip,pmu is absent cpuidle: change enter_s2idle() prototype cpuidle: psci: Prevent domain idlestates until consumers are ready cpuidle: psci: Convert PM domain to platform driver cpuidle: psci: Fix error path via converting to a platform driver cpuidle: psci: Fail cpuidle registration if set OSI mode failed ...
| * | | | | | | pm-graph v5.7 - important s2idle fixesTodd Brandt2020-07-272-102/+149
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Important fixes: - in s2idle, use timekeeping_freeze trace mark instead of machine_suspend to denote entry into s2idle mode. - in s2idle, use machine_suspend trace mark to create a new virtual device called "s2idle_enter_<n>x". It denotes an s2idle_enter call loop of <n> iterations where s2idle was never actually achieved. It isn't counted as "freeze time" in the header. - in s2idle, only show multiple freeze times if s2idle went in and out of resume_noirq. Otherwise multiple freezes are shown with "waking" time subtracted (waking time is time spent outside s2idle dealing with wakeups). - in s2idle summaries, include "FREEZEWAKE" as an issue when at least 1ms is spent waking from s2idle. A clean run should only wake for the rtc timer. - add support for device callbacks with matching names in the same phase. In rare cases some devices register multiple callbacks from separate drivers using the same name. Without this fix only one is shown. - add kparamsfmt string back to fix bootgraph General updates: - when suspend_machine is missing, error says "failed in suspend_machine" - extract target count/time and add to summary title if -multi used - include any instances of "timeout" in dmesg as issues to be logged. - fix ftrace parse to handle any number of flags (instead of just 4). - remove sync/async_device string from device detail, remains in hover. - when using callgraph (-f) add driver name to callgraph titles. Signed-off-by: Todd Brandt <todd.e.brandt@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
| * | | | | | | Merge tag 'linux-cpupower-5.9-rc1' of ↵Rafael J. Wysocki2020-07-273-10/+10
| |\ \ \ \ \ \ \ | | |/ / / / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux Pull cpupower utility updates for v5.9 from Shuah Khan: "This cpupower update for Linux 5.9-rc1 consists of 2 fixes to coccicheck warnings and one change to replacing HTTP links with HTTPS ones." * tag 'linux-cpupower-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux: cpupower: Replace HTTP links with HTTPS ones cpupower: Fix NULL but dereferenced coccicheck errors cpupower: Fix comparing pointer to 0 coccicheck warns
| | * | | | | | cpupower: Replace HTTP links with HTTPS onesAlexander A. Klimov2020-07-171-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rationale: Reduces attack surface on kernel devs opening the links for MITM as HTTPS traffic is much harder to manipulate. Deterministic algorithm: For each file: If not .svg: For each line: If doesn't contain `\bxmlns\b`: For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`: If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`: If both the HTTP and HTTPS versions return 200 OK and serve the same content: Replace HTTP with HTTPS. Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
| | * | | | | | cpupower: Fix NULL but dereferenced coccicheck errorsShuah Khan2020-07-071-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix NULL but dereferenced coccicheck errors found by: make coccicheck MODE=report M=tools/power/cpupower tools/power/cpupower/lib/cpufreq.c:384:19-23: ERROR: first is NULL but dereferenced. tools/power/cpupower/lib/cpufreq.c:440:19-23: ERROR: first is NULL but dereferenced. tools/power/cpupower/lib/cpufreq.c:308:19-23: ERROR: first is NULL but dereferenced. tools/power/cpupower/lib/cpufreq.c:753:19-23: ERROR: first is NULL but dereferenced. Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
| | * | | | | | cpupower: Fix comparing pointer to 0 coccicheck warnsShuah Khan2020-07-071-3/+3
| | | |_|_|/ / | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix cocciccheck wanrns found by: make coccicheck MODE=report M=tools/power/cpupower/ tools/power/cpupower/utils/helpers/bitmask.c:29:12-13: WARNING comparing pointer to 0, suggest !E tools/power/cpupower/utils/helpers/bitmask.c:29:12-13: WARNING comparing pointer to 0 tools/power/cpupower/utils/helpers/bitmask.c:43:12-13: WARNING comparing pointer to 0 Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
* | | | | | | Merge tag 'platform-drivers-x86-v5.9-1' of ↵Linus Torvalds2020-08-041-20/+61
|\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.infradead.org/linux-platform-drivers-x86 Pull x86 platform driver updates from Andy Shevchenko: - ASUS WMI driver honors BAT1 name of the battery (quite a few new laptops are using it) - Dell WMI driver supports new key codes and backlight events - ThinkPad ACPI driver now may use standard charge threshold interface, it also has been updated to provide Laptop or Desktop mode to the user - Intel Speed Select Technology gained support on Sapphire Rapids platform - Regular update of Speed Select Technology tools - Mellanox has been updated to support complex attributes - PMC core driver has been fixed to show correct names for LPM0 register - HTTP links were replaced by HTTPS ones where it applies - Miscellaneous fixes and cleanups here and there * tag 'platform-drivers-x86-v5.9-1' of git://git.infradead.org/linux-platform-drivers-x86: (42 commits) platform/x86: asus-nb-wmi: Drop duplicate DMI quirk structures platform/x86: thinkpad_acpi: Make some symbols static platform/x86: thinkpad_acpi: add documentation for battery charge control platform/x86: thinkpad_acpi: use standard charge control attribute names platform/x86: thinkpad_acpi: remove unused defines platform/x86: ISST: drop a duplicated word in isst_if.h tools/power/x86/intel-speed-select: Update version for v5.9 tools/power/x86/intel-speed-select: Add retries for mail box commands tools/power/x86/intel-speed-select: Add option to delay mbox commands tools/power/x86/intel-speed-select: Ignore -o option processing on error tools/power/x86/intel-speed-select: Change path for caching topology info platform/x86: acerhdf: Replace HTTP links with HTTPS ones platform/x86: apple-gmux: Replace HTTP links with HTTPS ones platform/x86: pcengines-apuv2: revert wiring up simswitch GPIO as LED platform/x86: mlx-platform: Extend FAN platform data description platform_data/mlxreg: Add presence register field for FAN devices Documentation/ABI: Add new attribute for mlxreg-io sysfs interfaces platform/mellanox: mlxreg-io: Add support for complex attributes platform/x86: mlx-platform: Add more definitions for system attributes platform_data/mlxreg: Add support for complex attributes ...
| * | | | | | | tools/power/x86/intel-speed-select: Update version for v5.9Srinivas Pandruvada2020-07-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Update version for changes released with v5.9 kernel release. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
| * | | | | | | tools/power/x86/intel-speed-select: Add retries for mail box commandsSrinivas Pandruvada2020-07-161-16/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Retry mail box command on failure. The default retry count is 3. This can be changed by "-r|--retry" options. This helps during early bring up of platforms. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
| * | | | | | | tools/power/x86/intel-speed-select: Add option to delay mbox commandsSrinivas Pandruvada2020-07-161-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add option "p|--pause" to introduce delay between two mail box commands for test purpose. This delay can be specified in milliseconds. By default there is no delay between two mailbox commands. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
| * | | | | | | tools/power/x86/intel-speed-select: Ignore -o option processing on errorSrinivas Pandruvada2020-07-161-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When for some reason, CONFIG_TDP_GET_CORE_MASK mailbox command fails, then don't continue online/offline operation for perf-profile set-config-level with "-o" option. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
| * | | | | | | tools/power/x86/intel-speed-select: Change path for caching topology infoSrinivas Pandruvada2020-07-161-2/+4
| | |/ / / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We want to cache the topology info to a file, which is not preserved across boot cycle. The current storage in /tmp is getting preserved. So change the path from /tmp/isst_cpu_topology.dat to /var/run/isst_cpu_topology.dat. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
* | | | | | | Merge tag 'x86-fpu-2020-08-03' of ↵Linus Torvalds2020-08-045-0/+119
|\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 FPU selftest from Ingo Molnar: "Add the /sys/kernel/debug/selftest_helpers/test_fpu FPU self-test" * tag 'x86-fpu-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: selftests/fpu: Add an FPU selftest
| * | | | | | | selftests/fpu: Add an FPU selftestPetteri Aimonen2020-06-295-0/+119
| | |_|_|/ / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a selftest for the usage of FPU code in kernel mode. Currently only implemented for x86. In the future, kernel FPU testing could be unified between the different architectures supporting it. [ bp: - Split out from a conglomerate patch, put comments over statements. - run the test only on debugfs write. - Add bare-minimum run_test_fpu.sh, run 1000 iterations on all CPUs by default. - Add conditionally -msse2 so that clang doesn't generate library calls. - Use cc-option to detect gcc 7.1 not supporting -mpreferred-stack-boundary=3 (amluto). - Document stuff so that we don't forget. - Fix: ld: lib/test_fpu.o: in function `test_fpu_get': >> test_fpu.c:(.text+0x16e): undefined reference to `__sanitizer_cov_trace_cmpd' >> ld: test_fpu.c:(.text+0x1a7): undefined reference to `__sanitizer_cov_trace_cmpd' ld: test_fpu.c:(.text+0x1e0): undefined reference to `__sanitizer_cov_trace_cmpd' ] Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Petteri Aimonen <jpa@git.mail.kapsi.fi> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Link: https://lkml.kernel.org/r/20200624114646.28953-3-bp@alien8.de