summaryrefslogtreecommitdiffstats
path: root/arch/x86/mm (follow)
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'x86-iopl-for-linus' of ↵Linus Torvalds2019-11-261-0/+8
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 iopl updates from Ingo Molnar: "This implements a nice simplification of the iopl and ioperm code that Thomas Gleixner discovered: we can implement the IO privilege features of the iopl system call by using the IO permission bitmap in permissive mode, while trapping CLI/STI/POPF/PUSHF uses in user-space if they change the interrupt flag. This implements that feature, with testing facilities and related cleanups" [ "Simplification" may be an over-statement. The main goal is to avoid the cli/sti of iopl by effectively implementing the IO port access parts of iopl in terms of ioperm. This may end up not workign well in case people actually depend on cli/sti being available, or if there are mixed uses of iopl and ioperm. We will see.. - Linus ] * 'x86-iopl-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (22 commits) x86/ioperm: Fix use of deprecated config option x86/entry/32: Clarify register saving in __switch_to_asm() selftests/x86/iopl: Extend test to cover IOPL emulation x86/ioperm: Extend IOPL config to control ioperm() as well x86/iopl: Remove legacy IOPL option x86/iopl: Restrict iopl() permission scope x86/iopl: Fixup misleading comment selftests/x86/ioperm: Extend testing so the shared bitmap is exercised x86/ioperm: Share I/O bitmap if identical x86/ioperm: Remove bitmap if all permissions dropped x86/ioperm: Move TSS bitmap update to exit to user work x86/ioperm: Add bitmap sequence number x86/ioperm: Move iobitmap data into a struct x86/tss: Move I/O bitmap data into a seperate struct x86/io: Speedup schedule out of I/O bitmap user x86/ioperm: Avoid bitmap allocation if no permissions are set x86/ioperm: Simplify first ioperm() invocation logic x86/iopl: Cleanup include maze x86/tss: Fix and move VMX BUILD_BUG_ON() x86/cpu: Unify cpu_init() ...
| * x86/tss: Fix and move VMX BUILD_BUG_ON()Thomas Gleixner2019-11-161-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The BUILD_BUG_ON(IO_BITMAP_OFFSET - 1 == 0x67) in the VMX code is bogus in two aspects: 1) This wants to be in generic x86 code simply to catch issues even when VMX is disabled in Kconfig. 2) The IO_BITMAP_OFFSET is not the right thing to check because it makes asssumptions about the layout of tss_struct. Nothing requires that the I/O bitmap is placed right after x86_tss, which is the hardware mandated tss structure. It pointlessly makes restrictions on the struct tss_struct layout. The proper thing to check is: - Offset of x86_tss in tss_struct is 0 - Size of x86_tss == 0x68 Move it to the other build time TSS checks and make it do the right thing. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Andy Lutomirski <luto@kernel.org>
* | Merge branch 'x86-asm-for-linus' of ↵Linus Torvalds2019-11-264-16/+18
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 asm updates from Ingo Molnar: "The main changes in this cycle were: - Cross-arch changes to move the linker sections for NOTES and EXCEPTION_TABLE into the RO_DATA area, where they belong on most architectures. (Kees Cook) - Switch the x86 linker fill byte from x90 (NOP) to 0xcc (INT3), to trap jumps into the middle of those padding areas instead of sliding execution. (Kees Cook) - A thorough cleanup of symbol definitions within x86 assembler code. The rather randomly named macros got streamlined around a (hopefully) straightforward naming scheme: SYM_START(name, linkage, align...) SYM_END(name, sym_type) SYM_FUNC_START(name) SYM_FUNC_END(name) SYM_CODE_START(name) SYM_CODE_END(name) SYM_DATA_START(name) SYM_DATA_END(name) etc - with about three times of these basic primitives with some label, local symbol or attribute variant, expressed via postfixes. No change in functionality intended. (Jiri Slaby) - Misc other changes, cleanups and smaller fixes" * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (67 commits) x86/entry/64: Remove pointless jump in paranoid_exit x86/entry/32: Remove unused resume_userspace label x86/build/vdso: Remove meaningless CFLAGS_REMOVE_*.o m68k: Convert missed RODATA to RO_DATA x86/vmlinux: Use INT3 instead of NOP for linker fill bytes x86/mm: Report actual image regions in /proc/iomem x86/mm: Report which part of kernel image is freed x86/mm: Remove redundant address-of operators on addresses xtensa: Move EXCEPTION_TABLE to RO_DATA segment powerpc: Move EXCEPTION_TABLE to RO_DATA segment parisc: Move EXCEPTION_TABLE to RO_DATA segment microblaze: Move EXCEPTION_TABLE to RO_DATA segment ia64: Move EXCEPTION_TABLE to RO_DATA segment h8300: Move EXCEPTION_TABLE to RO_DATA segment c6x: Move EXCEPTION_TABLE to RO_DATA segment arm64: Move EXCEPTION_TABLE to RO_DATA segment alpha: Move EXCEPTION_TABLE to RO_DATA segment x86/vmlinux: Move EXCEPTION_TABLE to RO_DATA segment x86/vmlinux: Actually use _etext for the end of the text segment vmlinux.lds.h: Allow EXCEPTION_TABLE to live in RO_DATA ...
| * \ Merge branch 'x86/build' into x86/asm, to pick up completed topic branchIngo Molnar2019-11-253-12/+14
| |\ \ | | | | | | | | | | | | Signed-off-by: Ingo Molnar <mingo@kernel.org>
| | * | x86/mm: Report which part of kernel image is freedKees Cook2019-11-042-6/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The memory freeing report wasn't very useful for figuring out which parts of the kernel image were being freed. Add the details for clearer reporting in dmesg. Before: Freeing unused kernel image memory: 1348K Write protecting the kernel read-only data: 20480k Freeing unused kernel image memory: 2040K Freeing unused kernel image memory: 172K After: Freeing unused kernel image (initmem) memory: 1348K Write protecting the kernel read-only data: 20480k Freeing unused kernel image (text/rodata gap) memory: 2040K Freeing unused kernel image (rodata/data gap) memory: 172K Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: linux-alpha@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-c6x-dev@linux-c6x.org Cc: linux-ia64@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-s390@vger.kernel.org Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rick Edgecombe <rick.p.edgecombe@intel.com> Cc: Segher Boessenkool <segher@kernel.crashing.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will@kernel.org> Cc: x86-ml <x86@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: https://lkml.kernel.org/r/20191029211351.13243-28-keescook@chromium.org
| | * | x86/mm: Remove redundant address-of operators on addressesKees Cook2019-11-041-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The &s on addresses are redundant. Remove them to match all the other similar functions. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: linux-alpha@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-c6x-dev@linux-c6x.org Cc: linux-ia64@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-s390@vger.kernel.org Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rick Edgecombe <rick.p.edgecombe@intel.com> Cc: Segher Boessenkool <segher@kernel.crashing.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will@kernel.org> Cc: x86-ml <x86@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: https://lkml.kernel.org/r/20191029211351.13243-27-keescook@chromium.org
| | * | x86/vmlinux: Actually use _etext for the end of the text segmentKees Cook2019-11-042-4/+4
| | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Various calculations are using the end of the exception table (which does not need to be executable) as the end of the text segment. Instead, in preparation for moving the exception table into RO_DATA, move _etext after the exception table and update the calculations. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: linux-alpha@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-c6x-dev@linux-c6x.org Cc: linux-ia64@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-s390@vger.kernel.org Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rick Edgecombe <rick.p.edgecombe@intel.com> Cc: Ross Zwisler <zwisler@chromium.org> Cc: Segher Boessenkool <segher@kernel.crashing.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Lendacky <Thomas.Lendacky@amd.com> Cc: Will Deacon <will@kernel.org> Cc: x86-ml <x86@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: https://lkml.kernel.org/r/20191029211351.13243-16-keescook@chromium.org
| * / x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_*Jiri Slaby2019-10-181-4/+4
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These are all functions which are invoked from elsewhere, so annotate them as global using the new SYM_FUNC_START and their ENDPROC's by SYM_FUNC_END. Make sure ENTRY/ENDPROC is not defined on X86_64, given these were the last users. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> [hibernate] Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> [xen bits] Acked-by: Herbert Xu <herbert@gondor.apana.org.au> [crypto] Cc: Allison Randal <allison@lohutok.net> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Andy Shevchenko <andy@infradead.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Armijn Hemel <armijn@tjaldur.nl> Cc: Cao jin <caoj.fnst@cn.fujitsu.com> Cc: Darren Hart <dvhart@infradead.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Enrico Weigelt <info@metux.net> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jim Mattson <jmattson@google.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kate Stewart <kstewart@linuxfoundation.org> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: kvm ML <kvm@vger.kernel.org> Cc: Len Brown <len.brown@intel.com> Cc: linux-arch@vger.kernel.org Cc: linux-crypto@vger.kernel.org Cc: linux-efi <linux-efi@vger.kernel.org> Cc: linux-efi@vger.kernel.org Cc: linux-pm@vger.kernel.org Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Peter Zijlstra <peterz@infradead.org> Cc: platform-driver-x86@vger.kernel.org Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Sean Christopherson <sean.j.christopherson@intel.com> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: "Steven Rostedt (VMware)" <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Wanpeng Li <wanpengli@tencent.com> Cc: Wei Huang <wei@redhat.com> Cc: x86-ml <x86@kernel.org> Cc: xen-devel@lists.xenproject.org Cc: Xiaoyao Li <xiaoyao.li@linux.intel.com> Link: https://lkml.kernel.org/r/20191011115108.12392-25-jslaby@suse.cz
* | Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds2019-11-261-1/+3
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "These are the fixes left over from the v5.4 cycle: - Various low level 32-bit entry code fixes and improvements by Andy Lutomirski, Peter Zijlstra and Thomas Gleixner. - Fix 32-bit Xen PV breakage, by Jan Beulich" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/entry/32: Fix FIXUP_ESPFIX_STACK with user CR3 x86/pti/32: Calculate the various PTI cpu_entry_area sizes correctly, make the CPU_ENTRY_AREA_PAGES assert precise selftests/x86/sigreturn/32: Invalidate DS and ES when abusing the kernel selftests/x86/mov_ss_trap: Fix the SYSENTER test x86/entry/32: Fix NMI vs ESPFIX x86/entry/32: Unwind the ESPFIX stack earlier on exception entry x86/entry/32: Move FIXUP_FRAME after pushing %fs in SAVE_ALL x86/entry/32: Use %ss segment where required x86/entry/32: Fix IRET exception x86/cpu_entry_area: Add guard page for entry stack on 32bit x86/pti/32: Size initial_page_table correctly x86/doublefault/32: Fix stack canaries in the double fault handler x86/xen/32: Simplify ring check in xen_iret_crit_fixup() x86/xen/32: Make xen_iret_crit_fixup() independent of frame layout x86/stackframe/32: Repair 32-bit Xen PV
| * | x86/pti/32: Calculate the various PTI cpu_entry_area sizes correctly, make ↵Ingo Molnar2019-11-251-1/+3
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | the CPU_ENTRY_AREA_PAGES assert precise When two recent commits that increased the size of the 'struct cpu_entry_area' were merged in -tip, the 32-bit defconfig build started failing on the following build time assert: ./include/linux/compiler.h:391:38: error: call to ‘__compiletime_assert_189’ declared with attribute error: BUILD_BUG_ON failed: CPU_ENTRY_AREA_PAGES * PAGE_SIZE < CPU_ENTRY_AREA_MAP_SIZE arch/x86/mm/cpu_entry_area.c:189:2: note: in expansion of macro ‘BUILD_BUG_ON’ In function ‘setup_cpu_entry_area_ptes’, Which corresponds to the following build time assert: BUILD_BUG_ON(CPU_ENTRY_AREA_PAGES * PAGE_SIZE < CPU_ENTRY_AREA_MAP_SIZE); The purpose of this assert is to sanity check the fixed-value definition of CPU_ENTRY_AREA_PAGES arch/x86/include/asm/pgtable_32_types.h: #define CPU_ENTRY_AREA_PAGES (NR_CPUS * 41) The '41' is supposed to match sizeof(struct cpu_entry_area)/PAGE_SIZE, which value we didn't want to define in such a low level header, because it would cause dependency hell. Every time the size of cpu_entry_area is changed, we have to adjust CPU_ENTRY_AREA_PAGES accordingly - and this assert is checking that constraint. But the assert is both imprecise and buggy, primarily because it doesn't include the single readonly IDT page that is mapped at CPU_ENTRY_AREA_BASE (which begins at a PMD boundary). This bug was hidden by the fact that by accident CPU_ENTRY_AREA_PAGES is defined too large upstream (v5.4-rc8): #define CPU_ENTRY_AREA_PAGES (NR_CPUS * 40) While 'struct cpu_entry_area' is 155648 bytes, or 38 pages. So we had two extra pages, which hid the bug. The following commit (not yet upstream) increased the size to 40 pages: x86/iopl: ("Restrict iopl() permission scope") ... but increased CPU_ENTRY_AREA_PAGES only 41 - i.e. shortening the gap to just 1 extra page. Then another not-yet-upstream commit changed the size again: 880a98c33996: ("x86/cpu_entry_area: Add guard page for entry stack on 32bit") Which increased the cpu_entry_area size from 38 to 39 pages, but didn't change CPU_ENTRY_AREA_PAGES (kept it at 40). This worked fine, because we still had a page left from the accidental 'reserve'. But when these two commits were merged into the same tree, the combined size of cpu_entry_area grew from 38 to 40 pages, while CPU_ENTRY_AREA_PAGES finally caught up to 40 as well. Which is fine in terms of functionality, but the assert broke: BUILD_BUG_ON(CPU_ENTRY_AREA_PAGES * PAGE_SIZE < CPU_ENTRY_AREA_MAP_SIZE); because CPU_ENTRY_AREA_MAP_SIZE is the total size of the area, which is 1 page larger due to the IDT page. To fix all this, change the assert to two precise asserts: BUILD_BUG_ON((CPU_ENTRY_AREA_PAGES+1)*PAGE_SIZE != CPU_ENTRY_AREA_MAP_SIZE); BUILD_BUG_ON(CPU_ENTRY_AREA_TOTAL_SIZE != CPU_ENTRY_AREA_MAP_SIZE); This takes the IDT page into account, and also connects the size-based define of CPU_ENTRY_AREA_TOTAL_SIZE with the address-subtraction based define of CPU_ENTRY_AREA_MAP_SIZE. Also clean up some of the names which made it rather confusing: - 'CPU_ENTRY_AREA_TOT_SIZE' wasn't actually the 'total' size of the cpu-entry-area, but the per-cpu array size, so rename this to CPU_ENTRY_AREA_ARRAY_SIZE. - Introduce CPU_ENTRY_AREA_TOTAL_SIZE that _is_ the total mapping size, with the IDT included. - Add comments where '+1' denotes the IDT mapping - it wasn't obvious and took me about 3 hours to decode... Finally, because this particular commit is actually applied after this patch: 880a98c33996: ("x86/cpu_entry_area: Add guard page for entry stack on 32bit") Fix the CPU_ENTRY_AREA_PAGES value from 40 pages to the correct 39 pages. All future commits that change cpu_entry_area will have to adjust this value precisely. As a side note, we should probably attempt to remove CPU_ENTRY_AREA_PAGES and derive its value directly from the structure, without causing header hell - but that is an adventure for another day! :-) Fixes: 880a98c33996: ("x86/cpu_entry_area: Add guard page for entry stack on 32bit") Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: stable@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | Merge branch 'x86-mm-for-linus' of ↵Linus Torvalds2019-11-265-283/+200
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 mm updates from Ingo Molnar: "The main changes in this cycle were: - A PAT series from Davidlohr Bueso, which simplifies the memtype rbtree by using the interval tree helpers. (There's more cleanups in this area queued up, but they didn't make the merge window.) - Also flip over CONFIG_X86_5LEVEL to default-y. This might draw in a few more testers, as all the major distros are going to have 5-level paging enabled by default in their next iterations. - Misc cleanups" * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm/pat: Rename pat_rbtree.c to pat_interval.c x86/mm/pat: Drop the rbt_ prefix from external memtype calls x86/mm/pat: Do not pass 'rb_root' down the memtype tree helper functions x86/mm/pat: Convert the PAT tree to a generic interval tree x86/mm: Clean up the pmd_read_atomic() comments x86/mm: Fix function name typo in pmd_read_atomic() comment x86/cpu: Clean up intel_tlb_table[] x86/mm: Enable 5-level paging support by default
| * | x86/mm/pat: Rename pat_rbtree.c to pat_interval.cDavidlohr Bueso2019-11-212-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Considering the previous changes, this is a more proper name. Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lkml.kernel.org/r/20191121011601.20611-5-dave@stgolabs.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | x86/mm/pat: Drop the rbt_ prefix from external memtype callsDavidlohr Bueso2019-11-213-20/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Drop the rbt_memtype_*() call rbt_ prefix, as we no longer use an rbtree directly. Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lkml.kernel.org/r/20191121011601.20611-4-dave@stgolabs.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | x86/mm/pat: Do not pass 'rb_root' down the memtype tree helper functionsDavidlohr Bueso2019-11-211-13/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Get rid of the passing the rb_root down the helper calls; there is only one: &memtype_rbroot. No change in functionality. [ mingo: Fixed the changelog which described a different version of the patch. ] Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lkml.kernel.org/r/20191121011601.20611-3-dave@stgolabs.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | x86/mm/pat: Convert the PAT tree to a generic interval treeDavidlohr Bueso2019-11-211-120/+42
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With some considerations, the custom pat_rbtree implementation can be simplified to use most of the generic interval_tree machinery: - The tree inorder traversal can slightly differ when there are key ('start') collisions in the tree due to one going left and another right. This, however, only affects the output of debugfs' pat_memtype_list file. - Generic interval trees are now fully closed [a, b], for which we need to adjust the last endpoint (ie: end - 1). - Erasing logic must remain untouched as well. - In order for the types to remain u64, the 'memtype_interval' calls are introduced, as opposed to simply using struct interval_tree. In addition, the PAT tree might potentially also benefit by the fast overlap detection for the insertion case when looking up the first overlapping node in the tree. No change in behavior is intended. Finally, I've tested this on various servers, via sanity warnings, running side by side with the current version and so far see no differences in the returned pointer node when doing memtype_rb_lowest_match() lookups. Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lkml.kernel.org/r/20191121011601.20611-2-dave@stgolabs.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | Merge branch 'x86-entry-for-linus' of ↵Linus Torvalds2019-11-261-2/+2
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 syscall entry updates from Ingo Molnar: "These changes relate to the preparatory cleanup of syscall function type signatures - to fix indirect call mismatches with Control-Flow Integrity (CFI) checking. No change in behavior intended" * 'x86-entry-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm: Use the correct function type for native_set_fixmap() syscalls/x86: Fix function types in COND_SYSCALL syscalls/x86: Use the correct function type for sys_ni_syscall syscalls/x86: Use COMPAT_SYSCALL_DEFINE0 for IA32 (rt_)sigreturn syscalls/x86: Wire up COMPAT_SYSCALL_DEFINE0 syscalls/x86: Use the correct function type in SYSCALL_DEFINE0
| * | x86/mm: Use the correct function type for native_set_fixmap()Sami Tolvanen2019-10-111-2/+2
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We call native_set_fixmap indirectly through the function pointer struct pv_mmu_ops::set_fixmap, which expects the first parameter to be 'unsigned' instead of 'enum fixed_addresses'. This patch changes the function type for native_set_fixmap to match the pointer, which fixes indirect call mismatches with Control-Flow Integrity (CFI) checking. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H . Peter Anvin <hpa@zytor.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20190913211402.193018-1-samitolvanen@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | Merge branch 'x86-boot-for-linus' of ↵Linus Torvalds2019-11-261-0/+11
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 boot updates from Ingo Molnar: "The main changes were: - Extend the boot protocol to allow future extensions without hitting the setup_header size limit. - Add quirk to devicetree systems to disable the RTC unless it's listed as a supported device. - Fix ld.lld linker pedantry" * 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/boot: Introduce setup_indirect x86/boot: Introduce kernel_info.setup_type_max x86/boot: Introduce kernel_info x86/init: Allow DT configured systems to disable RTC at boot time x86/realmode: Explicitly set entry point via ENTRY in linker script
| * | x86/boot: Introduce setup_indirectDaniel Kiper2019-11-121-0/+11
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The setup_data is a bit awkward to use for extremely large data objects, both because the setup_data header has to be adjacent to the data object and because it has a 32-bit length field. However, it is important that intermediate stages of the boot process have a way to identify which chunks of memory are occupied by kernel data. Thus introduce an uniform way to specify such indirect data as setup_indirect struct and SETUP_INDIRECT type. And finally bump setup_header version in arch/x86/boot/header.S. Suggested-by: H. Peter Anvin (Intel) <hpa@zytor.com> Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Ross Philipson <ross.philipson@oracle.com> Reviewed-by: H. Peter Anvin (Intel) <hpa@zytor.com> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: ard.biesheuvel@linaro.org Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: dave.hansen@linux.intel.com Cc: eric.snowberg@oracle.com Cc: Ingo Molnar <mingo@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Juergen Gross <jgross@suse.com> Cc: kanth.ghatraju@oracle.com Cc: linux-doc@vger.kernel.org Cc: linux-efi <linux-efi@vger.kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: rdunlap@infradead.org Cc: ross.philipson@oracle.com Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86-ml <x86@kernel.org> Cc: xen-devel@lists.xenproject.org Link: https://lkml.kernel.org/r/20191112134640.16035-4-daniel.kiper@oracle.com
| |
| \
*-. \ Merge branches 'core-objtool-for-linus', 'x86-cleanups-for-linus' and ↵Linus Torvalds2019-11-261-1/+1
|\ \ \ | | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 objtool, cleanup, and apic updates from Ingo Molnar: "Objtool: - Fix a gawk 5.0 incompatibility in gen-insn-attr-x86.awk. Most distros are still on gawk 4.2.x. Cleanup: - Misc cleanups, plus the removal of obsolete code such as Calgary IOMMU support, which code hasn't seen any real testing in a long time and there's no known users left. apic: - Two changes: a cleanup and a fix for an (old) race for oneshot threaded IRQ handlers" * 'core-objtool-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/insn: Fix awk regexp warnings * 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: Remove unused asm/rio.h x86: Fix typos in comments x86/pci: Remove #ifdef __KERNEL__ guard from <asm/pci.h> x86/pci: Remove pci_64.h x86: Remove the calgary IOMMU driver x86/apic, x86/uprobes: Correct parameter names in kernel-doc comments x86/kdump: Remove the unused crash_copy_backup_region() x86/nmi: Remove stale EDAC include leftover * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/ioapic: Rename misnamed functions x86/ioapic: Prevent inconsistent state when moving an interrupt
| | * x86: Fix typos in commentsCao jin2019-11-181-1/+1
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BIOSen -> BIOSes; paing -> paging. Append to 640 its proper unit "Kb". encomapssing -> encompassing. [ bp: Merge into a single patch, fix one more typo, massage. ] Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Baoquan He <bhe@redhat.com> Cc: Dave Young <dyoung@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Robert Richter <rrichter@marvell.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Lendacky <Thomas.Lendacky@amd.com> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20191118070012.27850-1-caoj.fnst@cn.fujitsu.com
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-nextLinus Torvalds2019-11-262-1/+44
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking updates from David Miller: "Another merge window, another pull full of stuff: 1) Support alternative names for network devices, from Jiri Pirko. 2) Introduce per-netns netdev notifiers, also from Jiri Pirko. 3) Support MSG_PEEK in vsock/virtio, from Matias Ezequiel Vara Larsen. 4) Allow compiling out the TLS TOE code, from Jakub Kicinski. 5) Add several new tracepoints to the kTLS code, also from Jakub. 6) Support set channels ethtool callback in ena driver, from Sameeh Jubran. 7) New SCTP events SCTP_ADDR_ADDED, SCTP_ADDR_REMOVED, SCTP_ADDR_MADE_PRIM, and SCTP_SEND_FAILED_EVENT. From Xin Long. 8) Add XDP support to mvneta driver, from Lorenzo Bianconi. 9) Lots of netfilter hw offload fixes, cleanups and enhancements, from Pablo Neira Ayuso. 10) PTP support for aquantia chips, from Egor Pomozov. 11) Add UDP segmentation offload support to igb, ixgbe, and i40e. From Josh Hunt. 12) Add smart nagle to tipc, from Jon Maloy. 13) Support L2 field rewrite by TC offloads in bnxt_en, from Venkat Duvvuru. 14) Add a flow mask cache to OVS, from Tonghao Zhang. 15) Add XDP support to ice driver, from Maciej Fijalkowski. 16) Add AF_XDP support to ice driver, from Krzysztof Kazimierczak. 17) Support UDP GSO offload in atlantic driver, from Igor Russkikh. 18) Support it in stmmac driver too, from Jose Abreu. 19) Support TIPC encryption and auth, from Tuong Lien. 20) Introduce BPF trampolines, from Alexei Starovoitov. 21) Make page_pool API more numa friendly, from Saeed Mahameed. 22) Introduce route hints to ipv4 and ipv6, from Paolo Abeni. 23) Add UDP segmentation offload to cxgb4, Rahul Lakkireddy" * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1857 commits) libbpf: Fix usage of u32 in userspace code mm: Implement no-MMU variant of vmalloc_user_node_flags slip: Fix use-after-free Read in slip_open net: dsa: sja1105: fix sja1105_parse_rgmii_delays() macvlan: schedule bc_work even if error enetc: add support Credit Based Shaper(CBS) for hardware offload net: phy: add helpers phy_(un)lock_mdio_bus mdio_bus: don't use managed reset-controller ax88179_178a: add ethtool_op_get_ts_info() mlxsw: spectrum_router: Fix use of uninitialized adjacency index mlxsw: spectrum_router: After underlay moves, demote conflicting tunnels bpf: Simplify __bpf_arch_text_poke poke type handling bpf: Introduce BPF_TRACE_x helper for the tracing tests bpf: Add bpf_jit_blinding_enabled for !CONFIG_BPF_JIT bpf, testing: Add various tail call test cases bpf, x86: Emit patchable direct jump as tail call bpf: Constant map key tracking for prog array pokes bpf: Add poke dependency tracking for prog array maps bpf: Add initial poke descriptor table for jit images bpf: Move owner type, jited info into array auxiliary data ...
| * | uaccess: Add strict non-pagefault kernel-space read functionDaniel Borkmann2019-11-022-1/+44
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add two new probe_kernel_read_strict() and strncpy_from_unsafe_strict() helpers which by default alias to the __probe_kernel_read() and the __strncpy_from_unsafe(), respectively, but can be overridden by archs which have non-overlapping address ranges for kernel space and user space in order to bail out with -EFAULT when attempting to probe user memory including non-canonical user access addresses [0]: 4-level page tables: user-space mem: 0x0000000000000000 - 0x00007fffffffffff non-canonical: 0x0000800000000000 - 0xffff7fffffffffff 5-level page tables: user-space mem: 0x0000000000000000 - 0x00ffffffffffffff non-canonical: 0x0100000000000000 - 0xfeffffffffffffff The idea is that these helpers are complementary to the probe_user_read() and strncpy_from_unsafe_user() which probe user-only memory. Both added helpers here do the same, but for kernel-only addresses. Both set of helpers are going to be used for BPF tracing. They also explicitly avoid throwing the splat for non-canonical user addresses from 00c42373d397 ("x86-64: add warning for non-canonical user access address dereferences"). For compat, the current probe_kernel_read() and strncpy_from_unsafe() are left as-is. [0] Documentation/x86/x86_64/mm.txt Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: x86@kernel.org Link: https://lore.kernel.org/bpf/eefeefd769aa5a013531f491a71f0936779e916b.1572649915.git.daniel@iogearbox.net
* | Merge tag 'printk-for-5.5' of ↵Linus Torvalds2019-11-264-12/+11
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk Pull printk updates from Petr Mladek: - Allow to print symbolic error names via new %pe modifier. - Use pr_warn() instead of the remaining pr_warning() calls. Fix formatting of the related lines. - Add VSPRINTF entry to MAINTAINERS. * tag 'printk-for-5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk: (32 commits) checkpatch: don't warn about new vsprintf pointer extension '%pe' MAINTAINERS: Add VSPRINTF tools lib api: Renaming pr_warning to pr_warn ASoC: samsung: Use pr_warn instead of pr_warning lib: cpu_rmap: Use pr_warn instead of pr_warning trace: Use pr_warn instead of pr_warning dma-debug: Use pr_warn instead of pr_warning vgacon: Use pr_warn instead of pr_warning fs: afs: Use pr_warn instead of pr_warning sh/intc: Use pr_warn instead of pr_warning scsi: Use pr_warn instead of pr_warning platform/x86: intel_oaktrail: Use pr_warn instead of pr_warning platform/x86: asus-laptop: Use pr_warn instead of pr_warning platform/x86: eeepc-laptop: Use pr_warn instead of pr_warning oprofile: Use pr_warn instead of pr_warning of: Use pr_warn instead of pr_warning macintosh: Use pr_warn instead of pr_warning idsn: Use pr_warn instead of pr_warning ide: Use pr_warn instead of pr_warning crypto: n2: Use pr_warn instead of pr_warning ...
| * x86: Use pr_warn instead of pr_warningKefeng Wang2019-10-184-12/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As said in commit f2c2cbcc35d4 ("powerpc: Use pr_warn instead of pr_warning"), removing pr_warning so all logging messages use a consistent <prefix>_warn style. Let's do it. Link: http://lkml.kernel.org/r/20191018031850.48498-7-wangkefeng.wang@huawei.com To: linux-kernel@vger.kernel.org Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Robert Richter <rric@kernel.org> Cc: Darren Hart <dvhart@infradead.org> Cc: Andy Shevchenko <andy@infradead.org> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Petr Mladek <pmladek@suse.com>
* | Merge branch 'next-lockdown' of ↵Linus Torvalds2019-09-281-0/+5
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull kernel lockdown mode from James Morris: "This is the latest iteration of the kernel lockdown patchset, from Matthew Garrett, David Howells and others. From the original description: This patchset introduces an optional kernel lockdown feature, intended to strengthen the boundary between UID 0 and the kernel. When enabled, various pieces of kernel functionality are restricted. Applications that rely on low-level access to either hardware or the kernel may cease working as a result - therefore this should not be enabled without appropriate evaluation beforehand. The majority of mainstream distributions have been carrying variants of this patchset for many years now, so there's value in providing a doesn't meet every distribution requirement, but gets us much closer to not requiring external patches. There are two major changes since this was last proposed for mainline: - Separating lockdown from EFI secure boot. Background discussion is covered here: https://lwn.net/Articles/751061/ - Implementation as an LSM, with a default stackable lockdown LSM module. This allows the lockdown feature to be policy-driven, rather than encoding an implicit policy within the mechanism. The new locked_down LSM hook is provided to allow LSMs to make a policy decision around whether kernel functionality that would allow tampering with or examining the runtime state of the kernel should be permitted. The included lockdown LSM provides an implementation with a simple policy intended for general purpose use. This policy provides a coarse level of granularity, controllable via the kernel command line: lockdown={integrity|confidentiality} Enable the kernel lockdown feature. If set to integrity, kernel features that allow userland to modify the running kernel are disabled. If set to confidentiality, kernel features that allow userland to extract confidential information from the kernel are also disabled. This may also be controlled via /sys/kernel/security/lockdown and overriden by kernel configuration. New or existing LSMs may implement finer-grained controls of the lockdown features. Refer to the lockdown_reason documentation in include/linux/security.h for details. The lockdown feature has had signficant design feedback and review across many subsystems. This code has been in linux-next for some weeks, with a few fixes applied along the way. Stephen Rothwell noted that commit 9d1f8be5cf42 ("bpf: Restrict bpf when kernel lockdown is in confidentiality mode") is missing a Signed-off-by from its author. Matthew responded that he is providing this under category (c) of the DCO" * 'next-lockdown' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (31 commits) kexec: Fix file verification on S390 security: constify some arrays in lockdown LSM lockdown: Print current->comm in restriction messages efi: Restrict efivar_ssdt_load when the kernel is locked down tracefs: Restrict tracefs when the kernel is locked down debugfs: Restrict debugfs when the kernel is locked down kexec: Allow kexec_file() with appropriate IMA policy when locked down lockdown: Lock down perf when in confidentiality mode bpf: Restrict bpf when kernel lockdown is in confidentiality mode lockdown: Lock down tracing and perf kprobes when in confidentiality mode lockdown: Lock down /proc/kcore x86/mmiotrace: Lock down the testmmiotrace module lockdown: Lock down module params that specify hardware parameters (eg. ioport) lockdown: Lock down TIOCSSERIAL lockdown: Prohibit PCMCIA CIS storage when the kernel is locked down acpi: Disable ACPI table override if the kernel is locked down acpi: Ignore acpi_rsdp kernel param when the kernel has been locked down ACPI: Limit access to custom_method when the kernel is locked down x86/msr: Restrict MSR access when the kernel is locked down x86: Lock down IO port access when the kernel is locked down ...
| * | x86/mmiotrace: Lock down the testmmiotrace moduleDavid Howells2019-08-201-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The testmmiotrace module shouldn't be permitted when the kernel is locked down as it can be used to arbitrarily read and write MMIO space. This is a runtime check rather than buildtime in order to allow configurations where the same kernel may be run in both locked down or permissive modes depending on local policy. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Howells <dhowells@redhat.com Signed-off-by: Matthew Garrett <mjg59@google.com> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Reviewed-by: Kees Cook <keescook@chromium.org> cc: Thomas Gleixner <tglx@linutronix.de> cc: Steven Rostedt <rostedt@goodmis.org> cc: Ingo Molnar <mingo@kernel.org> cc: "H. Peter Anvin" <hpa@zytor.com> cc: x86@kernel.org Signed-off-by: James Morris <jmorris@namei.org>
* | | mm: treewide: clarify pgtable_page_{ctor,dtor}() namingMark Rutland2019-09-261-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The naming of pgtable_page_{ctor,dtor}() seems to have confused a few people, and until recently arm64 used these erroneously/pointlessly for other levels of page table. To make it incredibly clear that these only apply to the PTE level, and to align with the naming of pgtable_pmd_page_{ctor,dtor}(), let's rename them to pgtable_pte_page_{ctor,dtor}(). These changes were generated with the following shell script: ---- git grep -lw 'pgtable_page_.tor' | while read FILE; do sed -i '{s/pgtable_page_ctor/pgtable_pte_page_ctor/}' $FILE; sed -i '{s/pgtable_page_dtor/pgtable_pte_page_dtor/}' $FILE; done ---- ... with the documentation re-flowed to remain under 80 columns, and whitespace fixed up in macros to keep backslashes aligned. There should be no functional change as a result of this patch. Link: http://lkml.kernel.org/r/20190722141133.3116-1-mark.rutland@arm.com Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k] Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Yu Zhao <yuzhao@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | augmented rbtree: add new RB_DECLARE_CALLBACKS_MAX macroMichel Lespinasse2019-09-261-16/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add RB_DECLARE_CALLBACKS_MAX, which generates augmented rbtree callbacks for the case where the augmented value is a scalar whose definition follows a max(f(node)) pattern. This actually covers all present uses of RB_DECLARE_CALLBACKS, and saves some (source) code duplication in the various RBCOMPUTE function definitions. [walken@google.com: fix mm/vmalloc.c] Link: http://lkml.kernel.org/r/CANN689FXgK13wDYNh1zKxdipeTuALG4eKvKpsdZqKFJ-rvtGiQ@mail.gmail.com [walken@google.com: re-add check to check_augmented()] Link: http://lkml.kernel.org/r/20190727022027.GA86863@google.com Link: http://lkml.kernel.org/r/20190703040156.56953-3-walken@google.com Signed-off-by: Michel Lespinasse <walken@google.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: David Howells <dhowells@redhat.com> Cc: Davidlohr Bueso <dbueso@suse.de> Cc: Uladzislau Rezki <urezki@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | mm: consolidate pgtable_cache_init() and pgd_cache_init()Mike Rapoport2019-09-251-5/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Both pgtable_cache_init() and pgd_cache_init() are used to initialize kmem cache for page table allocations on several architectures that do not use PAGE_SIZE tables for one or more levels of the page table hierarchy. Most architectures do not implement these functions and use __weak default NOP implementation of pgd_cache_init(). Since there is no such default for pgtable_cache_init(), its empty stub is duplicated among most architectures. Rename the definitions of pgd_cache_init() to pgtable_cache_init() and drop empty stubs of pgtable_cache_init(). Link: http://lkml.kernel.org/r/1566457046-22637-1-git-send-email-rppt@linux.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: Will Deacon <will@kernel.org> [arm64] Acked-by: Thomas Gleixner <tglx@linutronix.de> [x86] Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | Merge tag 'powerpc-5.4-1' of ↵Linus Torvalds2019-09-201-2/+0
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: "This is a bit late, partly due to me travelling, and partly due to a power outage knocking out some of my test systems *while* I was travelling. - Initial support for running on a system with an Ultravisor, which is software that runs below the hypervisor and protects guests against some attacks by the hypervisor. - Support for building the kernel to run as a "Secure Virtual Machine", ie. as a guest capable of running on a system with an Ultravisor. - Some changes to our DMA code on bare metal, to allow devices with medium sized DMA masks (> 32 && < 59 bits) to use more than 2GB of DMA space. - Support for firmware assisted crash dumps on bare metal (powernv). - Two series fixing bugs in and refactoring our PCI EEH code. - A large series refactoring our exception entry code to use gas macros, both to make it more readable and also enable some future optimisations. As well as many cleanups and other minor features & fixups. Thanks to: Adam Zerella, Alexey Kardashevskiy, Alistair Popple, Andrew Donnellan, Aneesh Kumar K.V, Anju T Sudhakar, Anshuman Khandual, Balbir Singh, Benjamin Herrenschmidt, Cédric Le Goater, Christophe JAILLET, Christophe Leroy, Christopher M. Riedl, Christoph Hellwig, Claudio Carvalho, Daniel Axtens, David Gibson, David Hildenbrand, Desnes A. Nunes do Rosario, Ganesh Goudar, Gautham R. Shenoy, Greg Kurz, Guerney Hunt, Gustavo Romero, Halil Pasic, Hari Bathini, Joakim Tjernlund, Jonathan Neuschafer, Jordan Niethe, Leonardo Bras, Lianbo Jiang, Madhavan Srinivasan, Mahesh Salgaonkar, Mahesh Salgaonkar, Masahiro Yamada, Maxiwell S. Garcia, Michael Anderson, Nathan Chancellor, Nathan Lynch, Naveen N. Rao, Nicholas Piggin, Oliver O'Halloran, Qian Cai, Ram Pai, Ravi Bangoria, Reza Arbab, Ryan Grimm, Sam Bobroff, Santosh Sivaraj, Segher Boessenkool, Sukadev Bhattiprolu, Thiago Bauermann, Thiago Jung Bauermann, Thomas Gleixner, Tom Lendacky, Vasant Hegde" * tag 'powerpc-5.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (264 commits) powerpc/mm/mce: Keep irqs disabled during lockless page table walk powerpc: Use ftrace_graph_ret_addr() when unwinding powerpc/ftrace: Enable HAVE_FUNCTION_GRAPH_RET_ADDR_PTR ftrace: Look up the address of return_to_handler() using helpers powerpc: dump kernel log before carrying out fadump or kdump docs: powerpc: Add missing documentation reference powerpc/xmon: Fix output of XIVE IPI powerpc/xmon: Improve output of XIVE interrupts powerpc/mm/radix: remove useless kernel messages powerpc/fadump: support holes in kernel boot memory area powerpc/fadump: remove RMA_START and RMA_END macros powerpc/fadump: update documentation about option to release opalcore powerpc/fadump: consider f/w load area powerpc/opalcore: provide an option to invalidate /sys/firmware/opal/core file powerpc/opalcore: export /sys/firmware/opal/core for analysing opal crashes powerpc/fadump: update documentation about CONFIG_PRESERVE_FA_DUMP powerpc/fadump: add support to preserve crash data on FADUMP disabled kernel powerpc/fadump: improve how crashed kernel's memory is reserved powerpc/fadump: consider reserved ranges while releasing memory powerpc/fadump: make crash memory ranges array allocation generic ...
| * | | fs/core/vmcore: Move sev_active() reference to x86 arch codeThiago Jung Bauermann2019-08-091-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Secure Encrypted Virtualization is an x86-specific feature, so it shouldn't appear in generic kernel code because it forces non-x86 architectures to define the sev_active() function, which doesn't make a lot of sense. To solve this problem, add an x86 elfcorehdr_read() function to override the generic weak implementation. To do that, it's necessary to make read_from_oldmem() public so that it can be used outside of vmcore.c. Also, remove the export for sev_active() since it's only used in files that won't be built as modules. Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Lianbo Jiang <lijiang@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190806044919.10622-6-bauerman@linux.ibm.com
| * | | x86, s390/mm: Move sme_active() and sme_me_mask to x86-specific headerThiago Jung Bauermann2019-08-091-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that generic code doesn't reference them, move sme_active() and sme_me_mask to x86's <asm/mem_encrypt.h>. Also remove the export for sme_active() since it's only used in files that won't be built as modules. sme_me_mask on the other hand is used in arch/x86/kvm/svm.c (via __sme_set() and __psp_pa()) which can be built as a module so its export needs to stay. Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190806044919.10622-5-bauerman@linux.ibm.com
* | | | Merge branch 'x86-pti-for-linus' of ↵Linus Torvalds2019-09-171-2/+6
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 pti updates from Thomas Gleixner: "Two small PTI updates: - Handle unaligned addresses gracefully in pti_clone_pagetable(). Not an issue with current callers, but a correctness problem. Adds a warning so any caller which hands in an unaligned address gets pointed out clearly. - Prevent PTI functions from being invoked when PTI is disabled at boottime. While this does not cause any harm today, it's pointless code executed and prone to cause subtle issues if the PTI implementation changes internally over time" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm/pti: Do not invoke PTI functions when PTI is disabled x86/mm/pti: Handle unaligned address gracefully in pti_clone_pagetable()
| * | | | x86/mm/pti: Do not invoke PTI functions when PTI is disabledThomas Gleixner2019-08-291-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When PTI is disabled at boot time either because the CPU is not affected or PTI has been disabled on the command line, the boot code still calls into pti_finalize() which then unconditionally invokes: pti_clone_entry_text() pti_clone_kernel_text() pti_clone_kernel_text() was called unconditionally before the 32bit support was added and 32bit added the call to pti_clone_entry_text(). The call has no side effects as cloning the page tables into the available second one, which was allocated for PTI does not create damage. But it does not make sense either and in case that this functionality would be extended later this might actually lead to hard to diagnose issues. Neither function should be called when PTI is runtime disabled. Make the invocation conditional. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20190828143124.063353972@linutronix.de
| * | | | x86/mm/pti: Handle unaligned address gracefully in pti_clone_pagetable()Song Liu2019-08-291-2/+4
| |/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | pti_clone_pmds() assumes that the supplied address is either: - properly PUD/PMD aligned or - the address is actually mapped which means that independently of the mapping level (PUD/PMD/PTE) the next higher mapping exists. If that's not the case the unaligned address can be incremented by PUD or PMD size incorrectly. All callers supply mapped and/or aligned addresses, but for the sake of robustness it's better to handle that case properly and to emit a warning. [ tglx: Rewrote changelog and added WARN_ON_ONCE() ] Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1908282352470.1938@nanos.tec.linutronix.de
* | | | Merge branch 'x86-mm-for-linus' of ↵Linus Torvalds2019-09-174-114/+4
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 mm updates from Ingo Molnar: - Make cpumask_of_node() more robust against invalid node IDs - Simplify and speed up load_mm_cr4() - Unexport and remove various unused set_memory_*() APIs - Misc cleanups * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm: Fix cpumask_of_node() error condition x86/mm: Remove the unused set_memory_wt() function x86/mm: Remove set_pages_x() and set_pages_nx() x86/mm: Remove the unused set_memory_array_*() functions x86/mm: Unexport set_memory_x() and set_memory_nx() x86/fixmap: Cleanup outdated comments x86/kconfig: Remove X86_DIRECT_GBPAGES dependency on !DEBUG_PAGEALLOC x86/mm: Avoid redundant interrupt disable in load_mm_cr4()
| * | | | x86/mm: Fix cpumask_of_node() error conditionPeter Zijlstra2019-09-051-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When CONFIG_DEBUG_PER_CPU_MAPS=y we validate that the @node argument of cpumask_of_node() is a valid node_id. It however forgets to check for negative numbers. Fix this by explicitly casting to unsigned int. (unsigned)node >= nr_node_ids verifies: 0 <= node < nr_node_ids Also ammend the error message to match the condition. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Yunsheng Lin <linyunsheng@huawei.com> Link: https://lkml.kernel.org/r/20190903075352.GY2369@hirez.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | x86/mm: Remove the unused set_memory_wt() functionChristoph Hellwig2019-09-031-17/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20190826075558.8125-5-hch@lst.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | x86/mm: Remove set_pages_x() and set_pages_nx()Christoph Hellwig2019-09-032-17/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These wrappers don't provide a real benefit over just using set_memory_x() and set_memory_nx(). Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20190826075558.8125-4-hch@lst.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | x86/mm: Remove the unused set_memory_array_*() functionsChristoph Hellwig2019-09-031-75/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20190826075558.8125-3-hch@lst.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | x86/mm: Unexport set_memory_x() and set_memory_nx()Christoph Hellwig2019-09-031-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | No module currently messed with clearing or setting the execute permission of kernel memory, and none really should. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20190826075558.8125-2-hch@lst.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | Merge tag 'v5.3-rc7' into x86/mm, to pick up fixesIngo Molnar2019-09-032-17/+24
| |\ \ \ \ | | | | | | | | | | | | | | | | | | Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | x86/mm: Avoid redundant interrupt disable in load_mm_cr4()Jan Kiszka2019-07-241-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | load_mm_cr4() is always called with interrupts disabled from: - switch_mm_irqs_off() - refresh_pce(), which is a on_each_cpu() callback Thus, disabling interrupts in cr4_set/clear_bits() is redundant. Implement cr4_set/clear_bits_irqsoff() helpers, rename load_mm_cr4() to load_mm_cr4_irqsoff() and use the new helpers. The new helpers do not need a lockdep assert as __cr4_set() has one already. The renaming in combination with the checks in __cr4_set() ensure that any changes in the boundary conditions at the call sites will be detected. [ tglx: Massaged change log ] Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/0fbbcb64-5f26-4ffb-1bb9-4f5f48426893@siemens.com
* | | | | | Merge branch 'efi-core-for-linus' of ↵Linus Torvalds2019-09-171-0/+1
|\ \ \ \ \ \ | |_|/ / / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull EFI updates from Ingo Molnar: - refactor the EFI config table handling across architectures - add support for the Dell EMC OEM config table - include AER diagnostic output to CPER handling of fatal PCIe errors * 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: efi: cper: print AER info of PCIe fatal error efi: Export Runtime Configuration Interface table to sysfs efi: ia64: move SAL systab handling out of generic EFI code efi/x86: move UV_SYSTAB handling into arch/x86 efi: x86: move efi_is_table_address() into arch/x86
| * | | | | efi: x86: move efi_is_table_address() into arch/x86Ard Biesheuvel2019-08-081-0/+1
| |/ / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The function efi_is_table_address() and the associated array of table pointers is specific to x86. Since we will be adding some more x86 specific tables, let's move this code out of the generic code first. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
* | / / / x86/mm/cpa: Prevent large page split when ftrace flips RW on kernel textThomas Gleixner2019-08-291-8/+18
| |/ / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ftrace does not use text_poke() for enabling trace functionality. It uses its own mechanism and flips the whole kernel text to RW and back to RO. The CPA rework removed a loop based check of 4k pages which tried to preserve a large page by checking each 4k page whether the change would actually cover all pages in the large page. This resulted in endless loops for nothing as in testing it turned out that it actually never preserved anything. Of course testing missed to include ftrace, which is the one and only case which benefitted from the 4k loop. As a consequence enabling function tracing or ftrace based kprobes results in a full 4k split of the kernel text, which affects iTLB performance. The kernel RO protection is the only valid case where this can actually preserve large pages. All other static protections (RO data, data NX, PCI, BIOS) are truly static. So a conflict with those protections which results in a split should only ever happen when a change of memory next to a protected region is attempted. But these conflicts are rightfully splitting the large page to preserve the protected regions. In fact a change to the protected regions itself is a bug and is warned about. Add an exception for the static protection check for kernel text RO when the to be changed region spawns a full large page which allows to preserve the large mappings. This also prevents the syslog to be spammed about CPA violations when ftrace is used. The exception needs to be removed once ftrace switched over to text_poke() which avoids the whole issue. Fixes: 585948f4f695 ("x86/mm/cpa: Avoid the 4k pages check completely") Reported-by: Song Liu <songliubraving@fb.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Song Liu <songliubraving@fb.com> Reviewed-by: Song Liu <songliubraving@fb.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1908282355340.1938@nanos.tec.linutronix.de
* | | | x86/mm: Sync also unmappings in vmalloc_sync_all()Joerg Roedel2019-07-221-8/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With huge-page ioremap areas the unmappings also need to be synced between all page-tables. Otherwise it can cause data corruption when a region is unmapped and later re-used. Make the vmalloc_sync_one() function ready to sync unmappings and make sure vmalloc_sync_all() iterates over all page-tables even when an unmapped PMD is found. Fixes: 5d72b4fba40ef ('x86, mm: support huge I/O mapping capability I/F') Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lkml.kernel.org/r/20190719184652.11391-3-joro@8bytes.org
* | | | x86/mm: Check for pfn instead of page in vmalloc_sync_one()Joerg Roedel2019-07-221-1/+1
|/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Do not require a struct page for the mapped memory location because it might not exist. This can happen when an ioremapped region is mapped with 2MB pages. Fixes: 5d72b4fba40ef ('x86, mm: support huge I/O mapping capability I/F') Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lkml.kernel.org/r/20190719184652.11391-2-joro@8bytes.org
* | | Merge tag 'dma-mapping-5.3-1' of git://git.infradead.org/users/hch/dma-mappingLinus Torvalds2019-07-201-0/+30
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull dma-mapping fixes from Christoph Hellwig: "Fix various regressions: - force unencrypted dma-coherent buffers if encryption bit can't fit into the dma coherent mask (Tom Lendacky) - avoid limiting request size if swiotlb is not used (me) - fix swiotlb handling in dma_direct_sync_sg_for_cpu/device (Fugang Duan)" * tag 'dma-mapping-5.3-1' of git://git.infradead.org/users/hch/dma-mapping: dma-direct: correct the physical addr in dma_direct_sync_sg_for_cpu/device dma-direct: only limit the mapping size if swiotlb could be used dma-mapping: add a dma_addressing_limited helper dma-direct: Force unencrypted DMA under SME for certain DMA masks