summaryrefslogtreecommitdiffstats
path: root/arch/s390/kernel/vdso.c (follow)
Commit message (Collapse)AuthorAgeFilesLines
* mm: remove include/linux/bootmem.hMike Rapoport2018-10-311-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move remaining definitions and declarations from include/linux/bootmem.h into include/linux/memblock.h and remove the redundant header. The includes were replaced with the semantic patch below and then semi-automated removal of duplicated '#include <linux/memblock.h> @@ @@ - #include <linux/bootmem.h> + #include <linux/memblock.h> [sfr@canb.auug.org.au: dma-direct: fix up for the removal of linux/bootmem.h] Link: http://lkml.kernel.org/r/20181002185342.133d1680@canb.auug.org.au [sfr@canb.auug.org.au: powerpc: fix up for removal of linux/bootmem.h] Link: http://lkml.kernel.org/r/20181005161406.73ef8727@canb.auug.org.au [sfr@canb.auug.org.au: x86/kaslr, ACPI/NUMA: fix for linux/bootmem.h removal] Link: http://lkml.kernel.org/r/20181008190341.5e396491@canb.auug.org.au Link: http://lkml.kernel.org/r/1536927045-23536-30-git-send-email-rppt@linux.vnet.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Zankel <chris@zankel.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Ingo Molnar <mingo@redhat.com> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Jonas Bonn <jonas@southpole.se> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Ley Foon Tan <lftan@altera.com> Cc: Mark Salter <msalter@redhat.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Palmer Dabbelt <palmer@sifive.com> Cc: Paul Burton <paul.burton@mips.com> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Serge Semin <fancer.lancer@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* s390/vdso: avoid 64-bit vdso mapping for compat tasksVasily Gorbik2018-09-201-3/+5
| | | | | | | | | | | | | | | | | vdso_fault used is_compat_task function (on s390 it tests "current" thread_info flags) to distinguish compat tasks and map 31-bit vdso pages. But "current" task might not correspond to mm context. When 31-bit compat inferior is executed under gdb, gdb does PTRACE_PEEKTEXT on vdso page, causing vdso_fault with "current" being 64-bit gdb process. So, 31-bit inferior ends up with 64-bit vdso mapped. To avoid this problem a new compat_mm flag has been introduced into mm context. This flag is used in vdso_fault and vdso_mremap instead of is_compat_task. Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390: vdso_fault return typeSouptick Joarder2018-07-231-1/+1
| | | | | | | | Use new return type vm_fault_t for fault handler vdso_fault. Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com> Reviewed-by: Matthew Wilcox <willy@infradead.org> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* treewide: kzalloc() -> kcalloc()Kees Cook2018-06-131-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The kzalloc() function has a 2-factor argument form, kcalloc(). This patch replaces cases of: kzalloc(a * b, gfp) with: kcalloc(a * b, gfp) as well as handling cases of: kzalloc(a * b * c, gfp) with: kzalloc(array3_size(a, b, c), gfp) as it's slightly less ugly than: kzalloc_array(array_size(a, b), c, gfp) This does, however, attempt to ignore constant size factors like: kzalloc(4 * 1024, gfp) though any constants defined via macros get caught up in the conversion. Any factors with a sizeof() of "unsigned char", "char", and "u8" were dropped, since they're redundant. The Coccinelle script used for this was: // Fix redundant parens around sizeof(). @@ type TYPE; expression THING, E; @@ ( kzalloc( - (sizeof(TYPE)) * E + sizeof(TYPE) * E , ...) | kzalloc( - (sizeof(THING)) * E + sizeof(THING) * E , ...) ) // Drop single-byte sizes and redundant parens. @@ expression COUNT; typedef u8; typedef __u8; @@ ( kzalloc( - sizeof(u8) * (COUNT) + COUNT , ...) | kzalloc( - sizeof(__u8) * (COUNT) + COUNT , ...) | kzalloc( - sizeof(char) * (COUNT) + COUNT , ...) | kzalloc( - sizeof(unsigned char) * (COUNT) + COUNT , ...) | kzalloc( - sizeof(u8) * COUNT + COUNT , ...) | kzalloc( - sizeof(__u8) * COUNT + COUNT , ...) | kzalloc( - sizeof(char) * COUNT + COUNT , ...) | kzalloc( - sizeof(unsigned char) * COUNT + COUNT , ...) ) // 2-factor product with sizeof(type/expression) and identifier or constant. @@ type TYPE; expression THING; identifier COUNT_ID; constant COUNT_CONST; @@ ( - kzalloc + kcalloc ( - sizeof(TYPE) * (COUNT_ID) + COUNT_ID, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * COUNT_ID + COUNT_ID, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * (COUNT_CONST) + COUNT_CONST, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * COUNT_CONST + COUNT_CONST, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * (COUNT_ID) + COUNT_ID, sizeof(THING) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * COUNT_ID + COUNT_ID, sizeof(THING) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * (COUNT_CONST) + COUNT_CONST, sizeof(THING) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * COUNT_CONST + COUNT_CONST, sizeof(THING) , ...) ) // 2-factor product, only identifiers. @@ identifier SIZE, COUNT; @@ - kzalloc + kcalloc ( - SIZE * COUNT + COUNT, SIZE , ...) // 3-factor product with 1 sizeof(type) or sizeof(expression), with // redundant parens removed. @@ expression THING; identifier STRIDE, COUNT; type TYPE; @@ ( kzalloc( - sizeof(TYPE) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kzalloc( - sizeof(TYPE) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kzalloc( - sizeof(TYPE) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kzalloc( - sizeof(TYPE) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kzalloc( - sizeof(THING) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kzalloc( - sizeof(THING) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kzalloc( - sizeof(THING) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kzalloc( - sizeof(THING) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) ) // 3-factor product with 2 sizeof(variable), with redundant parens removed. @@ expression THING1, THING2; identifier COUNT; type TYPE1, TYPE2; @@ ( kzalloc( - sizeof(TYPE1) * sizeof(TYPE2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) | kzalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) | kzalloc( - sizeof(THING1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) | kzalloc( - sizeof(THING1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) | kzalloc( - sizeof(TYPE1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) | kzalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) ) // 3-factor product, only identifiers, with redundant parens removed. @@ identifier STRIDE, SIZE, COUNT; @@ ( kzalloc( - (COUNT) * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - COUNT * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - COUNT * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - (COUNT) * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - COUNT * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - (COUNT) * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - (COUNT) * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - COUNT * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) ) // Any remaining multi-factor products, first at least 3-factor products, // when they're not all constants... @@ expression E1, E2, E3; constant C1, C2, C3; @@ ( kzalloc(C1 * C2 * C3, ...) | kzalloc( - (E1) * E2 * E3 + array3_size(E1, E2, E3) , ...) | kzalloc( - (E1) * (E2) * E3 + array3_size(E1, E2, E3) , ...) | kzalloc( - (E1) * (E2) * (E3) + array3_size(E1, E2, E3) , ...) | kzalloc( - E1 * E2 * E3 + array3_size(E1, E2, E3) , ...) ) // And then all remaining 2 factors products when they're not all constants, // keeping sizeof() as the second factor argument. @@ expression THING, E1, E2; type TYPE; constant C1, C2, C3; @@ ( kzalloc(sizeof(THING) * C2, ...) | kzalloc(sizeof(TYPE) * C2, ...) | kzalloc(C1 * C2 * C3, ...) | kzalloc(C1 * C2, ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * (E2) + E2, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * E2 + E2, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * (E2) + E2, sizeof(THING) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * E2 + E2, sizeof(THING) , ...) | - kzalloc + kcalloc ( - (E1) * E2 + E1, E2 , ...) | - kzalloc + kcalloc ( - (E1) * (E2) + E1, E2 , ...) | - kzalloc + kcalloc ( - E1 * E2 + E1, E2 , ...) ) Signed-off-by: Kees Cook <keescook@chromium.org>
* s390: kernel: Remove redundant license textGreg Kroah-Hartman2017-11-241-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | | Now that the SPDX tag is in all arch/s390/kernel/ files, that identifies the license in a specific and legally-defined manner. So the extra GPL text wording can be removed as it is no longer needed at all. This is done on a quest to remove the 700+ different ways that files in the kernel describe the GPL license text. And there's unneeded stuff like the address (sometimes incorrect) for the FSF which is never needed. No copyright headers or other non-license-description text was removed. Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390: kernel: add SPDX identifiers to the remaining filesGreg Kroah-Hartman2017-11-241-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | It's good to have SPDX identifiers in all files to make it easier to audit the kernel tree for correct licenses. Update the arch/s390/kernel/ files with the correct SPDX license identifier based on the license text in the file itself. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This work is based on a script and data from Thomas Gleixner, Philippe Ombredanne, and Kate Stewart. Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Kate Stewart <kstewart@linuxfoundation.org> Cc: Philippe Ombredanne <pombredanne@nexb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390: remove all code using the access register modeMartin Schwidefsky2017-11-141-40/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The vdso code for the getcpu() and the clock_gettime() call use the access register mode to access the per-CPU vdso data page with the current code. An alternative to the complicated AR mode is to use the secondary space mode. This makes the vdso faster and quite a bit simpler. The downside is that the uaccess code has to be changed quite a bit. Which instructions are used depends on the machine and what kind of uaccess operation is requested. The instruction dictates which ASCE value needs to be loaded into %cr1 and %cr7. The different cases: * User copy with MVCOS for z10 and newer machines The MVCOS instruction can copy between the primary space (aka user) and the home space (aka kernel) directly. For set_fs(KERNEL_DS) the kernel ASCE is loaded into %cr1. For set_fs(USER_DS) the user space is already loaded in %cr1. * User copy with MVCP/MVCS for older machines To be able to execute the MVCP/MVCS instructions the kernel needs to switch to primary mode. The control register %cr1 has to be set to the kernel ASCE and %cr7 to either the kernel ASCE or the user ASCE dependent on set_fs(KERNEL_DS) vs set_fs(USER_DS). * Data access in the user address space for strnlen / futex To use "normal" instruction with data from the user address space the secondary space mode is used. The kernel needs to switch to primary mode, %cr1 has to contain the kernel ASCE and %cr7 either the user ASCE or the kernel ASCE, dependent on set_fs. To load a new value into %cr1 or %cr7 is an expensive operation, the kernel tries to be lazy about it. E.g. for multiple user copies in a row with MVCP/MVCS the replacement of the vdso ASCE in %cr7 with the user ASCE is done only once. On return to user space a CPU bit is checked that loads the vdso ASCE again. To enable and disable the data access via the secondary space two new functions are added, enable_sacf_uaccess and disable_sacf_uaccess. The fact that a context is in secondary space uaccess mode is stored in the mm_segment_t value for the task. The code of an interrupt may use set_fs as long as it returns to the previous state it got with get_fs with another call to set_fs. The code in finish_arch_post_lock_switch simply has to do a set_fs with the current mm_segment_t value for the task. For CPUs with MVCOS: CPU running in | %cr1 ASCE | %cr7 ASCE | --------------------------------------|-----------|-----------| user space | user | vdso | kernel, USER_DS, normal-mode | user | vdso | kernel, USER_DS, normal-mode, lazy | user | user | kernel, USER_DS, sacf-mode | kernel | user | kernel, KERNEL_DS, normal-mode | kernel | vdso | kernel, KERNEL_DS, normal-mode, lazy | kernel | kernel | kernel, KERNEL_DS, sacf-mode | kernel | kernel | For CPUs without MVCOS: CPU running in | %cr1 ASCE | %cr7 ASCE | --------------------------------------|-----------|-----------| user space | user | vdso | kernel, USER_DS, normal-mode | user | vdso | kernel, USER_DS, normal-mode lazy | kernel | user | kernel, USER_DS, sacf-mode | kernel | user | kernel, KERNEL_DS, normal-mode | kernel | vdso | kernel, KERNEL_DS, normal-mode, lazy | kernel | kernel | kernel, KERNEL_DS, sacf-mode | kernel | kernel | The lines with "lazy" refer to the state after a copy via the secondary space with a delayed reload of %cr1 and %cr7. There are three hardware address spaces that can cause a DAT exception, primary, secondary and home space. The exception can be related to four different fault types: user space fault, vdso fault, kernel fault, and the gmap faults. Dependent on the set_fs state and normal vs. sacf mode there are a number of fault combinations: 1) user address space fault via the primary ASCE 2) gmap address space fault via the primary ASCE 3) kernel address space fault via the primary ASCE for machines with MVCOS and set_fs(KERNEL_DS) 4) vdso address space faults via the secondary ASCE with an invalid address while running in secondary space in problem state 5) user address space fault via the secondary ASCE for user-copy based on the secondary space mode, e.g. futex_ops or strnlen_user 6) kernel address space fault via the secondary ASCE for user-copy with secondary space mode with set_fs(KERNEL_DS) 7) kernel address space fault via the primary ASCE for user-copy with secondary space mode with set_fs(USER_DS) on machines without MVCOS. 8) kernel address space fault via the home space ASCE Replace user_space_fault() with a new function get_fault_type() that can distinguish all four different fault types. With these changes the futex atomic ops from the kernel and the strnlen_user will get a little bit slower, as well as the old style uaccess with MVCP/MVCS. All user accesses based on MVCOS will be as fast as before. On the positive side, the user space vdso code is a lot faster and Linux ceases to use the complicated AR mode. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
* s390/vdso: move boot_vdso_data to vdso.cMartin Schwidefsky2017-10-181-0/+14
| | | | | | | | The boot_vdso_data variable is related to the vdso code, the magic of the initial vdso area for the early boot and the replacement of it in vdso_init should all be put into vdso.c. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/mm: use memset64 instead of clear_tableHeiko Carstens2017-10-091-4/+2
| | | | | | | | Use memset64 instead of the (now) open-coded variant clear_table. Performance wise there is no difference. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/mm: tag normal pages vs pages used in page tablesMartin Schwidefsky2017-07-251-0/+2
| | | | | | | | | | | The ESSA instruction has a new option that allows to tag pages that are not used as a page table. Without the tag the hypervisor has to assume that any guest page could be used in a page table inside the guest. This forces the hypervisor to flush all guest TLB entries whenever a host page table entry is invalidated. With the tag the host can skip the TLB flush if the page is tagged as normal page. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/vdso: use _install_special_mapping to establish vdsoMartin Schwidefsky2017-06-121-27/+64
| | | | | | | | | | Switch to the improved _install_special_mapping function to install the vdso mapping. This has two advantages, the arch_vma_name function is not needed anymore and the vdso vma still has its name after its memory location has been changed with mremap. Tested-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390: kernel: Audit and remove any unnecessary uses of module.hPaul Gortmaker2017-02-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | Historically a lot of these existed because we did not have a distinction between what was modular code and what was providing support to modules via EXPORT_SYMBOL and friends. That changed when we forked out support for the latter into the export.h file. This means we should be able to reduce the usage of module.h in code that is obj-y Makefile or bool Kconfig. The advantage in doing so is that module.h itself sources about 15 other headers; adding significantly to what we feed cpp, and it can obscure what headers we are effectively using. Since module.h was the source for init.h (for __init) and for export.h (for EXPORT_SYMBOL) we consider each change instance for the presence of either and replace as needed. Build testing revealed some implicit header usage that was fixed up accordingly. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* vdso: make arch_setup_additional_pages wait for mmap_sem for write killableMichal Hocko2016-05-241-1/+2
| | | | | | | | | | | | | | | most architectures are relying on mmap_sem for write in their arch_setup_additional_pages. If the waiting task gets killed by the oom killer it would block oom_reaper from asynchronous address space reclaim and reduce the chances of timely OOM resolving. Wait for the lock in the killable mode and return with EINTR if the task got killed while waiting. Signed-off-by: Michal Hocko <mhocko@suse.com> Acked-by: Andy Lutomirski <luto@amacapital.net> [x86 vdso] Acked-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* s390/vdso: optimize getcpu system callMartin Schwidefsky2016-01-111-1/+8
| | | | | | | | | | Add the CPU number to the per-cpu vdso data page and add the __kernel_getcpu function to the vdso object to retrieve the CPU number in user space. Suggested-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390: drop smp_mb in vdso_initMichael S. Tsirkin2016-01-111-2/+0
| | | | | | | | | | | The initial s390 vdso code is heavily influenced by the powerpc version which does have a smp_wmb in vdso_init right before the vdso_ready=1 assignment. s390 has no need for that. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <1452010645-25380-1-git-send-email-mst@redhat.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390: rename struct _lowcore to struct lowcoreHeiko Carstens2016-01-111-3/+3
| | | | | | | | | | | | | Finally get rid of the leading underscore. I tried this already two or three years ago, however Michael Holzheu objected since this would break the crash utility (again). However Michael integrated support for the new name into the crash utility back then, so it doesn't break if the name will be changed now. So finally get rid of the ever confusing leading underscore. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/vdso: use correct memory barrierChristian Borntraeger2015-10-141-1/+1
| | | | | | | | | | | | By definition smp_wmb only orders writes against writes. (Finish all previous writes, and do not start any future write). To protect the vdso init code against early reads on other CPUs, let's use a full smp_mb at the end of vdso init. As right now smp_wmb is implemented as full serialization, this needs no stable backport, but this change will be necessary if we reimplement smp_wmb. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390: remove 31 bit supportHeiko Carstens2015-03-251-14/+2
| | | | | | | | | | | | | | | | | | | Remove the 31 bit support in order to reduce maintenance cost and effectively remove dead code. Since a couple of years there is no distribution left that comes with a 31 bit kernel. The 31 bit kernel also has been broken since more than a year before anybody noticed. In addition I added a removal warning to the kernel shown at ipl for 5 minutes: a960062e5826 ("s390: add 31 bit warning message") which let everybody know about the plan to remove 31 bit code. We didn't get any response. Given that the last 31 bit only machine was introduced in 1999 let's remove the code. Anybody with 31 bit user space code can still use the compat mode. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* arm64,ia64,ppc,s390,sh,tile,um,x86,mm: remove default gate areaAndy Lutomirski2014-08-091-15/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The core mm code will provide a default gate area based on FIXADDR_USER_START and FIXADDR_USER_END if !defined(__HAVE_ARCH_GATE_AREA) && defined(AT_SYSINFO_EHDR). This default is only useful for ia64. arm64, ppc, s390, sh, tile, 64-bit UML, and x86_32 have their own code just to disable it. arm, 32-bit UML, and x86_64 have gate areas, but they have their own implementations. This gets rid of the default and moves the code into ia64. This should save some code on architectures without a gate area: it's now possible to inline the gate_area functions in the default case. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Acked-by: Nathan Lynch <nathan_lynch@mentor.com> Acked-by: H. Peter Anvin <hpa@linux.intel.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [in principle] Acked-by: Richard Weinberger <richard@nod.at> [for um] Acked-by: Will Deacon <will.deacon@arm.com> [for arm64] Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Richard Weinberger <richard@nod.at> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Nathan Lynch <Nathan_Lynch@mentor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* s390/vdso: fix access-list entry initializationHeiko Carstens2013-12-021-1/+1
| | | | | | | | | | | The access-list entry is supposed to have the fetch-only bit set, however a reserved bit got set instead. Userspace isn't able to write to the page anyway since the accessed page has the read-only bit set. So this saves us only for bad surprises in the future if the reserved bit gets used. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/uaccess: always run the kernel in home spaceMartin Schwidefsky2013-10-241-5/+4
| | | | | | | Simplify the uaccess code by removing the user_mode=home option. The kernel will now always run in the home space mode. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/mm: cleanup page table definitionsMartin Schwidefsky2013-08-221-2/+2
| | | | | | | | | | | | | Improve the encoding of the different pte types and the naming of the page, segment table and region table bits. Due to the different pte encoding the hugetlbfs primitives need to be adapted as well. To improve compatability with common code make the huge ptes use the encoding of normal ptes. The conversion between the pte and pmd encoding for a huge pte is done with set_huge_pte_at and huge_ptep_get. Overall the code is now easier to understand. Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390: replace remaining strict_strtoul() with kstrtoul()Heiko Carstens2013-08-221-1/+1
| | | | | | | Replace the last two strict_strtoul() with kstrtoul(). Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/mm: rename addressing_mode to s390_user_modeHeiko Carstens2012-09-261-4/+4
| | | | | | | | | | | | | Renaming the globally visible variable "user_mode" to "addressing_mode" in order to fix a name clash was not a good idea. (Commit 37fe1d73 "s390/mm: rename user_mode variable to addressing_mode") Looking at the code after a couple of weeks one thinks: addressing mode of what? So rename the variable again. This time to s390_user_mode. Which hopefully makes more sense. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* s390/mm: rename user_mode variable to addressing_modeHeiko Carstens2012-07-301-4/+5
| | | | | | | Fix name clash with user_mode() define which is also used in common code. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* Merge tag 'split-asm_system_h-for-linus-20120328' of ↵Linus Torvalds2012-03-291-1/+1
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-asm_system Pull "Disintegrate and delete asm/system.h" from David Howells: "Here are a bunch of patches to disintegrate asm/system.h into a set of separate bits to relieve the problem of circular inclusion dependencies. I've built all the working defconfigs from all the arches that I can and made sure that they don't break. The reason for these patches is that I recently encountered a circular dependency problem that came about when I produced some patches to optimise get_order() by rewriting it to use ilog2(). This uses bitops - and on the SH arch asm/bitops.h drags in asm-generic/get_order.h by a circuituous route involving asm/system.h. The main difficulty seems to be asm/system.h. It holds a number of low level bits with no/few dependencies that are commonly used (eg. memory barriers) and a number of bits with more dependencies that aren't used in many places (eg. switch_to()). These patches break asm/system.h up into the following core pieces: (1) asm/barrier.h Move memory barriers here. This already done for MIPS and Alpha. (2) asm/switch_to.h Move switch_to() and related stuff here. (3) asm/exec.h Move arch_align_stack() here. Other process execution related bits could perhaps go here from asm/processor.h. (4) asm/cmpxchg.h Move xchg() and cmpxchg() here as they're full word atomic ops and frequently used by atomic_xchg() and atomic_cmpxchg(). (5) asm/bug.h Move die() and related bits. (6) asm/auxvec.h Move AT_VECTOR_SIZE_ARCH here. Other arch headers are created as needed on a per-arch basis." Fixed up some conflicts from other header file cleanups and moving code around that has happened in the meantime, so David's testing is somewhat weakened by that. We'll find out anything that got broken and fix it.. * tag 'split-asm_system_h-for-linus-20120328' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-asm_system: (38 commits) Delete all instances of asm/system.h Remove all #inclusions of asm/system.h Add #includes needed to permit the removal of asm/system.h Move all declarations of free_initmem() to linux/mm.h Disintegrate asm/system.h for OpenRISC Split arch_align_stack() out from asm-generic/system.h Split the switch_to() wrapper out of asm-generic/system.h Move the asm-generic/system.h xchg() implementation to asm-generic/cmpxchg.h Create asm-generic/barrier.h Make asm-generic/cmpxchg.h #include asm-generic/cmpxchg-local.h Disintegrate asm/system.h for Xtensa Disintegrate asm/system.h for Unicore32 [based on ver #3, changed by gxt] Disintegrate asm/system.h for Tile Disintegrate asm/system.h for Sparc Disintegrate asm/system.h for SH Disintegrate asm/system.h for Score Disintegrate asm/system.h for S390 Disintegrate asm/system.h for PowerPC Disintegrate asm/system.h for PA-RISC Disintegrate asm/system.h for MN10300 ...
| * Disintegrate asm/system.h for S390David Howells2012-03-281-1/+1
| | | | | | | | | | | | | | Disintegrate asm/system.h for S390. Signed-off-by: David Howells <dhowells@redhat.com> cc: linux-s390@vger.kernel.org
* | coredump: remove VM_ALWAYSDUMP flagJason Baron2012-03-241-8/+2
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The motivation for this patchset was that I was looking at a way for a qemu-kvm process, to exclude the guest memory from its core dump, which can be quite large. There are already a number of filter flags in /proc/<pid>/coredump_filter, however, these allow one to specify 'types' of kernel memory, not specific address ranges (which is needed in this case). Since there are no more vma flags available, the first patch eliminates the need for the 'VM_ALWAYSDUMP' flag. The flag is used internally by the kernel to mark vdso and vsyscall pages. However, it is simple enough to check if a vma covers a vdso or vsyscall page without the need for this flag. The second patch then replaces the 'VM_ALWAYSDUMP' flag with a new 'VM_NODUMP' flag, which can be set by userspace using new madvise flags: 'MADV_DONTDUMP', and unset via 'MADV_DODUMP'. The core dump filters continue to work the same as before unless 'MADV_DONTDUMP' is set on the region. The qemu code which implements this features is at: http://people.redhat.com/~jbaron/qemu-dump/qemu-dump.patch In my testing the qemu core dump shrunk from 383MB -> 13MB with this patch. I also believe that the 'MADV_DONTDUMP' flag might be useful for security sensitive apps, which might want to select which areas are dumped. This patch: The VM_ALWAYSDUMP flag is currently used by the coredump code to indicate that a vma is part of a vsyscall or vdso section. However, we can determine if a vma is in one these sections by checking it against the gate_vma and checking for a non-NULL return value from arch_vma_name(). Thus, freeing a valuable vma bit. Signed-off-by: Jason Baron <jbaron@redhat.com> Acked-by: Roland McGrath <roland@hack.frob.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Avi Kivity <avi@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* [S390] rework smp codeMartin Schwidefsky2012-03-111-21/+7
| | | | | | | | | | | | | | | | | | | | | Define struct pcpu and merge some of the NR_CPUS arrays into it, including __cpu_logical_map, current_set and smp_cpu_state. Split smp related functions to those operating on physical cpus and the functions operating on a logical cpu number. Make the functions for physical cpus use a pointer to a struct pcpu. This hides the knowledge about cpu addresses in smp.c, entry[64].S and swsusp_asm64.S, thus remove the sigp.h header. The PSW restart mechanism is used to start secondary cpus, calling a function on an online cpu, calling a function on the ipl cpu, and for the nmi signal. Replace the different assembler functions with a single function restart_int_handler. The new entry point calls a function whose pointer is stored in the lowcore of the target cpu and it can wait for the source cpu to stop. This covers all existing use cases. Overall the code is now simpler and there are ~380 lines less code. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* mm: arch: rename in_gate_area_no_task to in_gate_area_no_mmStephen Wilson2011-03-231-1/+1
| | | | | | | | | | | | | Now that gate vma's are referenced with respect to a particular mm and not a particular task it only makes sense to propagate the change to this predicate as well. Signed-off-by: Stephen Wilson <wilsons@start.ca> Reviewed-by: Michel Lespinasse <walken@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* mm: arch: make in_gate_area take an mm_struct instead of a task_structStephen Wilson2011-03-231-1/+1
| | | | | | | | | | | | | | Morally, the question of whether an address lies in a gate vma should be asked with respect to an mm, not a particular task. Moreover, dropping the dependency on task_struct will help make existing and future operations on mm's more flexible and convenient. Signed-off-by: Stephen Wilson <wilsons@start.ca> Reviewed-by: Michel Lespinasse <walken@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* mm: arch: make get_gate_vma take an mm_struct instead of a task_structStephen Wilson2011-03-231-1/+1
| | | | | | | | | | | | | Morally, the presence of a gate vma is more an attribute of a particular mm than a particular task. Moreover, dropping the dependency on task_struct will help make both existing and future operations on mm's more flexible and convenient. Signed-off-by: Stephen Wilson <wilsons@start.ca> Reviewed-by: Michel Lespinasse <walken@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* [S390] vdso: dont map at mmap_baseHeiko Carstens2011-01-121-3/+1
| | | | | | | | | | | | | | The vdso object is currently always mapped with mm->mmap_base used as requested address. In case of flexible mmap layout this means it gets mapped above mmap_base and therefore potentially stealing a bit of address space that is reserved for the stack. In case of flexible mmap layout the object should be mapped below mmap base. For legacy mmap layout above. To fix this just don't request any specific address and let the mmap code figure out an address that fits. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* [S390] cleanup facility list handlingMartin Schwidefsky2010-10-251-5/+1
| | | | | | | Store the facility list once at system startup with stfl/stfle and reuse the result for all facility tests. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* [S390] vdso: remove redundant check for CONFIG_64BITChristoph Egger2010-05-171-4/+0
| | | | | | | | | | This is a check for CONFIG_64BIT inside a block that is only active when CONFIG_64BIT is set. So the check is actually useless and potentially irritating. Signed-off-by: Christoph Egger <siccegge@cs.fau.de> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* [S390] Cleanup struct _lowcore usage and defines.Heiko Carstens2010-02-261-0/+1
| | | | | | | | | | | | Use asm offsets to make sure the offset defines to struct _lowcore and its layout don't get out of sync. Also add a BUILD_BUG_ON() which checks that the size of the structure is sane. And while being at it change those sites which use odd casts to access the current lowcore. These should use S390_lowcore instead. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* [S390] Improve address space mode selection.Martin Schwidefsky2009-12-071-4/+5
| | | | | | | | | | | | | | | | | | Introduce user_mode to replace the two variables switch_amode and s390_noexec. There are three valid combinations of the old values: 1) switch_amode == 0 && s390_noexec == 0 2) switch_amode == 1 && s390_noexec == 0 3) switch_amode == 1 && s390_noexec == 1 They get replaced by 1) user_mode == HOME_SPACE_MODE 2) user_mode == PRIMARY_SPACE_MODE 3) user_mode == SECONDARY_SPACE_MODE The new kernel parameter user_mode=[primary,secondary,home] lets you choose the address space mode the user space processes should use. In addition the CONFIG_S390_SWITCH_AMODE config option is removed. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* [S390] perf_counter: fix vdso detectionHeiko Carstens2009-10-061-8/+8
| | | | | | | s390 version of f2053f1a "powerpc/perf_counter: Fix vdso detection". Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* Use macros for .data.page_aligned section.Tim Abbott2009-09-211-1/+1
| | | | | | | | | | | | | | | | This patch changes the remaining direct references to .data.page_aligned in C and assembly code to use the macros in include/linux/linkage.h. Signed-off-by: Tim Abbott <tabbott@ksplice.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Haavard Skinnemoen <hskinnemoen@atmel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
* [S390] implement is_compat_taskHeiko Carstens2009-06-121-2/+2
| | | | | | | Implement is_compat_task and use it all over the place. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* [S390] vdso: kernel parameter syntaxMartin Schwidefsky2009-06-121-2/+13
| | | | | | | | | The syntax of the vdso kernel parameter is documented as vdso=[on|off]. The implementation uses vdso=[0|1], an invalid parameter string disables the vdso support. Fix the mismatch by adding vdso=[on|off] as additional parameter syntax. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* [S390] smp: fix memory leak on __cpu_upHeiko Carstens2009-03-261-2/+0
| | | | | | | | If sigp_set_prefix fails on __cpu_up we leak the lowcore structures and async+panic stacks for the failed cpu. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* [S390] vdso: compile fixHeiko Carstens2009-01-091-1/+2
| | | | | | | | | | | | | !CONFIG_SMP: arch/s390/kernel/vdso.c: In function 'vdso_init': arch/s390/kernel/vdso.c:325: error: incompatible type for argument 2 of 'vdso_alloc_per_cpu' Also move the code out of the BUG_ON statement since it won't be executed on !CONFIG_BUG. And that would be a bug. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* [PATCH] fast vdso implementation for CLOCK_THREAD_CPUTIME_IDMartin Schwidefsky2008-12-311-3/+120
| | | | | | | | | | | The extract cpu time instruction (ectg) instruction allows the user process to get the current thread cputime without calling into the kernel. The code that uses the instruction needs to switch to the access registers mode to get access to the per-cpu info page that contains the two base values that are needed to calculate the current cputime from the CPU timer with the ectg instruction. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* [S390] introduce vdso on s390Martin Schwidefsky2008-12-251-0/+234
Add a vdso to speed up gettimeofday and clock_getres/clock_gettime for CLOCK_REALTIME/CLOCK_MONOTONIC. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>