summaryrefslogtreecommitdiffstats
path: root/arch/powerpc/kernel/process.c (follow)
Commit message (Collapse)AuthorAgeFilesLines
...
| * | powerpc: Explicitly disable math features when copying threadCyril Bur2016-03-021-0/+1
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently when threads get scheduled off they always giveup the FPU, Altivec (VMX) and Vector (VSX) units if they were using them. When they are scheduled back on a fault is then taken to enable each facility and load registers. As a result explicitly disabling FPU/VMX/VSX has not been necessary. Future changes and optimisations remove this mandatory giveup and fault which could cause calls such as clone() and fork() to copy threads and run them later with FPU/VMX/VSX enabled but no registers loaded. This patch starts the process of having MSR_{FP,VEC,VSX} mean that a threads registers are hot while not having MSR_{FP,VEC,VSX} means that the registers must be loaded. This allows for a smarter return to userspace. Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* / mm: ASLR: use get_random_long()Daniel Cashman2016-02-271-2/+2
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace calls to get_random_int() followed by a cast to (unsigned long) with calls to get_random_long(). Also address shifting bug which, in case of x86 removed entropy mask for mmap_rnd_bits values > 31 bits. Signed-off-by: Daniel Cashman <dcashman@android.com> Acked-by: Kees Cook <keescook@chromium.org> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: David S. Miller <davem@davemloft.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Nick Kralevich <nnk@google.com> Cc: Jeff Vander Stoep <jeffv@google.com> Cc: Mark Salyzyn <salyzyn@android.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge tag 'powerpc-4.4-3' into nextMichael Ellerman2015-12-141-0/+18
|\ | | | | | | | | | | | | | | | | | | Merge the two TM fixes we merged in 4.4. We are about to merge selftests for these, and without the fixes the selftests will oops. powerpc fixes for 4.4 #2 - tm: Block signal return from setting invalid MSR state from Michael Neuling - tm: Check for already reclaimed tasks from Michael Neuling
| * powerpc/tm: Check for already reclaimed tasksMichael Neuling2015-11-231-0/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently we can hit a scenario where we'll tm_reclaim() twice. This results in a TM bad thing exception because the second reclaim occurs when not in suspend mode. The scenario in which this can happen is the following. We attempt to deliver a signal to userspace. To do this we need obtain the stack pointer to write the signal context. To get this stack pointer we must tm_reclaim() in case we need to use the checkpointed stack pointer (see get_tm_stackpointer()). Normally we'd then return directly to userspace to deliver the signal without going through __switch_to(). Unfortunatley, if at this point we get an error (such as a bad userspace stack pointer), we need to exit the process. The exit will result in a __switch_to(). __switch_to() will attempt to save the process state which results in another tm_reclaim(). This tm_reclaim() now causes a TM Bad Thing exception as this state has already been saved and the processor is no longer in TM suspend mode. Whee! This patch checks the state of the MSR to ensure we are TM suspended before we attempt the tm_reclaim(). If we've already saved the state away, we should no longer be in TM suspend mode. This has the additional advantage of checking for a potential TM Bad Thing exception. Found using syscall fuzzer. Fixes: fb09692e71f1 ("powerpc: Add reclaim and recheckpoint functions for context switching transactional memory processes") Cc: stable@vger.kernel.org # v3.9+ Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | powerpc: Print MSR TM bits in oops messagesMichael Neuling2015-12-141-8/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Print MSR TM bits in oops messages. This appends them to the end like this: MSR: 8000000502823031 <SF,VEC,VSX,FP,ME,IR,DR,LE,TM[TE]> You get the TM[] only if at least one TM MSR bit is set. Inside the TM[], E means Enabled (bit 32), S means Suspended (bit 33), and T means Transactional (bit 34) If no bits are set, you get no TM[] output. Include rework of printbits() to handle this case. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | powerpc: Fix DSCR inheritance over fork()Anton Blanchard2015-12-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Two DSCR tests have a hack in them: /* * XXX: Force a context switch out so that DSCR * current value is copied into the thread struct * which is required for the child to inherit the * changed value. */ sleep(1); We should not be working around this in the testcase, it is a kernel bug. Fix it by copying the current DSCR to the child, instead of what we had in the thread struct at last context switch. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | powerpc: Call restore_sprs() before _switch()Anton Blanchard2015-12-101-6/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 152d523e6307 ("powerpc: Create context switch helpers save_sprs() and restore_sprs()") moved the restore of SPRs after the call to _switch(). There is an issue with this approach - new tasks do not return through _switch(), they are set up by copy_thread() to directly return through ret_from_fork() or ret_from_kernel_thread(). This means restore_sprs() is not getting called for new tasks. Fix this by moving restore_sprs() before _switch(). Fixes: 152d523e6307 ("powerpc: Create context switch helpers save_sprs() and restore_sprs()") Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | powerpc: Call check_if_tm_restore_required() in enable_kernel_*()Anton Blanchard2015-12-101-3/+10
| | | | | | | | | | | | | | | | | | | | | | Commit a0e72cf12b1a ("powerpc: Create msr_check_and_{set,clear}()") removed a call to check_if_tm_restore_required() in the enable_kernel_*() functions. Add them back in. Fixes: a0e72cf12b1a ("powerpc: Create msr_check_and_{set,clear}()") Reported-by: Rashmica Gupta <rashmicy@gmail.com> Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | powerpc: clean up asm/switch_to.hAnton Blanchard2015-12-021-1/+1
| | | | | | | | | | | | | | | | Remove a bunch of unnecessary fallback functions and group things in a more logical way. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | powerpc: Rearrange __switch_to()Anton Blanchard2015-12-021-26/+26
| | | | | | | | | | | | | | | | | | Most of __switch_to() is housekeeping, TLB batching, timekeeping etc. Move these away from the more complex and critical context switching code. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | powerpc: create flush_all_to_thread()Anton Blanchard2015-12-021-4/+18
| | | | | | | | | | | | | | | | Create a single function that flushes everything (FP, VMX, VSX, SPE). Doing this all at once means we only do one MSR write. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | powerpc: create giveup_all()Anton Blanchard2015-12-021-15/+61
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Create a single function that gives everything up (FP, VMX, VSX, SPE). Doing this all at once means we only do one MSR write. A context switch microbenchmark using yield(): http://ozlabs.org/~anton/junkcode/context_switch2.c ./context_switch2 --test=yield --fp --altivec --vector 0 0 shows an improvement of 3% on POWER8. Signed-off-by: Anton Blanchard <anton@samba.org> [mpe: giveup_all() needs to be EXPORT_SYMBOL'ed] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | powerpc: Remove fp_enable() and vec_enable(), use msr_check_and_{set, clear}()Anton Blanchard2015-12-011-2/+4
| | | | | | | | | | | | | | More consolidation of our MSR available bit handling. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | powerpc: Add ppc_strict_facility_enable boot optionAnton Blanchard2015-12-011-2/+15
| | | | | | | | | | | | | | | | | | Add a boot option that strictly manages the MSR unavailable bits. This catches kernel uses of FP/Altivec/SPE that would otherwise corrupt user state. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | powerpc: Create msr_check_and_{set,clear}()Anton Blanchard2015-12-011-55/+52
| | | | | | | | | | | | | | | | | | Create helper functions to set and clear MSR bits after first checking if they are already set. Grouping them will make it easy to avoid the MSR writes in a subsequent optimisation. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | powerpc: Move part of giveup_vsx into cAnton Blanchard2015-12-011-9/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move the MSR modification into c. Removing it from the assembly function will allow us to avoid costly MSR writes by batching them up. Check the FP and VMX bits before calling the relevant giveup_*() function. This makes giveup_vsx() and flush_vsx_to_thread() perform more like their sister functions, and allows us to use flush_vsx_to_thread() in the signal code. Move the check_if_tm_restore_required() check in. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | powerpc: Move part of giveup_fpu,altivec,spe into cAnton Blanchard2015-12-011-4/+48
| | | | | | | | | | | | | | | | | | | | | | Move the MSR modification into new c functions. Removing it from the low level functions will allow us to avoid costly MSR writes by batching them up. Move the check_if_tm_restore_required() check into these new functions. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | powerpc: Create mtmsrd_isync()Anton Blanchard2015-12-011-8/+22
| | | | | | | | | | | | | | | | mtmsrd_isync() will do an mtmsrd followed by an isync on older processors. On newer processors we avoid the isync via a feature fixup. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | powerpc: Simplify TM restore checksAnton Blanchard2015-12-011-34/+19
| | | | | | | | | | | | | | | | | | | | | | | | Instead of having multiple giveup_*_maybe_transactional() functions, separate out the TM check into a new function called check_if_tm_restore_required(). This will make it easier to optimise the giveup_*() functions in a subsequent patch. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | powerpc: Remove UP only lazy floating point and vector optimisationsAnton Blanchard2015-12-011-112/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | The UP only lazy floating point and vector optimisations were written back when SMP was not common, and neither glibc nor gcc used vector instructions. Now SMP is very common, glibc aggressively uses vector instructions and gcc autovectorises. We want to add new optimisations that apply to both UP and SMP, but in preparation for that remove these UP only optimisations. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | powerpc: Create context switch helpers save_sprs() and restore_sprs()Anton Blanchard2015-12-011-12/+80
|/ | | | | | | | | | | | | | | | | | | | | | | | Move all our context switch SPR save and restore code into two helpers. We do a few optimisations: - Group all mfsprs and all mtsprs. In many cases an mtspr sets a scoreboarding bit that an mfspr waits on, so the current practise of mfspr A; mtspr A; mfpsr B; mtspr B is the worst scheduling we can do. - SPR writes are slow, so check that the value is changing before writing it. A context switch microbenchmark using yield(): http://ozlabs.org/~anton/junkcode/context_switch2.c ./context_switch2 --test=yield 0 0 shows an improvement of almost 10% on POWER8. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* Merge tag 'powerpc-4.3-1' of ↵Linus Torvalds2015-09-041-7/+7
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: - support "hybrid" iommu/direct DMA ops for coherent_mask < dma_mask from Benjamin Herrenschmidt - EEH fixes for SRIOV from Gavin - introduce rtas_get_sensor_fast() for IRQ handlers from Thomas Huth - use hardware RNG for arch_get_random_seed_* not arch_get_random_* from Paul Mackerras - seccomp filter support from Michael Ellerman - opal_cec_reboot2() handling for HMIs & machine checks from Mahesh Salgaonkar - add powerpc timebase as a trace clock source from Naveen N. Rao - misc cleanups in the xmon, signal & SLB code from Anshuman Khandual - add an inline function to update POWER8 HID0 from Gautham R. Shenoy - fix pte_pagesize_index() crash on 4K w/64K hash from Michael Ellerman - drop support for 64K local store on 4K kernels from Michael Ellerman - move dma_get_required_mask() from pnv_phb to pci_controller_ops from Andrew Donnellan - initialize distance lookup table from drconf path from Nikunj A Dadhania - enable RTC class support from Vaibhav Jain - disable automatically blocked PCI config from Gavin Shan - add LEDs driver for PowerNV platform from Vasant Hegde - fix endianness issues in the HVSI driver from Laurent Dufour - kexec endian fixes from Samuel Mendoza-Jonas - fix corrupted pdn list from Gavin Shan - fix fenced PHB caused by eeh_slot_error_detail() from Gavin Shan - Freescale updates from Scott: Highlights include 32-bit memcpy/memset optimizations, checksum optimizations, 85xx config fragments and updates, device tree updates, e6500 fixes for non-SMP, and misc cleanup and minor fixes. - a ton of cxl updates & fixes: - add explicit precision specifiers from Rasmus Villemoes - use more common format specifier from Rasmus Villemoes - destroy cxl_adapter_idr on module_exit from Johannes Thumshirn - destroy afu->contexts_idr on release of an afu from Johannes Thumshirn - compile with -Werror from Daniel Axtens - EEH support from Daniel Axtens - plug irq_bitmap getting leaked in cxl_context from Vaibhav Jain - add alternate MMIO error handling from Ian Munsie - allow release of contexts which have been OPENED but not STARTED from Andrew Donnellan - remove use of macro DEFINE_PCI_DEVICE_TABLE from Vaishali Thakkar - release irqs if memory allocation fails from Vaibhav Jain - remove racy attempt to force EEH invocation in reset from Daniel Axtens - fix + cleanup error paths in cxl_dev_context_init from Ian Munsie - fix force unmapping mmaps of contexts allocated through the kernel api from Ian Munsie - set up and enable PSL Timebase from Philippe Bergheaud * tag 'powerpc-4.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (140 commits) cxl: Set up and enable PSL Timebase cxl: Fix force unmapping mmaps of contexts allocated through the kernel api cxl: Fix + cleanup error paths in cxl_dev_context_init powerpc/eeh: Fix fenced PHB caused by eeh_slot_error_detail() powerpc/pseries: Cleanup on pci_dn_reconfig_notifier() powerpc/pseries: Fix corrupted pdn list powerpc/powernv: Enable LEDS support powerpc/iommu: Set default DMA offset in dma_dev_setup cxl: Remove racy attempt to force EEH invocation in reset cxl: Release irqs if memory allocation fails cxl: Remove use of macro DEFINE_PCI_DEVICE_TABLE powerpc/powernv: Fix mis-merge of OPAL support for LEDS driver powerpc/powernv: Reset HILE before kexec_sequence() powerpc/kexec: Reset secondary cpu endianness before kexec powerpc/hvsi: Fix endianness issues in the HVSI driver leds/powernv: Add driver for PowerNV platform powerpc/powernv: Create LED platform device powerpc/powernv: Add OPAL interfaces for accessing and modifying system LED states powerpc/powernv: Fix the log message when disabling VF cxl: Allow release of contexts which have been OPENED but not STARTED ...
| * powerpc/tm: Drop tm_orig_msr from thread_structAnshuman Khandual2015-07-161-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | Currently tm_orig_msr is getting used during process context switch only. Then there is ckpt_regs which saves the checkpointed userspace context The MSR slot contained in ckpt_regs structure can be used during process context switch instead of tm_orig_msr, thus allowing us to drop it from thread_struct structure. This patch does that change. Acked-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | powerpc: Uncomment and make enable_kernel_vsx() routine availableLeonidas Da Silva Barbosa2015-07-141-3/+0
|/ | | | | | | | | | | enable_kernel_vsx() function was commented since anything was using it. However, vmx-crypto driver uses VSX instructions which are only available if VSX is enable. Otherwise it rises an exception oops. This patch uncomment enable_kernel_vsx() routine and makes it available. Signed-off-by: Leonidas S. Barbosa <leosilva@linux.vnet.ibm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* powerpc/kernel: Remove the unused extern dscr_defaultAnshuman Khandual2015-06-071-1/+0
| | | | | | | | | | | The process context switch code no longer uses dscr_default variable from the sysfs.c file. The variable became unused when we started storing the CPU specific DSCR value in the PACA structure instead. This patch just removes this extern declaration. It was originally added by the following commit. Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/kernel: Rename copy_thread() 'arg' argument to 'kthread_arg'Alex Dowad2015-03-201-2/+7
| | | | | | | | The 'arg' argument to copy_thread() is only ever used when forking a new kernel thread. Hence, rename it to 'kthread_arg' for clarity. Signed-off-by: Alex Dowad <alexinbeijing@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc: Use generic PIE randomizationVineeth Vijayan2014-11-171-9/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Back in 2009 we merged 501cb16d3cfd "Randomise PIEs", which added support for randomizing PIE (Position Independent Executable) binaries. That commit added randomize_et_dyn(), which correctly randomized the addresses, but failed to honor PF_RANDOMIZE. That means it was not possible to disable PIE randomization via the personality flag, or /proc/sys/kernel/randomize_va_space. Since then there has been generic support for PIE randomization added to binfmt_elf.c, selectable via ARCH_BINFMT_ELF_RANDOMIZE_PIE. Enabling that allows us to drop randomize_et_dyn(), which means we start honoring PF_RANDOMIZE correctly. It also causes a fairly major change to how we layout PIE binaries. Currently we will place the binary at 512MB-520MB for 32 bit binaries, or 512MB-1.5GB for 64 bit binaries, eg: $ cat /proc/$$/maps 4e550000-4e580000 r-xp 00000000 08:02 129813 /bin/dash 4e580000-4e590000 rw-p 00020000 08:02 129813 /bin/dash 10014110000-10014140000 rw-p 00000000 00:00 0 [heap] 3fffaa3f0000-3fffaa5a0000 r-xp 00000000 08:02 921 /lib/powerpc64le-linux-gnu/libc-2.19.so 3fffaa5a0000-3fffaa5b0000 rw-p 001a0000 08:02 921 /lib/powerpc64le-linux-gnu/libc-2.19.so 3fffaa5c0000-3fffaa5d0000 rw-p 00000000 00:00 0 3fffaa5d0000-3fffaa5f0000 r-xp 00000000 00:00 0 [vdso] 3fffaa5f0000-3fffaa620000 r-xp 00000000 08:02 1246 /lib/powerpc64le-linux-gnu/ld-2.19.so 3fffaa620000-3fffaa630000 rw-p 00020000 08:02 1246 /lib/powerpc64le-linux-gnu/ld-2.19.so 3ffffc340000-3ffffc370000 rw-p 00000000 00:00 0 [stack] With this commit applied we don't do any special randomisation for the binary, and instead rely on mmap randomisation. This means the binary ends up at high addresses, eg: $ cat /proc/$$/maps 3fff99820000-3fff999d0000 r-xp 00000000 08:02 921 /lib/powerpc64le-linux-gnu/libc-2.19.so 3fff999d0000-3fff999e0000 rw-p 001a0000 08:02 921 /lib/powerpc64le-linux-gnu/libc-2.19.so 3fff999f0000-3fff99a00000 rw-p 00000000 00:00 0 3fff99a00000-3fff99a20000 r-xp 00000000 00:00 0 [vdso] 3fff99a20000-3fff99a50000 r-xp 00000000 08:02 1246 /lib/powerpc64le-linux-gnu/ld-2.19.so 3fff99a50000-3fff99a60000 rw-p 00020000 08:02 1246 /lib/powerpc64le-linux-gnu/ld-2.19.so 3fff99a60000-3fff99a90000 r-xp 00000000 08:02 129813 /bin/dash 3fff99a90000-3fff99aa0000 rw-p 00020000 08:02 129813 /bin/dash 3fffc3de0000-3fffc3e10000 rw-p 00000000 00:00 0 [stack] 3fffc55e0000-3fffc5610000 rw-p 00000000 00:00 0 [heap] Although this should be OK, it's possible it might break badly written binaries that make assumptions about the address space layout. Signed-off-by: Vineeth Vijayan <vvijayan@mvista.com> [mpe: Rewrite changelog] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/ftrace: Remove mod_return_to_handlerAnton Blanchard2014-11-091-8/+1
| | | | | | | | | mod_return_to_handler is the same as return_to_handler, except it handles the change of the TOC (r2). Add this into return_to_handler and remove mod_return_to_handler. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc: Use probe_kernel_address in show_instructionsAnton Blanchard2014-11-051-6/+2
| | | | | | | | We really don't want to take a pagefault in show_instructions, so use probe_kernel_address instead of __get_user. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc: Replace __get_cpu_var usesChristoph Lameter2014-11-031-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This still has not been merged and now powerpc is the only arch that does not have this change. Sorry about missing linuxppc-dev before. V2->V2 - Fix up to work against 3.18-rc1 __get_cpu_var() is used for multiple purposes in the kernel source. One of them is address calculation via the form &__get_cpu_var(x). This calculates the address for the instance of the percpu variable of the current processor based on an offset. Other use cases are for storing and retrieving data from the current processors percpu area. __get_cpu_var() can be used as an lvalue when writing data or on the right side of an assignment. __get_cpu_var() is defined as : __get_cpu_var() always only does an address determination. However, store and retrieve operations could use a segment prefix (or global register on other platforms) to avoid the address calculation. this_cpu_write() and this_cpu_read() can directly take an offset into a percpu area and use optimized assembly code to read and write per cpu variables. This patch converts __get_cpu_var into either an explicit address calculation using this_cpu_ptr() or into a use of this_cpu operations that use the offset. Thereby address calculations are avoided and less registers are used when code is generated. At the end of the patch set all uses of __get_cpu_var have been removed so the macro is removed too. The patch set includes passes over all arches as well. Once these operations are used throughout then specialized macros can be defined in non -x86 arches as well in order to optimize per cpu access by f.e. using a global register that may be set to the per cpu base. Transformations done to __get_cpu_var() 1. Determine the address of the percpu instance of the current processor. DEFINE_PER_CPU(int, y); int *x = &__get_cpu_var(y); Converts to int *x = this_cpu_ptr(&y); 2. Same as #1 but this time an array structure is involved. DEFINE_PER_CPU(int, y[20]); int *x = __get_cpu_var(y); Converts to int *x = this_cpu_ptr(y); 3. Retrieve the content of the current processors instance of a per cpu variable. DEFINE_PER_CPU(int, y); int x = __get_cpu_var(y) Converts to int x = __this_cpu_read(y); 4. Retrieve the content of a percpu struct DEFINE_PER_CPU(struct mystruct, y); struct mystruct x = __get_cpu_var(y); Converts to memcpy(&x, this_cpu_ptr(&y), sizeof(x)); 5. Assignment to a per cpu variable DEFINE_PER_CPU(int, y) __get_cpu_var(y) = x; Converts to __this_cpu_write(y, x); 6. Increment/Decrement etc of a per cpu variable DEFINE_PER_CPU(int, y); __get_cpu_var(y)++ Converts to __this_cpu_inc(y) Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> CC: Paul Mackerras <paulus@samba.org> Signed-off-by: Christoph Lameter <cl@linux.com> [mpe: Fix build errors caused by set/or_softirq_pending(), and rework assignment in __set_breakpoint() to use memcpy().] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc: Rename __get_SP() to current_stack_pointer()Anton Blanchard2014-10-151-1/+1
| | | | | | | | Michael points out that __get_SP() is a pretty horrible function name. Let's give it a better name. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc: Reimplement __get_SP() as a function not a defineAnton Blanchard2014-10-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Li Zhong points out an issue with our current __get_SP() implementation. If ftrace function tracing is enabled (ie -pg profiling using _mcount) we spill a stack frame on 64bit all the time. If a function calls __get_SP() and later calls a function that is tail call optimised, we will pop the stack frame and the value returned by __get_SP() is no longer valid. An example from Li can be found in save_stack_trace -> save_context_stack: c0000000000432c0 <.save_stack_trace>: c0000000000432c0: mflr r0 c0000000000432c4: std r0,16(r1) c0000000000432c8: stdu r1,-128(r1) <-- stack frame for _mcount c0000000000432cc: std r3,112(r1) c0000000000432d0: bl <._mcount> c0000000000432d4: nop c0000000000432d8: mr r4,r1 <-- __get_SP() c0000000000432dc: ld r5,632(r13) c0000000000432e0: ld r3,112(r1) c0000000000432e4: li r6,1 c0000000000432e8: addi r1,r1,128 <-- pop stack frame c0000000000432ec: ld r0,16(r1) c0000000000432f0: mtlr r0 c0000000000432f4: b <.save_context_stack> <-- tail call optimized save_context_stack ends up with a stack pointer below the current one, and it is likely to be scribbled over. Fix this by making __get_SP() a function which returns the callers stack frame. Also replace inline assembly which grabs the stack pointer in save_stack_trace and show_stack with __get_SP(). This also fixes an issue with perf_arch_fetch_caller_regs(). It currently unwinds the stack once, which will skip a valid stack frame on a leaf function. With the __get_SP() fixes in this patch, we never need to unwind the stack frame to get to the first interesting frame. We have to export __get_SP() because perf_arch_fetch_caller_regs() (which is used in modules) calls it from a header file. Reported-by: Li Zhong <zhong@linux.vnet.ibm.com> Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc: Move more symbol exports next to function definitionsAnton Blanchard2014-09-251-0/+2
| | | | | Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc: Reduce scariness of interrupt frames in stack tracesPaul Mackerras2014-08-051-1/+1
| | | | | | | | | | | | Some people see things like "Exception: 501" in stack traces in dmesg and assume that means that something has gone badly wrong, when in fact "Exception: 501" just means a device interrupt was taken. This changes "Exception" to "interrupt" to make it clearer that we are just recording the fact of a change in control flow rather than some error condition. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc: Pull out ksp_vsid logic into a helperMichael Ellerman2014-07-281-14/+18
| | | | | | | | The previous patch left a bit of a wart in copy_process(). Clean it up a bit by moving the logic out into a helper. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc: Remove MMU_FTR_SLBMichael Ellerman2014-07-281-1/+1
| | | | | | | | We now only support cpus that use an SLB, so we don't need an MMU feature to indicate that. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc: Correct DSCR during TM context switchSam bobroff2014-06-111-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Correct the DSCR SPR becoming temporarily corrupted if a task is context switched during a transaction. The problem occurs while suspending the task and is caused by saving the DSCR to thread.dscr after it has already been set to the CPU's default value: __switch_to() calls __switch_to_tm() which calls tm_reclaim_task() which calls tm_reclaim_thread() which calls tm_reclaim() where the DSCR is set to the CPU's default __switch_to() calls _switch() where thread.dscr is set to the DSCR When the task is resumed, it's transaction will be doomed (as usual) and the DSCR SPR will be corrupted, although the checkpointed value will be correct. Therefore the DSCR will be immediately corrected by the transaction aborting, unless it has been suspended. In that case the incorrect value can be seen by the task until it resumes the transaction. The fix is to treat the DSCR similarly to the TAR and save it early in __switch_to(). A program exposing the problem is added to the kernel self tests as: tools/testing/selftests/powerpc/tm/tm-resched-dscr. Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com> CC: <stable@vger.kernel.org> [v3.10+] Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc: Fix smp_processor_id() in preemptible splat in set_breakpointPaul Gortmaker2014-05-201-2/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, on 8641D, which doesn't set CONFIG_HAVE_HW_BREAKPOINT we get the following splat: BUG: using smp_processor_id() in preemptible [00000000] code: login/1382 caller is set_breakpoint+0x1c/0xa0 CPU: 0 PID: 1382 Comm: login Not tainted 3.15.0-rc3-00041-g2aafe1a4d451 #1 Call Trace: [decd5d80] [c0008dc4] show_stack+0x50/0x158 (unreliable) [decd5dc0] [c03c6fa0] dump_stack+0x7c/0xdc [decd5de0] [c01f8818] check_preemption_disabled+0xf4/0x104 [decd5e00] [c00086b8] set_breakpoint+0x1c/0xa0 [decd5e10] [c00d4530] flush_old_exec+0x2bc/0x588 [decd5e40] [c011c468] load_elf_binary+0x2ac/0x1164 [decd5ec0] [c00d35f8] search_binary_handler+0xc4/0x1f8 [decd5ef0] [c00d4ee8] do_execve+0x3d8/0x4b8 [decd5f40] [c001185c] ret_from_syscall+0x0/0x38 --- Exception: c01 at 0xfeee554 LR = 0xfeee7d4 The call path in this case is: flush_thread --> set_debug_reg_defaults --> set_breakpoint --> __get_cpu_var Since preemption is enabled in the cleanup of flush thread, and there is no need to disable it, introduce the distinction between set_breakpoint and __set_breakpoint, leaving only the flush_thread instance as the current user of set_breakpoint. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc: Drop return value from set_breakpoint as it is unusedPaul Gortmaker2014-05-201-4/+4
| | | | | | | | None of the callers check the return value, so it might as well not have one at all. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc: Fix kernel thread creation on ABIv2Anton Blanchard2014-04-231-12/+5
| | | | | | | | | | Change how we setup registers for ret_from_kernel_thread. In ABIv1, instead of passing a function descriptor in, dereference it and pass the target in directly. Use ppc_global_function_entry to get it right on both ABIv1 and ABIv2. Signed-off-by: Anton Blanchard <anton@samba.org>
* powerpc/tm: Disable IRQ in tm_recheckpointMichael Neuling2014-04-071-6/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We can't take an IRQ when we're about to do a trechkpt as our GPR state is set to user GPR values. We've hit this when running some IBM Java stress tests in the lab resulting in the following dump: cpu 0x3f: Vector: 700 (Program Check) at [c000000007eb3d40] pc: c000000000050074: restore_gprs+0xc0/0x148 lr: 00000000b52a8184 sp: ac57d360 msr: 8000000100201030 current = 0xc00000002c500000 paca = 0xc000000007dbfc00 softe: 0 irq_happened: 0x00 pid = 34535, comm = Pooled Thread # R00 = 00000000b52a8184 R16 = 00000000b3e48fda R01 = 00000000ac57d360 R17 = 00000000ade79bd8 R02 = 00000000ac586930 R18 = 000000000fac9bcc R03 = 00000000ade60000 R19 = 00000000ac57f930 R04 = 00000000f6624918 R20 = 00000000ade79be8 R05 = 00000000f663f238 R21 = 00000000ac218a54 R06 = 0000000000000002 R22 = 000000000f956280 R07 = 0000000000000008 R23 = 000000000000007e R08 = 000000000000000a R24 = 000000000000000c R09 = 00000000b6e69160 R25 = 00000000b424cf00 R10 = 0000000000000181 R26 = 00000000f66256d4 R11 = 000000000f365ec0 R27 = 00000000b6fdcdd0 R12 = 00000000f66400f0 R28 = 0000000000000001 R13 = 00000000ada71900 R29 = 00000000ade5a300 R14 = 00000000ac2185a8 R30 = 00000000f663f238 R15 = 0000000000000004 R31 = 00000000f6624918 pc = c000000000050074 restore_gprs+0xc0/0x148 cfar= c00000000004fe28 dont_restore_vec+0x1c/0x1a4 lr = 00000000b52a8184 msr = 8000000100201030 cr = 24804888 ctr = 0000000000000000 xer = 0000000000000000 trap = 700 This moves tm_recheckpoint to a C function and moves the tm_restore_sprs into that function. It then adds IRQ disabling over the trechkpt critical section. It also sets the TEXASR FS in the signals code to ensure this is never set now that we explictly write the TM sprs in tm_recheckpoint. Signed-off-by: Michael Neuling <mikey@neuling.org> cc: stable@vger.kernel.org Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/tm: Fix crash when forking inside a transactionMichael Neuling2014-03-071-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we fork/clone we currently don't copy any of the TM state to the new thread. This results in a TM bad thing (program check) when the new process is switched in as the kernel does a tmrechkpt with TEXASR FS not set. Also, since R1 is from userspace, we trigger the bad kernel stack pointer detection. So we end up with something like this: Bad kernel stack pointer 0 at c0000000000404fc cpu 0x2: Vector: 700 (Program Check) at [c00000003ffefd40] pc: c0000000000404fc: restore_gprs+0xc0/0x148 lr: 0000000000000000 sp: 0 msr: 9000000100201030 current = 0xc000001dd1417c30 paca = 0xc00000000fe00800 softe: 0 irq_happened: 0x01 pid = 0, comm = swapper/2 WARNING: exception is not recoverable, can't continue The below fixes this by flushing the TM state before we copy the task_struct to the clone. To do this we go through the tmreclaim patch, which removes the checkpointed registers from the CPU and transitions the CPU out of TM suspend mode. Hence we need to call tmrechkpt after to restore the checkpointed state and the TM mode for the current task. To make this fail from userspace is simply: tbegin li r0, 2 sc <boom> Kudos to Adhemerval Zanella Neto for finding this. Signed-off-by: Michael Neuling <mikey@neuling.org> cc: Adhemerval Zanella Neto <azanella@br.ibm.com> cc: stable@vger.kernel.org Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc: Fix hw breakpoints on !HAVE_HW_BREAKPOINT configurationsAndreas Schwab2014-01-291-1/+1
| | | | | | | | This fixes a logic error that caused a failure to update the hw breakpoint registers when not using the hw-breakpoint interface. Signed-off-by: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* Merge branch 'next' of ↵Linus Torvalds2014-01-281-15/+162
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull powerpc updates from Ben Herrenschmidt: "So here's my next branch for powerpc. A bit late as I was on vacation last week. It's mostly the same stuff that was in next already, I just added two patches today which are the wiring up of lockref for powerpc, which for some reason fell through the cracks last time and is trivial. The highlights are, in addition to a bunch of bug fixes: - Reworked Machine Check handling on kernels running without a hypervisor (or acting as a hypervisor). Provides hooks to handle some errors in real mode such as TLB errors, handle SLB errors, etc... - Support for retrieving memory error information from the service processor on IBM servers running without a hypervisor and routing them to the memory poison infrastructure. - _PAGE_NUMA support on server processors - 32-bit BookE relocatable kernel support - FSL e6500 hardware tablewalk support - A bunch of new/revived board support - FSL e6500 deeper idle states and altivec powerdown support You'll notice a generic mm change here, it has been acked by the relevant authorities and is a pre-req for our _PAGE_NUMA support" * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (121 commits) powerpc: Implement arch_spin_is_locked() using arch_spin_value_unlocked() powerpc: Add support for the optimised lockref implementation powerpc/powernv: Call OPAL sync before kexec'ing powerpc/eeh: Escalate error on non-existing PE powerpc/eeh: Handle multiple EEH errors powerpc: Fix transactional FP/VMX/VSX unavailable handlers powerpc: Don't corrupt transactional state when using FP/VMX in kernel powerpc: Reclaim two unused thread_info flag bits powerpc: Fix races with irq_work Move precessing of MCE queued event out from syscall exit path. pseries/cpuidle: Remove redundant call to ppc64_runlatch_off() in cpu idle routines powerpc: Make add_system_ram_resources() __init powerpc: add SATA_MV to ppc64_defconfig powerpc/powernv: Increase candidate fw image size powerpc: Add debug checks to catch invalid cpu-to-node mappings powerpc: Fix the setup of CPU-to-Node mappings during CPU online powerpc/iommu: Don't detach device without IOMMU group powerpc/eeh: Hotplug improvement powerpc/eeh: Call opal_pci_reinit() on powernv for restoring config space powerpc/eeh: Add restore_config operation ...
| * Merge remote-tracking branch 'scott/next' into nextBenjamin Herrenschmidt2014-01-151-2/+28
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | Freescale updates from Scott: << Highlights include 32-bit booke relocatable support, e6500 hardware tablewalk support, various e500 SPE fixes, some new/revived boards, and e6500 deeper idle and altivec powerdown modes. >>
| | * powerpc: fix exception clearing in e500 SPE float emulationJoseph Myers2014-01-081-2/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The e500 SPE floating-point emulation code clears existing exceptions (__FPU_FPSCR &= ~FP_EX_MASK;) before ORing in the exceptions from the emulated operation. However, these exception bits are the "sticky", cumulative exception bits, and should only be cleared by the user program setting SPEFSCR, not implicitly by any floating-point instruction (whether executed purely by the hardware or emulated). The spurious clearing of these bits shows up as missing exceptions in glibc testing. Fixing this, however, is not as simple as just not clearing the bits, because while the bits may be from previous floating-point operations (in which case they should not be cleared), the processor can also set the sticky bits itself before the interrupt for an exception occurs, and this can happen in cases when IEEE 754 semantics are that the sticky bit should not be set. Specifically, the "invalid" sticky bit is set in various cases with non-finite operands, where IEEE 754 semantics do not involve raising such an exception, and the "underflow" sticky bit is set in cases of exact underflow, whereas IEEE 754 semantics are that this flag is set only for inexact underflow. Thus, for correct emulation the kernel needs to know the setting of these two sticky bits before the instruction being emulated. When a floating-point operation raises an exception, the kernel can note the state of the sticky bits immediately afterwards. Some <fenv.h> functions that affect the state of these bits, such as fesetenv and feholdexcept, need to use prctl with PR_GET_FPEXC and PR_SET_FPEXC anyway, and so it is natural to record the state of those bits during that call into the kernel and so avoid any need for a separate call into the kernel to inform it of a change to those bits. Thus, the interface I chose to use (in this patch and the glibc port) is that one of those prctl calls must be made after any userspace change to those sticky bits, other than through a floating-point operation that traps into the kernel anyway. feclearexcept and fesetexceptflag duly make those calls, which would not be required were it not for this issue. The previous EGLIBC port, and the uClibc code copied from it, is fundamentally broken as regards any use of prctl for floating-point exceptions because it didn't use the PR_FP_EXC_SW_ENABLE bit in its prctl calls (and did various worse things, such as passing a pointer when prctl expected an integer). If you avoid anything where prctl is used, the clearing of sticky bits still means it will never give anything approximating correct exception semantics with existing kernels. I don't believe the patch makes things any worse for existing code that doesn't try to inform the kernel of changes to sticky bits - such code may get incorrect exceptions in some cases, but it would have done so anyway in other cases. Signed-off-by: Joseph Myers <joseph@codesourcery.com> Signed-off-by: Scott Wood <scottwood@freescale.com>
| * | powerpc: Don't corrupt transactional state when using FP/VMX in kernelPaul Mackerras2014-01-151-12/+134
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, when we have a process using the transactional memory facilities on POWER8 (that is, the processor is in transactional or suspended state), and the process enters the kernel and the kernel then uses the floating-point or vector (VMX/Altivec) facility, we end up corrupting the user-visible FP/VMX/VSX state. This happens, for example, if a page fault causes a copy-on-write operation, because the copy_page function will use VMX to do the copy on POWER8. The test program below demonstrates the bug. The bug happens because when FP/VMX state for a transactional process is stored in the thread_struct, we store the checkpointed state in .fp_state/.vr_state and the transactional (current) state in .transact_fp/.transact_vr. However, when the kernel wants to use FP/VMX, it calls enable_kernel_fp() or enable_kernel_altivec(), which saves the current state in .fp_state/.vr_state. Furthermore, when we return to the user process we return with FP/VMX/VSX disabled. The next time the process uses FP/VMX/VSX, we don't know which set of state (the current register values, .fp_state/.vr_state, or .transact_fp/.transact_vr) we should be using, since we have no way to tell if we are still in the same transaction, and if not, whether the previous transaction succeeded or failed. Thus it is necessary to strictly adhere to the rule that if FP has been enabled at any point in a transaction, we must keep FP enabled for the user process with the current transactional state in the FP registers, until we detect that it is no longer in a transaction. Similarly for VMX; once enabled it must stay enabled until the process is no longer transactional. In order to keep this rule, we add a new thread_info flag which we test when returning from the kernel to userspace, called TIF_RESTORE_TM. This flag indicates that there is FP/VMX/VSX state to be restored before entering userspace, and when it is set the .tm_orig_msr field in the thread_struct indicates what state needs to be restored. The restoration is done by restore_tm_state(). The TIF_RESTORE_TM bit is set by new giveup_fpu/altivec_maybe_transactional helpers, which are called from enable_kernel_fp/altivec, giveup_vsx, and flush_fp/altivec_to_thread instead of giveup_fpu/altivec. The other thing to be done is to get the transactional FP/VMX/VSX state from .fp_state/.vr_state when doing reclaim, if that state has been saved there by giveup_fpu/altivec_maybe_transactional. Having done this, we set the FP/VMX bit in the thread's MSR after reclaim to indicate that that part of the state is now valid (having been reclaimed from the processor's checkpointed state). Finally, in the signal handling code, we move the clearing of the transactional state bits in the thread's MSR a bit earlier, before calling flush_fp_to_thread(), so that we don't unnecessarily set the TIF_RESTORE_TM bit. This is the test program: /* Michael Neuling 4/12/2013 * * See if the altivec state is leaked out of an aborted transaction due to * kernel vmx copy loops. * * gcc -m64 htm_vmxcopy.c -o htm_vmxcopy * */ /* We don't use all of these, but for reference: */ int main(int argc, char *argv[]) { long double vecin = 1.3; long double vecout; unsigned long pgsize = getpagesize(); int i; int fd; int size = pgsize*16; char tmpfile[] = "/tmp/page_faultXXXXXX"; char buf[pgsize]; char *a; uint64_t aborted = 0; fd = mkstemp(tmpfile); assert(fd >= 0); memset(buf, 0, pgsize); for (i = 0; i < size; i += pgsize) assert(write(fd, buf, pgsize) == pgsize); unlink(tmpfile); a = mmap(NULL, size, PROT_READ|PROT_WRITE, MAP_PRIVATE, fd, 0); assert(a != MAP_FAILED); asm __volatile__( "lxvd2x 40,0,%[vecinptr] ; " // set 40 to initial value TBEGIN "beq 3f ;" TSUSPEND "xxlxor 40,40,40 ; " // set 40 to 0 "std 5, 0(%[map]) ;" // cause kernel vmx copy page TABORT TRESUME TEND "li %[res], 0 ;" "b 5f ;" "3: ;" // Abort handler "li %[res], 1 ;" "5: ;" "stxvd2x 40,0,%[vecoutptr] ; " : [res]"=r"(aborted) : [vecinptr]"r"(&vecin), [vecoutptr]"r"(&vecout), [map]"r"(a) : "memory", "r0", "r3", "r4", "r5", "r6", "r7"); if (aborted && (vecin != vecout)){ printf("FAILED: vector state leaked on abort %f != %f\n", (double)vecin, (double)vecout); exit(1); } munmap(a, size); close(fd); printf("PASSED!\n"); return 0; } Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * | powerpc: Delete non-required instances of include <linux/init.h>Paul Gortmaker2014-01-151-1/+0
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | None of these files are actually using any __init type directives and hence don't need to include <linux/init.h>. Most are just a left over from __devinit and __cpuinit removal, or simply due to code getting copied from one driver to the next. The one instance where we add an include for init.h covers off a case where that file was implicitly getting it from another header which itself didn't need it. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | Merge tag 'signed-for-3.13' of git://github.com/agraf/linux-2.6 into kvm-masterPaolo Bonzini2013-12-201-16/+16
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Patch queue for 3.13 - 2013-12-18 This fixes some grave issues we've only found after 3.13-rc1: - Make the modularized HV/PR book3s kvm work well as modules - Fix some race conditions - Fix compilation with certain compilers (booke) - Fix THP for book3s_hv - Fix preemption for book3s_pr Alexander Graf (4): KVM: PPC: Book3S: PR: Don't clobber our exit handler id KVM: PPC: Book3S: PR: Export kvmppc_copy_to|from_svcpu KVM: PPC: Book3S: PR: Make svcpu -> vcpu store preempt savvy KVM: PPC: Book3S: PR: Enable interrupts earlier Aneesh Kumar K.V (1): powerpc: book3s: kvm: Don't abuse host r2 in exit path Paul Mackerras (5): KVM: PPC: Book3S HV: Fix physical address calculations KVM: PPC: Book3S HV: Refine barriers in guest entry/exit KVM: PPC: Book3S HV: Make tbacct_lock irq-safe KVM: PPC: Book3S HV: Take SRCU read lock around kvm_read_guest() call KVM: PPC: Book3S HV: Don't drop low-order page address bits Scott Wood (1): powerpc/kvm/booke: Fix build break due to stack frame size warning pingfan liu (1): powerpc: kvm: fix rare but potential deadlock scene
| * powerpc/kvm/booke: Fix build break due to stack frame size warningScott Wood2013-12-111-16/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit ce11e48b7fdd256ec68b932a89b397a790566031 ("KVM: PPC: E500: Add userspace debug stub support") added "struct thread_struct" to the stack of kvmppc_vcpu_run(). thread_struct is 1152 bytes on my build, compared to 48 bytes for the recently-introduced "struct debug_reg". Use the latter instead. This fixes the following error: cc1: warnings being treated as errors arch/powerpc/kvm/booke.c: In function 'kvmppc_vcpu_run': arch/powerpc/kvm/booke.c:760:1: error: the frame size of 1424 bytes is larger than 1024 bytes make[2]: *** [arch/powerpc/kvm/booke.o] Error 1 make[1]: *** [arch/powerpc/kvm] Error 2 make[1]: *** Waiting for unfinished jobs.... Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>