linux - linux

	Commit message (Collapse)	Author	Age	Files	Lines
*	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc	Linus Torvalds	2018-10-24	11	-107/+123
\|\	Pull sparc updates from David Miller:
\| \|	"Mostly VDSO cleanups and optimizations"
\| \|	* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
\| \|	  sparc: Several small VDSO vclock_gettime.c improvements.
\| \|	  sparc: Validate VDSO for undefined symbols.
\| \|	  sparc: Really use linker with LDFLAGS.
\| \|	  sparc: Improve VDSO CFLAGS.
\| \|	  sparc: Set DISABLE_BRANCH_PROFILING in VDSO CFLAGS.
\| \|	  sparc: Don't bother masking out TICK_PRIV_BIT in VDSO code.
\| \|	  sparc: Inline VDSO gettime code aggressively.
\| \|	  sparc: Improve VDSO instruction patching.
\| \|	  sparc: Fix parport build warnings.
\| *	Merge branch 'sparc-vdso'	David S. Miller	2018-10-23	10	-107/+121
\| \|\	sparc: VDSO improvements
\| \| \|	I started out on these changes with the goal of improving perf annotations when the VDSO is in use.
\| \| \|	Due to lack of inlining, the helper functions are typically hit when profiling instead of __vdso_gettimeofday() or __vdso_vclock_gettime(). The only symbols available by default are the dynamic symbols, which therefore don't cover the helper functions. So the perf output looks terrible, because the symbols cannot be resolved and all show up as "Unknown".
\| \| \|	The sparc VDSO code forces no inlining because of the way the simplistic %tick register read code patching works. So fixing that was the first order of business. Tricks were taken from how x86 implements alternatives. The crucial factor is that if you want to refer to locations (for the original and patch instruction(s)) you have to do so in a way that is resolvable at link time even for a shared object. So you have to do this by storing PC-relative values, and not in executable sections.
\| \| \|	Next, we sanitize the Makefile so that the cflags et al. make more sense. And LDFLAGS are actually applied to invocations of LD instead of CC.
\| \| \|	We also add some sanity checking, specifically in a post-link check that makes sure we don't have any unexpected unresolved symbols in the VDSO. This is essential because the dynamic linker cannot resolve symbols in the VDSO because it cannot write to it.
\| \| \|	Finally some very minor optimizations are performed to the vclock_gettime.c code.
\| \| \|	One thing which is tricky with this code on sparc is that struct timeval and struct timespec are laid out differently on 64-bit. This is because, unlike other architectures, sparc defined suseconds_t as 'int' even on 64-bit. This is why we have all of the "union tstv_t" business and the weird assignments in __vdso_gettimeofday().
\| \| \|	Performance-wise we do gain some cycles here; specifically, here are cycle counts for a user application calling gettimeofday():
\| \| \|	          no-VDSO      VDSO-orig    VDSO-new
\| \| \|	================================================
\| \| \|	64-bit    853 cycles   112 cycles   125 cycles
\| \| \|	32-bit    849 cycles   134 cycles   141 cycles
\| \| \|	These results are with current glibc sources.
\| \| \|	To get better we'd need to implement this in assembler, and I might just do that at some point.
\| \| \|	Signed-off-by: David S. Miller <davem@davemloft.net>
\| \| *	sparc: Several small VDSO vclock_gettime.c improvements.	David S. Miller	2018-10-23	1	-4/+4
\| \| \|	Almost entirely borrowed from the x86 code. Main improvement is to avoid having to initialize ts->tv_nsec to zero before the sequence loops, by expanding timespec_add_ns().
\| \| \|	Signed-off-by: David S. Miller <davem@davemloft.net>
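The normalization that an expanded timespec_add_ns() boils down to can be modelled roughly as follows. This is a Python sketch for illustration only, not the kernel code: `iter_div_rem` and `finish_timespec` are hypothetical names, and the assumption is that the nanosecond total is folded into whole seconds once, after the sequence loop, so nothing needs to be zeroed beforehand.

```python
NSEC_PER_SEC = 1_000_000_000


def iter_div_rem(dividend, divisor):
    """Divide by repeated subtraction; cheap when the quotient is small.

    Hypothetical stand-in for an iterative division helper.
    """
    quotient = 0
    while dividend >= divisor:
        dividend -= divisor
        quotient += 1
    return quotient, dividend


def finish_timespec(sec, ns):
    """Fold an unnormalized nanosecond total into (tv_sec, tv_nsec).

    Both output fields are written exactly once here, so the caller
    never has to pre-initialize tv_nsec to zero.
    """
    carry, rem = iter_div_rem(ns, NSEC_PER_SEC)
    return sec + carry, rem
```

For example, `finish_timespec(10, 2_500_000_000)` carries two whole seconds out of the nanosecond total.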
\| \| *	sparc: Validate VDSO for undefined symbols.	David S. Miller	2018-10-23	3	-1/+16
\| \| \|	There should be no undefined symbols in the resulting VDSO image(s). On sparc, fixed register usage can result in undefined symbols ending up in the image. To combat this, we do two things:
\| \| \|	1) Define current_thread_info() specially when BUILD_DSO.
\| \| \|	2) Ignore "#scratch" register undefined symbols in the output.
\| \| \|	Signed-off-by: David S. Miller <davem@davemloft.net>
\| \| *	sparc: Really use linker with LDFLAGS.	David S. Miller	2018-10-23	1	-9/+7
\| \| \|	Rather than funneling through CC. Also, use --hash-style=both just like other VDSO architectures and glibc do.
\| \| \|	Signed-off-by: David S. Miller <davem@davemloft.net>
\| \| *	sparc: Improve VDSO CFLAGS.	David S. Miller	2018-10-23	1	-7/+8
\| \| \|	Do not set any special register usage options, use the default which is exactly what we should use for userspace code. Make sure we remove the gcc plugin options from the 64-bit build. The 32-bit cflags got it right already.
\| \| \|	Signed-off-by: David S. Miller <davem@davemloft.net>
\| \| *	sparc: Set DISABLE_BRANCH_PROFILING in VDSO CFLAGS.	David S. Miller	2018-10-23	2	-6/+2
\| \| \|	Not in vclock_gettime.c itself.
\| \| \|	Signed-off-by: David S. Miller <davem@davemloft.net>
\| \| *	sparc: Don't bother masking out TICK_PRIV_BIT in VDSO code.	David S. Miller	2018-10-23	1	-8/+1
\| \| \|	If the TICK_PRIV_BIT was set, we would not be able to read the tick register in user space, which is where this code runs.
\| \| \|	Signed-off-by: David S. Miller <davem@davemloft.net>
\| \| *	sparc: Inline VDSO gettime code aggressively.	David S. Miller	2018-10-23	1	-22/+17
\| \| \|	One interesting thing we need to do is stop using __builtin_return_address() in get_vvar_data(). Simply read the %pc register instead.
\| \| \|	Signed-off-by: David S. Miller <davem@davemloft.net>
\| \| *	sparc: Improve VDSO instruction patching.	David S. Miller	2018-10-23	7	-52/+68
\| \|/	The current VDSO patch mechanism has several problems:
\| \|	1) It assumes how gcc will emit a function, with a register window, an initial save instruction and then immediately the %tick read when compiling vread_tick(). There are no such guarantees: code generation could change at any time, gcc could put a nop between the save and the %tick read, etc. So this is extremely fragile and would fail some day.
\| \|	2) It prevents us from properly inlining vread_tick() into the callers and thus getting the best possible code sequences.
\| \|	So fix this to patch properly, with location based annotations.
\| \|	We have to be careful because we cannot do it the way we do patches elsewhere in the kernel. Those use a sequence like:
\| \|	    1:
\| \|	        insn
\| \|	        .section .whatever_patch, "ax"
\| \|	        .word 1b
\| \|	        replacement_insn
\| \|	        .previous
\| \|	This is a dynamic shared object, so that .word cannot be resolved at build time, and thus cannot be used to execute the patches when the kernel initializes the images.
\| \|	Even trying to use label difference equations doesn't work in the above kind of scheme:
\| \|	    1:
\| \|	        insn
\| \|	        .section .whatever_patch, "ax"
\| \|	        .word . - 1b
\| \|	        replacement_insn
\| \|	        .previous
\| \|	The assembler complains that it cannot resolve that computation. The issue is that this is contained in an executable section.
\| \|	Borrow the sequence used by x86 alternatives, which is:
\| \|	    1:
\| \|	        insn
\| \|	        .pushsection .whatever_patch, "a"
\| \|	        .word . - 1b, . - 1f
\| \|	        .popsection
\| \|	        .pushsection .whatever_patch_replacements, "ax"
\| \|	    1:
\| \|	        replacement_insn
\| \|	        .previous
\| \|	This works, allows us to inline vread_tick() as much as we like, and can be used for arbitrary kinds of VDSO patching in the future.
\| \|	Also, reverse the condition for patching. Most systems are %stick based, so if we only patch on %tick systems the patching code will get little or no testing.
\| \|	Signed-off-by: David S. Miller <davem@davemloft.net>
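Why PC-relative table entries survive relocation of a shared object can be shown with a toy model. The Python sketch below is illustrative only: the helper names, the image layout, and the instruction placeholders are all hypothetical, and the sign convention (target address minus the address of the table field) is an assumption chosen for readability rather than taken from the commit.

```python
WORD = 4  # bytes per table field in this toy model


def make_image():
    """Build a toy image as {offset: value}; strings stand in for instructions."""
    orig_off, repl_off, table_off = 0x00, 0x40, 0x80
    image = {
        orig_off: "read %tick",    # instruction at label 1 in the code
        repl_off: "read %stick",   # instruction in the replacements section
    }
    # Each field stores the distance from itself to its target, so the
    # value is fixed at link time and needs no absolute address.
    image[table_off] = orig_off - table_off
    image[table_off + WORD] = repl_off - (table_off + WORD)
    return image, table_off, orig_off


def load_image(image, base):
    """Map the image at an arbitrary load address."""
    return {base + off: value for off, value in image.items()}


def apply_patches(memory, table_addr, n_entries):
    """Copy each replacement over its original, using only relative offsets."""
    for i in range(n_entries):
        orig_field = table_addr + i * 2 * WORD
        repl_field = orig_field + WORD
        orig_insn = orig_field + memory[orig_field]
        repl_insn = repl_field + memory[repl_field]
        memory[orig_insn] = memory[repl_insn]


if __name__ == "__main__":
    image, table_off, orig_off = make_image()
    for base in (0x1000, 0x7F000000):
        memory = load_image(image, base)
        apply_patches(memory, base + table_off, 1)
        print(hex(base), memory[base + orig_off])
```

The same table contents work at both load addresses, which is the property an absolute `.word 1b` entry cannot provide in a shared object.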
\| *	sparc: Fix parport build warnings.	David S. Miller	2018-10-19	1	-0/+2
\| \|	If PARPORT_PC_FIFO is not enabled, do not provide the dma lock macros and lock definition. Otherwise:
\| \|	./arch/sparc/include/asm/parport.h:24:24: warning: ‘dma_spin_lock’ defined but not used [-Wunused-variable]
\| \|	 static DEFINE_SPINLOCK(dma_spin_lock);
\| \|	                        ^~~~~~~~~~~~~
\| \|	./include/linux/spinlock_types.h:81:39: note: in definition of macro ‘DEFINE_SPINLOCK’
\| \|	 #define DEFINE_SPINLOCK(x) spinlock_t x = __SPIN_LOCK_UNLOCKED(x)
\| \|	Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	Merge branch 'parisc-4.20-1' of ↵	Linus Torvalds	2018-10-23	32	-414/+721
\|\ \	git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
\| \| \|	Pull parisc updates from Helge Deller:
\| \| \|	"Lots of small fixes and enhancements, most notably:
\| \| \|	 - Many TLB and cache flush optimizations (Dave)
\| \| \|	 - Fixed HPMC/crash handler on 64-bit kernel (Dave and myself)
\| \| \|	 - Added alternative infrastructure. The kernel now live-patches itself for various situations, e.g. replace SMP code when running on one CPU only or drop cache flushes when system has no cache installed.
\| \| \|	 - vmlinuz now contains a full copy of the compressed vmlinux file. This simplifies debugging the currently booted kernel.
\| \| \|	 - Unused driver removal (Christoph)
\| \| \|	 - Reduced warnings of Dino PCI bridge when running in qemu
\| \| \|	 - Removed gcc version check (Masahiro)"
\| \| \|	* 'parisc-4.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: (23 commits)
\| \| \|	  parisc: Retrieve and display the PDC PAT capabilities
\| \| \|	  parisc: Optimze cache flush algorithms
\| \| \|	  parisc: Remove pte_inserted define
\| \| \|	  parisc: Add PDC PAT cell_info() and pd_get_pdc_revisions() functions
\| \| \|	  parisc: Drop two instructions from pte lookup code
\| \| \|	  parisc: Use zdep for shlw macro on PA1.1 and PA2.0
\| \| \|	  parisc: Add alternative coding infrastructure
\| \| \|	  parisc: Include compressed vmlinux file in vmlinuz boot kernel
\| \| \|	  extract-vmlinux: Check for uncompressed image as fallback
\| \| \|	  parisc: Fix address in HPMC IVA
\| \| \|	  parisc: Fix exported address of os_hpmc handler
\| \| \|	  parisc: Fix map_pages() to not overwrite existing pte entries
\| \| \|	  parisc: Purge TLB entries after updating page table entry and set page accessed flag in TLB handler
\| \| \|	  parisc: Release spinlocks using ordered store
\| \| \|	  parisc: Ratelimit dino stuck interrupt warnings
\| \| \|	  parisc: dino: Utilize DINO_MASK_IRQ() macro
\| \| \|	  parisc: Clean up crash header output
\| \| \|	  parisc: Add SYSTEM_INFO and REGISTER TOC PAT functions
\| \| \|	  parisc: Remove PTE load and fault check from L2_ptep macro
\| \| \|	  parisc: Reorder TLB flush timing calculation
\| \| \|	  ...
\| * \|	parisc: Retrieve and display the PDC PAT capabilities	Helge Deller	2018-10-20	2	-0/+11
\| \| \|	Signed-off-by: Helge Deller <deller@gmx.de>
\| * \|	parisc: Optimze cache flush algorithms	John David Anglin	2018-10-20	2	-20/+229
\| \| \|	The attached patch implements three optimizations:
\| \| \|	1) Loops in flush_user_dcache_range_asm, flush_kernel_dcache_range_asm, purge_kernel_dcache_range_asm, flush_user_icache_range_asm, and flush_kernel_icache_range_asm are unrolled to reduce branch overhead.
\| \| \|	2) The static branch prediction for cmpb instructions in pacache.S has been reviewed and the operand order adjusted where necessary.
\| \| \|	3) For flush routines in cache.c, we purge rather than flush when we have no context. The pdc instruction at level 0 is not required to write back dirty lines to memory. This provides a performance improvement over the fdc instruction if the feature is implemented.
\| \| \|	Version 2 adds alternative patching.
\| \| \|	The patch provides an average improvement of about 2%.
\| \| \|	Signed-off-by: John David Anglin <dave.anglin@bell.net>
\| \| \|	Signed-off-by: Helge Deller <deller@gmx.de>
\| * \|	parisc: Remove pte_inserted define	John David Anglin	2018-10-20	1	-8/+2
\| \| \|	The attached change removes the pte_inserted define from pgtable.h. As a result, we always flush the TLB entry when the associated page table entry is changed. This change doesn't impact performance significantly and it may catch some cases where the TLB needs flushing but wasn't.
\| \| \|	Signed-off-by: John David Anglin <dave.anglin@bell.net>
\| \| \|	Signed-off-by: Helge Deller <deller@gmx.de>
\| * \|	parisc: Add PDC PAT cell_info() and pd_get_pdc_revisions() functions	Helge Deller	2018-10-19	2	-13/+98
\| \| \|	Add wrappers for the PDC_PAT_CELL_GET_INFO and PDC_PAT_PD_GET_PDC_INTERF_REV PAT PDC subfunctions. Both provide access to the PAT capability bitfield which can guide us if simultaneous PTLBs are allowed on the bus, and if firmware will rendezvous all processors within PDCE_Check in case of an HPMC.
\| \| \|	Signed-off-by: Helge Deller <deller@gmx.de>
\| * \|	parisc: Drop two instructions from pte lookup code	Helge Deller	2018-10-19	1	-3/+1
\| \| \|	Remove two instructions from the hot path. The temporary move to %r9 is unnecessary, and the zero-initialization of pte happens twice.
\| \| \|	Signed-off-by: Helge Deller <deller@gmx.de>
\| * \|	parisc: Use zdep for shlw macro on PA1.1 and PA2.0	Helge Deller	2018-10-19	1	-8/+1
\| \| \|	The zdep and depw,z mnemonics generate the same code. The assembler will accept the depw,z mnemonic when generating PA 1.x code. The zdep mnemonic is okay when generating PA 2.0 code. This patch changes depw,z to zdep in the current shlw macro, while the binary code will be the same.
\| \| \|	Signed-off-by: Helge Deller <deller@gmx.de>
\| \| \|	Signed-off-by: John David Anglin <dave.anglin@bell.net>
\| * \|	parisc: Add alternative coding infrastructure	Helge Deller	2018-10-17	14	-62/+233
\| \| \|	This patch adds the necessary code to patch a running kernel at runtime to improve performance.
\| \| \|	The current implementation offers a few optimization variants:
\| \| \|	- When running an SMP kernel on a single UP processor, unwanted assembler statements like locking functions are overwritten with NOPs. When multiple instructions shall be skipped, one branch instruction is used instead of multiple nop instructions.
\| \| \|	- In the UP case, some pdtlb and pitlb instructions are patched to become pdtlb,l and pitlb,l, which only flush the CPU-local tlb entries instead of broadcasting the flush to other CPUs in the system and thus may improve performance.
\| \| \|	- fic and fdc instructions are skipped if no I- or D-caches are installed. This should speed up qemu emulation and cacheless systems.
\| \| \|	- If no cache coherence is needed for IO operations, the relevant fdc and sync instructions in the sba and ccio drivers are replaced by nops.
\| \| \|	- On systems which share I- and D-TLBs and thus don't have a separate instruction TLB, the pitlb instruction is replaced by a nop.
\| \| \|	Live-patching is done early in the boot process, just after having run the system inventory. No drivers are running and thus no external interrupts should arrive. So the hope is that no TLB exceptions will occur during the patching. If this turns out to be wrong we will probably need to do the patching in real-mode.
\| \| \|	Signed-off-by: Helge Deller <deller@gmx.de>
\| * \|	parisc: Include compressed vmlinux file in vmlinuz boot kernel	Helge Deller	2018-10-17	4	-30/+91
\| \| \|	Change the parisc vmlinuz boot code to include and process the real compressed vmlinux.gz ELF file instead of a compressed memory dump. This brings parisc in sync with how it's done on x86_64.
\| \| \|	The benefit of this change is that, e.g. for debugging purposes, one can then extract the vmlinux file out of the booted vmlinuz, which wasn't possible before. This can be achieved with the existing scripts/extract-vmlinux script, which just needs a small tweak to prefer to extract a compressed file before trying the existing given binary.
\| \| \|	The downside of this approach is that due to the extra round of decompression/ELF processing we need more physical memory installed to be able to boot a kernel.
\| \| \|	Signed-off-by: Helge Deller <deller@gmx.de>
\| * \|	extract-vmlinux: Check for uncompressed image as fallback	Helge Deller	2018-10-17	1	-3/+3
\| \| \|	As on x86-64 and other architectures, the boot kernel on parisc (vmlinuz and bzImage) contains a full compressed copy of the final kernel executable (vmlinux.bin.gz), which one should be able to extract with the extract-vmlinux script.
\| \| \|	But on parisc extracting the kernel with extract-vmlinux fails. Currently the script first checks if the given file is an ELF file (which is true on parisc) and if so returns it. Thus on parisc we unexpectedly get back the vmlinuz boot file instead of the uncompressed vmlinux image.
\| \| \|	This patch fixes this issue by reversing the order of the checks. It now first tries to find a compression signature in the given file and, if that fails, checks the file itself as a fallback.
\| \| \|	Signed-off-by: Helge Deller <deller@gmx.de>
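The ordering change described above (compressed payload first, the file itself only as a fallback) can be sketched as follows. This is a Python model, not the actual shell script: it handles only gzip, whereas the real script probes several compression formats, and the function names are hypothetical.

```python
import gzip
import zlib

ELF_MAGIC = b"\x7fELF"
GZIP_MAGIC = b"\x1f\x8b\x08"


def is_elf(data):
    """Return True if the data starts with the ELF magic number."""
    return data.startswith(ELF_MAGIC)


def try_gunzip_at(blob, pos):
    """Try to inflate a gzip stream starting at pos; None if it is not one."""
    try:
        # wbits value 31 (16 + 15) selects the gzip container format;
        # trailing bytes after the stream are tolerated.
        return zlib.decompressobj(31).decompress(blob[pos:])
    except zlib.error:
        return None


def extract_vmlinux(blob):
    """Prefer an embedded compressed ELF; fall back to the file itself."""
    pos = blob.find(GZIP_MAGIC)
    while pos != -1:
        payload = try_gunzip_at(blob, pos)
        if payload is not None and is_elf(payload):
            return payload
        pos = blob.find(GZIP_MAGIC, pos + 1)
    # Fallback: the given file may already be an uncompressed vmlinux.
    return blob if is_elf(blob) else None


if __name__ == "__main__":
    # A boot image that is itself ELF but carries the real kernel compressed.
    vmlinux = ELF_MAGIC + b" real kernel"
    vmlinuz = ELF_MAGIC + b" boot loader " + gzip.compress(vmlinux)
    print(extract_vmlinux(vmlinuz) == vmlinux)
```

With the old order (ELF check first), a boot image that is itself an ELF file would be returned unchanged, which is the failure the commit describes.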
\| * \|	parisc: Fix address in HPMC IVA	John David Anglin	2018-10-17	2	-2/+3
\| \| \|	Helge noticed that the address of the os_hpmc handler was not being correctly calculated in the hpmc macro. As a result, PDCE_CHECK would fail to call os_hpmc:
\| \| \|	<Cpu2> e800009802e00000 0000000000000000 CC_ERR_CHECK_HPMC
\| \| \|	<Cpu2> 37000f7302e00000 8040004000000000 CC_ERR_CPU_CHECK_SUMMARY
\| \| \|	<Cpu2> f600105e02e00000 fffffff0f0c00000 CC_MC_HPMC_MONARCH_SELECTED
\| \| \|	<Cpu2> 140003b202e00000 000000000000000b CC_ERR_HPMC_STATE_ENTRY
\| \| \|	<Cpu2> 5600100b02e00000 00000000000001a0 CC_MC_OS_HPMC_LEN_ERR
\| \| \|	<Cpu2> 5600106402e00000 fffffff0f0438e70 CC_MC_BR_TO_OS_HPMC_FAILED
\| \| \|	<Cpu2> e800009802e00000 0000000000000000 CC_ERR_CHECK_HPMC
\| \| \|	<Cpu2> 37000f7302e00000 8040004000000000 CC_ERR_CPU_CHECK_SUMMARY
\| \| \|	<Cpu2> 4000109f02e00000 0000000000000000 CC_MC_HPMC_INITIATED
\| \| \|	<Cpu2> 4000101902e00000 0000000000000000 CC_MC_MULTIPLE_HPMCS
\| \| \|	<Cpu2> 030010d502e00000 0000000000000000 CC_CPU_STOP
\| \| \|	The address problem can be seen by dumping the fault vector:
\| \| \|	0000000040159000 <fault_vector_20>:
\| \| \|	    40159000: 63 6f 77 73   stb r15,-2447(dp)
\| \| \|	    40159004: 20 63 61 6e   ldil L%b747000,r3
\| \| \|	    40159008: 20 66 6c 79   ldil L%-1c3b3000,r3
\| \| \|	    ...
\| \| \|	    40159020: 08 00 02 40   nop
\| \| \|	    40159024: 20 6e 60 02   ldil L%15d000,r3
\| \| \|	    40159028: 34 63 00 00   ldo 0(r3),r3
\| \| \|	    4015902c: e8 60 c0 02   bv,n r0(r3)
\| \| \|	    40159030: 08 00 02 40   nop
\| \| \|	    40159034: 00 00 00 00   break 0,0
\| \| \|	    40159038: c0 00 70 00   bb,*< r0,sar,40159840 <fault_vector_20+0x840>
\| \| \|	    4015903c: 00 00 00 00   break 0,0
\| \| \|	Location 40159038 should contain the physical address of os_hpmc:
\| \| \|	000000004015d000 <os_hpmc>:
\| \| \|	    4015d000: 08 1a 02 43   copy r26,r3
\| \| \|	    4015d004: 01 c0 08 a4   mfctl iva,r4
\| \| \|	    4015d008: 48 85 00 68   ldw 34(r4),r5
\| \| \|	This patch moves the address setup into initialize_ivt to resolve the above problem. I tested the change by dumping the HPMC entry after setup:
\| \| \|	0000000040209020: 8000240
\| \| \|	0000000040209024: 206a2004
\| \| \|	0000000040209028: 34630ac0
\| \| \|	000000004020902c: e860c002
\| \| \|	0000000040209030: 8000240
\| \| \|	0000000040209034: 1bdddce6
\| \| \|	0000000040209038: 15d000
\| \| \|	000000004020903c: 1a0
\| \| \|	Signed-off-by: John David Anglin <dave.anglin@bell.net>
\| \| \|	Cc: <stable@vger.kernel.org>
\| \| \|	Signed-off-by: Helge Deller <deller@gmx.de>
\| * \|	parisc: Fix exported address of os_hpmc handler	Helge Deller	2018-10-17	1	-2/+1
\| \| \|	In the C-code we need to put the physical address of the hpmc handler in the interrupt vector table (IVA) in order to get HPMCs working. Since on parisc64 function pointers are indirect (in fact they are function descriptors) we instead export the address as variable and not as function.
\| \| \|	This reverts a small part of commit f39cce654f9a ("parisc: Add cfi_startproc and cfi_endproc to assembly code").
\| \| \|	Signed-off-by: Helge Deller <deller@gmx.de>
\| \| \|	Cc: <stable@vger.kernel.org> [4.9+]
\| * \|	parisc: Fix map_pages() to not overwrite existing pte entries	Helge Deller	2018-10-17	1	-6/+2
\| \| \|	Fix a long-existing small nasty bug in the map_pages() implementation which leads to overwriting already written pte entries with zero, if map_pages() is called a second time with an end address which isn't aligned on a pmd boundary. This happens for example if we want to remap only the text segment read/write in order to run alternative patching on the code.
\| \| \|	Exiting the loop when we reach the end address fixes this.
\| \| \|	Cc: stable@vger.kernel.org
\| \| \|	Signed-off-by: Helge Deller <deller@gmx.de>
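The failure mode described in this commit message can be reproduced with a toy page table. The Python sketch below is a simplified model built only from the description above, not the parisc implementation: the table size, the `stop_at_end` switch, and the tuple used as an entry are all hypothetical.

```python
PTRS_PER_PTE = 8  # toy size; stands for one pmd's worth of pte entries


def map_pages(table, start, end, prot, stop_at_end):
    """Fill pte slots [start, end) with prot.

    stop_at_end=False models the old loop, which kept walking to the end
    of the table and wrote zero into every slot past `end`.
    stop_at_end=True models the fix: leave the loop at `end`.
    """
    for idx in range(start, PTRS_PER_PTE):
        if idx >= end:
            if stop_at_end:
                break
            table[idx] = 0  # old behaviour: clobbers an existing entry
            continue
        table[idx] = (idx, prot)
    return table


def remap_text(stop_at_end):
    """Map everything, then remap only the first three pages read/write."""
    table = [0] * PTRS_PER_PTE
    map_pages(table, 0, PTRS_PER_PTE, "r-x", stop_at_end)
    map_pages(table, 0, 3, "rwx", stop_at_end)
    return table


if __name__ == "__main__":
    print("old:", remap_text(stop_at_end=False))
    print("new:", remap_text(stop_at_end=True))
```

In the old variant the second call wipes the entries after the remapped range; with the early exit they keep the mapping from the first call.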
\| * \|	parisc: Purge TLB entries after updating page table entry and set page ↵	John David Anglin	2018-10-17	2	-15/+13
\| \| \|	accessed flag in TLB handler
\| \| \|	This patch may resolve some races in TLB handling. Hopefully, TLB inserts are accesses and protected by spin lock. If not, we may need to use IPI calls and do local purges on PA 2.0.
\| \| \|	Signed-off-by: John David Anglin <dave.anglin@bell.net>
\| \| \|	Signed-off-by: Helge Deller <deller@gmx.de>
\| * \|	parisc: Release spinlocks using ordered store	John David Anglin	2018-10-17	2	-10/+6
\| \| \|	This patch updates the spin unlock code to use an ordered store with release semantics. All prior accesses are guaranteed to be performed before an ordered store is performed.
\| \| \|	Using an ordered store is significantly faster than using the sync memory barrier.
\| \| \|	Signed-off-by: John David Anglin <dave.anglin@bell.net>
\| \| \|	Signed-off-by: Helge Deller <deller@gmx.de>
\| * \|	parisc: Ratelimit dino stuck interrupt warnings	Helge Deller	2018-10-17	1	-2/+1
\| \| \|	While playing with qemu with an emulated RT8139cp NIC, I faced lots of the following warnings:
\| \| \|	    Dino 0x00810000: stuck interrupt 2
\| \| \|	This patch ratelimits this warning and reports back that the IRQ was handled.
\| \| \|	Signed-off-by: Helge Deller <deller@gmx.de>
\| * \|	parisc: dino: Utilize DINO_MASK_IRQ() macro	Helge Deller	2018-10-17	1	-1/+1
\| \| \|	Signed-off-by: Helge Deller <deller@gmx.de>
\| * \|	parisc: Clean up crash header output	Helge Deller	2018-10-17	1	-2/+2
\| \| \|	On kernel crash, this is the current output:
\| \| \|	    Kernel Fault: Code=26 (Data memory access rights trap) regs=(ptrval) (Addr=00000004)
\| \| \|	Drop the address of regs, it's of no use for debugging, and show the faulty address without parentheses.
\| \| \|	Signed-off-by: Helge Deller <deller@gmx.de>
\| * \|	parisc: Add SYSTEM_INFO and REGISTER TOC PAT functions	Helge Deller	2018-10-17	1	-0/+8
\| \| \|	Signed-off-by: Helge Deller <deller@gmx.de>
\| * \|	parisc: Remove PTE load and fault check from L2_ptep macro	John David Anglin	2018-10-17	1	-6/+6
\| \| \|	This change removes the PTE load and present check from the L2_ptep macro. The load and check for kernel pages is now done in the tlb_lock macro. This avoids a double load and check for user pages. The load and check for user pages is now done inside the lock so the fault handler can't be called while the entry is being updated. This version uses an ordered store to release the lock when the page table entry isn't present. It also corrects the check in the non SMP case.
\| \| \|	Signed-off-by: John David Anglin <dave.anglin@bell.net>
\| \| \|	Signed-off-by: Helge Deller <deller@gmx.de>
\| * \|	parisc: Reorder TLB flush timing calculation	John David Anglin	2018-10-17	1	-8/+10
\| \| \|	On boot (mostly reboot), my c8000 sometimes crashes after it prints the TLB flush threshold. The lockup is hard. The front LED flashes red and the box must be unplugged to reset the error.
\| \| \|	I noticed that when the crash occurs the TLB flush threshold is about one quarter what it is on a successful boot. If I disabled the calculation, the crash didn't occur. There also seemed to be a timing dependency affecting the crash. I finally realized that the flush_tlb_all() timing test runs just after the secondary CPUs are started. There seems to be a problem with running flush_tlb_all() too soon after the CPUs are started.
\| \| \|	The timing for the range test always seemed okay. So, I reversed the order of the two timing tests and I haven't had a crash at this point so far.
\| \| \|	I added a couple of information messages which I have left to help with diagnosis if the problem should appear on another machine.
\| \| \|	This version reduces the minimum TLB flush threshold to 16 KiB.
\| \| \|	Signed-off-by: John David Anglin <dave.anglin@bell.net>
\| \| \|	Signed-off-by: Helge Deller <deller@gmx.de>
\| * \|	parisc: remove the dead ccio-rm-dma driver	Christoph Hellwig	2018-10-17	2	-205/+0
\| \| \|	This driver has never been wired up during the life of the Linux git tree, and has severely bitrotted.
\| \| \|	Signed-off-by: Christoph Hellwig <hch@lst.de>
\| \| \|	Signed-off-by: Helge Deller <deller@gmx.de>
\| * \|	parisc: remove check for minimum required GCC version	Masahiro Yamada	2018-10-17	1	-9/+0
\| \| \|	Commit cafa0010cd51 ("Raise the minimum required gcc version to 4.6") bumped the minimum GCC version to 4.6 for all architectures.
\| \| \|	The version check in arch/parisc/Makefile is obsolete now.
\| \| \|	Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
\| \| \|	Signed-off-by: Helge Deller <deller@gmx.de>
\| * \|	parisc: Use PARISC_ITLB_TRAP constant in entry.S	Helge Deller	2018-10-17	1	-1/+1
\| \| \|	Fixes: 5b00ca0b8035 ("parisc: Restore possibility to execute 64-bit applications")
\| \| \|	Signed-off-by: Helge Deller <deller@gmx.de>
* \| \|	Merge branch 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm	Linus Torvalds	2018-10-23	18	-225/+203
\|\ \ \	Pull ARM updates from Russell King:
\| \| \| \|	"The main items in this pull request are the Spectre variant 1.1 fixes from Julien Thierry.
\| \| \| \|	A few other patches to improve various areas, and removal of some obsolete mcount bits and a redundant kbuild conditional"
\| \| \| \|	* 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
\| \| \| \|	  ARM: 8802/1: Call syscall_trace_exit even when system call skipped
\| \| \| \|	  ARM: 8797/1: spectre-v1.1: harden __copy_to_user
\| \| \| \|	  ARM: 8796/1: spectre-v1,v1.1: provide helpers for address sanitization
\| \| \| \|	  ARM: 8795/1: spectre-v1.1: use put_user() for __put_user()
\| \| \| \|	  ARM: 8794/1: uaccess: Prevent speculative use of the current addr_limit
\| \| \| \|	  ARM: 8793/1: signal: replace __put_user_error with __put_user
\| \| \| \|	  ARM: 8792/1: oabi-compat: copy oabi events using __copy_to_user()
\| \| \| \|	  ARM: 8791/1: vfp: use __copy_to_user() when saving VFP state
\| \| \| \|	  ARM: 8790/1: signal: always use __copy_to_user to save iwmmxt context
\| \| \| \|	  ARM: 8789/1: signal: copy registers using __copy_to_user()
\| \| \| \|	  ARM: 8801/1: makefile: use ARMv3M mode for RiscPC
\| \| \| \|	  ARM: 8800/1: use choice for kernel unwinders
\| \| \| \|	  ARM: 8798/1: remove unnecessary KBUILD_SRC ifeq conditional
\| \| \| \|	  ARM: 8788/1: ftrace: remove old mcount support
\| \| \| \|	  ARM: 8786/1: Debug kernel copy by printing
\| \| \ \
\| \| \ \
\| *-. \ \	Merge branches 'fixes', 'misc' and 'spectre' into for-next	Russell King	2018-10-10	18	-225/+203
\| \|\ \ \ \
\| \| \| * \| \|	ARM: 8797/1: spectre-v1.1: harden __copy_to_user	Julien Thierry	2018-10-05	2	-2/+7
\| \| \| \| \| \|	Sanitize user pointer given to __copy_to_user, both for standard version and memcopy version of the user accessor.
\| \| \| \| \| \|	Signed-off-by: Julien Thierry <julien.thierry@arm.com>
\| \| \| \| \| \|	Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
\| \| \| * \| \|	ARM: 8796/1: spectre-v1,v1.1: provide helpers for address sanitization	Julien Thierry	2018-10-05	3	-5/+38
\| \| \| \| \| \|	Introduce C and asm helpers to sanitize user address, taking the address range they target into account.
\| \| \| \| \| \|	Use asm helper for existing sanitization in __copy_from_user().
\| \| \| \| \| \|	Signed-off-by: Julien Thierry <julien.thierry@arm.com>
\| \| \| \| \| \|	Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
\| \| \| * \| \|	ARM: 8795/1: spectre-v1.1: use put_user() for __put_user()	Julien Thierry	2018-10-05	1	-6/+9
\| \| \| \| \| \|	When Spectre mitigation is required, __put_user() needs to include check_uaccess. This is already the case for put_user(), so just make __put_user() an alias of put_user().
\| \| \| \| \| \|	Signed-off-by: Julien Thierry <julien.thierry@arm.com>
\| \| \| \| \| \|	Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
\| \| \| * \| \|	ARM: 8794/1: uaccess: Prevent speculative use of the current addr_limit	Julien Thierry	2018-10-05	1	-0/+8
\| \| \| \| \| \|	A mispredicted conditional call to set_fs could result in the wrong addr_limit being forwarded under speculation to a subsequent access_ok check, potentially forming part of a spectre-v1 attack using uaccess routines.
\| \| \| \| \| \|	This patch prevents this forwarding from taking place, by putting heavy barriers in set_fs after writing the addr_limit.
\| \| \| \| \| \|	Porting commit c2f0ad4fc089cff8 ("arm64: uaccess: Prevent speculative use of the current addr_limit").
\| \| \| \| \| \|	Signed-off-by: Julien Thierry <julien.thierry@arm.com>
\| \| \| \| \| \|	Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
\| \| \| * \| \|	ARM: 8793/1: signal: replace __put_user_error with __put_user	Julien Thierry	2018-10-05	1	-4/+4
\| \| \| \| \| \|	With Spectre-v1.1 mitigations, __put_user_error is pointless. In an attempt to remove it, replace its references in frame setups with __put_user. Signed-off-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
\| \| \| * \| \|	ARM: 8792/1: oabi-compat: copy oabi events using __copy_to_user()	Julien Thierry	2018-10-05	1	-2/+6
\| \| \| \| \| \|	Copy events to user using __copy_to_user() rather than copying members individually with __put_user_error(). This has the benefit of disabling/enabling PAN once per event instead of once per event member. Signed-off-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
\| \| \| * \| \|	ARM: 8791/1: vfp: use __copy_to_user() when saving VFP state	Julien Thierry	2018-10-05	3	-20/+17
\| \| \| \| \| \|	Use __copy_to_user() rather than __put_user_error() for individual members when saving VFP state. This has the benefit of disabling/enabling PAN once per copied struct instead of once per write. Signed-off-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
\| \| \| * \| \|	ARM: 8790/1: signal: always use __copy_to_user to save iwmmxt context	Julien Thierry	2018-10-05	1	-4/+6
\| \| \| \| \| \|	When setting a dummy iwmmxt context, create a local instance and use __copy_to_user in both cases, whether iwmmxt is being used or not. This has the benefit of disabling/enabling PAN once for the whole copy instead of once per write. Signed-off-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
\| \| \| * \| \|	ARM: 8789/1: signal: copy registers using __copy_to_user()	Julien Thierry	2018-10-05	1	-22/+27
\| \| \| \| \| \|	When saving the ARM integer registers, use __copy_to_user() to copy them into the user signal frame, rather than __put_user_error(). This has the benefit of disabling/enabling PAN once for the whole copy instead of once per write. Signed-off-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
\| \| * \| \| \|	ARM: 8802/1: Call syscall_trace_exit even when system call skipped	Timothy E Baldwin	2018-10-10	1	-5/+4
\| \| \| \| \| \|	On at least x86 and ARM64, and as documented in the ptrace man page, a skipped system call will still cause a syscall exit ptrace stop. Before this commit, 32-bit ARM did not, resulting in strace being confused when seccomp skips system calls. This change also impacts programs that use ptrace to skip system calls. Fixes: ad75b51459ae ("ARM: 7579/1: arch/allow a scno of -1 to not cause a SIGILL") Signed-off-by: Timothy E Baldwin <T.E.Baldwin99@members.leeds.ac.uk> Signed-off-by: Eugene Syromyatnikov <evgsyr@gmail.com> Reviewed-by: Kees Cook <keescook@chromium.org> Tested-by: Kees Cook <keescook@chromium.org> Tested-by: Eugene Syromyatnikov <evgsyr@gmail.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
\| \| * \| \| \|	ARM: 8801/1: makefile: use ARMv3M mode for RiscPC	Jason A. Donenfeld	2018-10-04	1	-1/+1
\| \| \| \| \| \|	The purpose of CONFIG_CPU_32v3 is to avoid ldrh/strh on the RiscPC, which is pretty much an ARMv4 device, except its bus will choke on the half-words. The way to make the C compiler not output ldrh/strh is with -march=armv3, which doesn't support them in the ISA. However, this prevents certain cryptography code from working that uses instructions like umull. Fortunately there's also -march=armv3m that does support those, making it possible to continue assembling optimized cryptography routines for our beloved RiscPC. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
\| \| * \| \| \|	ARM: 8800/1: use choice for kernel unwinders	Stefan Agner	2018-10-04	2	-19/+31
\| \| \| \| \| \|	While in theory multiple unwinders could be compiled in, it does not make sense in practice. Use a choice to make the unwinder selection mutually exclusive and mandatory. Even before this commit, it was not possible to deselect FRAME_POINTER. Remove the obsolete comment. Furthermore, to produce a meaningful backtrace with FRAME_POINTER enabled, the kernel needs a specific function prologue: mov ip, sp; stmfd sp!, {fp, ip, lr, pc}; sub fp, ip, #4. To get the required prologue, gcc uses apcs and no-sched-prolog. These compiler options are not available on clang, and clang is not able to generate the required prologue. Make the FRAME_POINTER config symbol depend on !clang. Suggested-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Stefan Agner <stefan@agner.ch> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
\| \| * \| \| \|	ARM: 8798/1: remove unnecessary KBUILD_SRC ifeq conditional	Masahiro Yamada	2018-09-19	1	-4/+0
\| \| \| \| \| \|	You can always prefix machine/plat header search paths with $(srctree)/ because $(srctree) is '.' for in-tree building. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>