summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* arm64: errata: Pass --fix-cortex-a53-843419 to ld if workaround enabledWill Deacon2016-08-262-10/+13
| | | | | | | | | | | | | Cortex-A53 erratum 843419 is worked around by the linker, although it is a configure-time option to GCC as to whether ld is actually asked to apply the workaround or not. This patch ensures that we pass --fix-cortex-a53-843419 to the linker when both CONFIG_ARM64_ERRATUM_843419=y and the linker supports the option. Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
* Revert "arm64: hibernate: Refuse to hibernate if the boot cpu is offline"James Morse2016-08-261-26/+0
| | | | | | | | | | | | Now that we use the MPIDR to resume on the same CPU that we hibernated on, we no longer need to refuse to hibernate if the boot cpu is offline. (Which we can't possibly know if kexec causes logical CPUs to be renumbered). This reverts commit 1fe492ce6482b77807b25d29690a48c46456beee. Signed-off-by: James Morse <james.morse@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
* arm64: hibernate: Resume when hibernate image created on non-boot CPUJames Morse2016-08-262-0/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | disable_nonboot_cpus() assumes that the lowest numbered online CPU is the boot CPU, and that this is the correct CPU to run any power management code on. On arm64 CPU0 can be taken offline. For hibernate/resume this means we may hibernate on a CPU other than CPU0. If the system is rebooted with kexec 'CPU0' will be assigned to a different CPU. This complicates hibernate/resume as now we can't trust the CPU numbers. We currently forbid hibernate if CPU0 has been hotplugged out to avoid this situation without kexec. Save the MPIDR of the CPU we hibernated on in the hibernate arch-header, use hibernate_resume_nonboot_cpu_disable() to direct which CPU we should resume on based on the MPIDR of the CPU we hibernated on. This allows us to hibernate/resume on any CPU, even if the logical numbers have been shuffled by kexec. Signed-off-by: James Morse <james.morse@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
* cpu/hotplug: Allow suspend/resume CPU to be specifiedJames Morse2016-08-262-5/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | disable_nonboot_cpus() assumes that the lowest numbered online CPU is the boot CPU, and that this is the correct CPU to run any power management code on. On x86 this is always correct, as CPU0 cannot (easily) by taken offline. On arm64 CPU0 can be taken offline. For hibernate/resume this means we may hibernate on a CPU other than CPU0. If the system is rebooted with kexec 'CPU0' will be assigned to a different physical CPU. This complicates hibernate/resume as now we can't trust the CPU numbers. Arch code can find the correct physical CPU, and ensure it is online before resume from hibernate begins, but also needs to influence disable_nonboot_cpus()s choice of CPU. Rename disable_nonboot_cpus() as freeze_secondary_cpus() and add an argument indicating which CPU should be left standing. Follow the logic in migrate_to_reboot_cpu() to use the lowest numbered online CPU if the requested CPU is not online. Add disable_nonboot_cpus() as an inline function that has the existing behaviour. Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
* arm64: always enable DEBUG_RODATA and remove the Kconfig optionMark Rutland2016-08-263-11/+4
| | | | | | | | | | | | | | | | | | | | | | | | Follow the example set by x86 in commit 9ccaf77cf05915f5 ("x86/mm: Always enable CONFIG_DEBUG_RODATA and remove the Kconfig option"), and make these protections a fundamental security feature rather than an opt-in. This also results in a minor code simplification. For those rare cases when users wish to disable this protection (e.g. for debugging), this can be done by passing 'rodata=off' on the command line. As DEBUG_RODATA_ALIGN is only intended to address a performance/memory tradeoff, and does not affect correctness, this is left user-selectable. DEBUG_MODULE_RONX is also left user-selectable until the core code provides a boot-time option to disable the protection for debugging use-cases. Cc: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Laura Abbott <labbott@redhat.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
* arm64: mark reserved memblock regions explicitly in iomemAKASHI Takahiro2016-08-252-4/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Kdump(kexec-tools) parses /proc/iomem to identify all the memory regions on the system. Since the current kernel names "nomap" regions, like UEFI runtime services code/data, as "System RAM," kexec-tools sets up elf core header to include them in a crash dump file (/proc/vmcore). Then crash dump kernel parses UEFI memory map again, re-marks those regions as "nomap" and does not create a memory mapping for them unlike the other areas of System RAM. In this case, copying /proc/vmcore through copy_oldmem_page() on crash dump kernel will end up with a kernel abort, as reported in [1]. This patch names all the "nomap" regions explicitly as "reserved" so that we can exclude them from a crash dump file. acpi_os_ioremap() must also be modified because those regions have WB attributes [2]. Apart from kdump, this change also matches x86's use of acpi (and /proc/iomem). [1] http://lists.infradead.org/pipermail/linux-arm-kernel/2016-August/448186.html [2] http://lists.infradead.org/pipermail/linux-arm-kernel/2016-August/450089.html Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: James Morse <james.morse@arm.com> Reviewed-by: James Morse <james.morse@arm.com> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
* arm64: hibernate: Support DEBUG_PAGEALLOCJames Morse2016-08-254-11/+86
| | | | | | | | | | | | | | | | | | | | | | | | DEBUG_PAGEALLOC removes the valid bit of page table entries to prevent any access to unallocated memory. Hibernate uses this as a hint that those pages don't need to be saved/restored. This patch adds the kernel_page_present() function it uses. hibernate.c copies the resume kernel's linear map for use during restore. Add _copy_pte() to fill-in the holes made by DEBUG_PAGEALLOC in the resume kernel, so we can restore data the original kernel had at these addresses. Finally, DEBUG_PAGEALLOC means the linear-map alias of KERNEL_START to KERNEL_END may have holes in it, so we can't lazily clean this whole area to the PoC. Only clean the new mmuoff region, and the kernel/kvm idmaps. This reverts commit da24eb1f3f9e2c7b75c5f8c40d8e48e2c4789596. Reported-by: Will Deacon <will.deacon@arm.com> Signed-off-by: James Morse <james.morse@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
* arm64: vmlinux.ld: Add mmuoff data sections and move mmuoff text into idmapJames Morse2016-08-256-12/+42
| | | | | | | | | | | | | | | | | | | | | | | Resume from hibernate needs to clean any text executed by the kernel with the MMU off to the PoC. Collect these functions together into the .idmap.text section as all this code is tightly coupled and also needs the same cleaning after resume. Data is more complicated, secondary_holding_pen_release is written with the MMU on, clean and invalidated, then read with the MMU off. In contrast __boot_cpu_mode is written with the MMU off, the corresponding cache line is invalidated, so when we read it with the MMU on we don't get stale data. These cache maintenance operations conflict with each other if the values are within a Cache Writeback Granule (CWG) of each other. Collect the data into two sections .mmuoff.data.read and .mmuoff.data.write, the linker script ensures mmuoff.data.write section is aligned to the architectural maximum CWG of 2KB. Signed-off-by: James Morse <james.morse@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
* arm64: Create sections.hJames Morse2016-08-257-28/+35
| | | | | | | | | | | | | | | Each time new section markers are added, kernel/vmlinux.ld.S is updated, and new extern char __start_foo[] definitions are scattered through the tree. Create asm/include/sections.h to collect these definitions (and include the existing asm-generic version). Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
* arm64: Introduce execute-only page access permissionsCatalin Marinas2016-08-254-10/+15
| | | | | | | | | | | | | | | | | | | The ARMv8 architecture allows execute-only user permissions by clearing the PTE_UXN and PTE_USER bits. However, the kernel running on a CPU implementation without User Access Override (ARMv8.2 onwards) can still access such page, so execute-only page permission does not protect against read(2)/write(2) etc. accesses. Systems requiring such protection must enable features like SECCOMP. This patch changes the arm64 __P100 and __S100 protection_map[] macros to the new __PAGE_EXECONLY attributes. A side effect is that pte_user() no longer triggers for __PAGE_EXECONLY since PTE_USER isn't set. To work around this, the check is done on the PTE_NG bit via the pte_ng() macro. VM_READ is also checked now for page faults. Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
* arm64: kprobe: Always clear pstate.D in breakpoint exception handlerPratyush Anand2016-08-251-16/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Whenever we are hitting a kprobe from a none-kprobe debug exception handler, we hit an infinite occurrences of "Unexpected kernel single-step exception at EL1" PSTATE.D is debug exception mask bit. It is set whenever we enter into an exception mode. When it is set then Watchpoint, Breakpoint, and Software Step exceptions are masked. However, software Breakpoint Instruction exceptions can never be masked. Therefore, if we ever execute a BRK instruction, irrespective of D-bit setting, we will be receiving a corresponding breakpoint exception. For example: - We are executing kprobe pre/post handler, and kprobe has been inserted in one of the instruction of a function called by handler. So, it executes BRK instruction and we land into the case of KPROBE_REENTER. (This case is already handled by current code) - We are executing uprobe handler or any other BRK handler such as in WARN_ON (BRK BUG_BRK_IMM), and we trace that path using kprobe.So, we enter into kprobe breakpoint handler,from another BRK handler.(This case is not being handled currently) In all such cases kprobe breakpoint exception will be raised when we were already in debug exception mode. SPSR's D bit (bit 9) shows the value of PSTATE.D immediately before the exception was taken. So, in above example cases we would find it set in kprobe breakpoint handler. Single step exception will always be followed by a kprobe breakpoint exception.However, it will only be raised gracefully if we clear D bit while returning from breakpoint exception. If D bit is set then, it results into undefined exception and when it's handler enables dbg then single step exception is generated, however it will never be handled(because address does not match and therefore treated as unexpected). This patch clears D-flag unconditionally in setup_singlestep, so that we can always get single step exception correctly after returning from breakpoint exception. Additionally, it also removes D-flag set statement for KPROBE_REENTER return path, because debug exception for KPROBE_REENTER will always take place in a debug exception state. So, D-flag will already be set in this case. Acked-by: Sandeepa Prabhu <sandeepa.s.prabhu@gmail.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Pratyush Anand <panand@redhat.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
* arm64: head.S: get rid of x25 and x26 with 'global' scopeArd Biesheuvel2016-08-222-17/+13
| | | | | | | | | | | | | | | | Currently, x25 and x26 hold the physical addresses of idmap_pg_dir and swapper_pg_dir, respectively, when running early boot code. But having registers with 'global' scope in files that contain different sections with different lifetimes, and that are called by different CPUs at different times is a bit messy, especially since stashing the values does not buy us anything in terms of code size or clarity. So simply replace each reference to x25 or x26 with an adrp instruction referring to idmap_pg_dir or swapper_pg_dir directly. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
* arm64: apply __ro_after_init to some objectsJisheng Zhang2016-08-226-21/+26
| | | | | | | | | | | | | | | | | | | | These objects are set during initialization, thereafter are read only. Previously I only want to mark vdso_pages, vdso_spec, vectors_page and cpu_ops as __read_mostly from performance point of view. Then inspired by Kees's patch[1] to apply more __ro_after_init for arm, I think it's better to mark them as __ro_after_init. What's more, I find some more objects are also read only after init. So apply __ro_after_init to all of them. This patch also removes global vdso_pagelist and tries to clean up vdso_spec[] assignment code. [1] http://www.spinics.net/lists/arm-kernel/msg523188.html Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Jisheng Zhang <jszhang@marvell.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
* arm64: vdso: constify vm_special_mapping used for aarch32 vectors pageJisheng Zhang2016-08-221-1/+1
| | | | | | | | | The vm_special_mapping spec which is used for aarch32 vectors page is never modified, so mark it as const. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Jisheng Zhang <jszhang@marvell.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
* arm64: vdso: add __init section marker to alloc_vectors_pageJisheng Zhang2016-08-221-1/+1
| | | | | | | | | It is not needed after booting, this patch moves the alloc_vectors_page function to the __init section. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Jisheng Zhang <jszhang@marvell.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
* arm64: remove redundant "select HAVE_CLK"Masahiro Yamada2016-08-221-1/+0
| | | | | | | | | HAVE_CLK is select'ed by CLKDEV_LOOKUP, which is select'ed by COMMON_CLK, which is select'ed by ARM64. No sub-architecture needs to select HAVE_CLK explicitly. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
* arm64: remove traces of perf_ops_bpMark Rutland2016-08-221-2/+0
| | | | | | | | | | | | | Even though perf_ops_bp was removed/renamed back in commit b0a873ebbf87bf38 ("perf: Register PMU implementations"), as part of v2.6.37, its definition still lives on in some arch headers. This patch removes the vestigal definition from arm64. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
* arm64: perf: Use the builtin_platform_driverKefeng Wang2016-08-221-5/+1
| | | | | | | Use the builtin_platform_driver() to simplify code. Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
* arm64: mm: convert __dma_* routines to use start, sizeKwangwoo Lee2016-08-223-47/+44
| | | | | | | | | | | | | | | | | | | | | | | | __dma_* routines have been converted to use start and size instread of start and end addresses. The patch was origianlly for adding __clean_dcache_area_poc() which will be used in pmem driver to clean dcache to the PoC(Point of Coherency) in arch_wb_cache_pmem(). The functionality of __clean_dcache_area_poc() was equivalent to __dma_clean_range(). The difference was __dma_clean_range() uses the end address, but __clean_dcache_area_poc() uses the size to clean. Thus, __clean_dcache_area_poc() has been revised with a fallthrough function of __dma_clean_range() after the change that __dma_* routines use start and size instead of using start and end. As a consequence of using start and size, the name of __dma_* routines has also been altered following the terminology below: area: takes a start and size range: takes a start and end Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Kwangwoo Lee <kwangwoo.lee@sk.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
* arm64: factor work_pending state machine to CChris Metcalf2016-08-222-18/+30
| | | | | | | | | | | | | | | | | | | Currently ret_fast_syscall, work_pending, and ret_to_user form an ad-hoc state machine that can be difficult to reason about due to duplicated code and a large number of branch targets. This patch factors the common logic out into the existing do_notify_resume function, converting the code to C in the process, making the code more legible. This patch tries to closely mirror the existing behaviour while using the usual C control flow primitives. As local_irq_{disable,enable} may be instrumented, we balance exception entry (where we will almost most likely enable IRQs) with a call to trace_hardirqs_on just before the return to userspace. Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
* arm64: hibernate: reduce TLB maintenance scopeMark Rutland2016-08-221-2/+2
| | | | | | | | | | | | | | | | | | | | In break_before_make_ttbr_switch we perform broadcast TLB maintenance for the inner shareable domain, and use a DSB ISH to complete this. However, at the point we execute this, secondary CPUs are either physically offline, or executing code outside of the kernel. Upon entering the kernel, secondary CPUs will invalidate their TLBs before enabling their MMUs. Thus we do not need to invalidate TLBs of other CPUs, and as with idmap_cpu_replace_ttbr1 we can reduce the scope of maintenance to the TLBs of the local CPU. This keeps our TLB maintenance code consistent, and is a minor optimisation. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: James Morse <james.morse@arm.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
* Linux 4.8-rc3v4.8-rc3Linus Torvalds2016-08-221-1/+1
|
* Merge branch 'parisc-4.8-2' of ↵Linus Torvalds2016-08-213-22/+2
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull two parisc fixes from Helge Deller: "The first patch ensures that the high-res cr16 clocksource (which was added in kernel 4.7) gets choosen as default clocksource for parisc. The second patch moves the #define of EREFUSED down inside errno.h and thus unbreaks building the gccgo compiler" * 'parisc-4.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: Fix order of EREFUSED define in errno.h parisc: Fix automatic selection of cr16 clocksource
| * parisc: Fix order of EREFUSED define in errno.hHelge Deller2016-08-201-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When building gccgo in userspace, errno.h gets parsed and the go include file sysinfo.go is generated. Since EREFUSED is defined to the same value as ECONNREFUSED, and ECONNREFUSED is defined later on in errno.h, this leads to go complaining that EREFUSED isn't defined yet. Fix this trivial problem by moving the define of EREFUSED down after ECONNREFUSED in errno.h (and clean up the indenting while touching this line). Signed-off-by: Helge Deller <deller@gmx.de> Cc: stable@vger.kernel.org
| * parisc: Fix automatic selection of cr16 clocksourceHelge Deller2016-08-202-20/+0
| | | | | | | | | | | | | | | | | | | | | | | | Commit 54b66800907 (parisc: Add native high-resolution sched_clock() implementation) added support to use the CPU-internal cr16 counters as reliable clocksource with the help of HAVE_UNSTABLE_SCHED_CLOCK. Sadly the commit missed to remove the hack which prevented cr16 to become the default clocksource even on SMP systems. Signed-off-by: Helge Deller <deller@gmx.de> Cc: stable@vger.kernel.org # 4.7+
* | EDAC, skx_edac: Add EDAC driver for SkylakeTony Luck2016-08-214-0/+1136
|/ | | | | | | | | | | | | | | | | | | This is an entirely new driver instead of yet another set of patches to sb_edac.c because: 1) Mapping from PCI devices to socket/memory controller is significantly different. Skylake scatters devices on a socket across a number of PCI buses. 2) There is an extra level of interleaving via the "mcroute" register that would be a little messy to squeeze into the old driver. 3) Validation is getting too expensive. Changes to sb_edac need to be checked against Sandy Bridge, Ivy Bridge, Haswell, Broadwell and Knights Landing. Acked-by: Aristeu Rozanski <aris@redhat.com> Acked-by: Borislav Petkov <bp@suse.de> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Make the hardened user-copy code depend on having a hardened allocatorLinus Torvalds2016-08-191-0/+1
| | | | | | | | | | | | | | | | | | | The kernel test robot reported a usercopy failure in the new hardened sanity checks, due to a page-crossing copy of the FPU state into the task structure. This happened because the kernel test robot was testing with SLOB, which doesn't actually do the required book-keeping for slab allocations, and as a result the hardening code didn't realize that the task struct allocation was one single allocation - and the sanity checks fail. Since SLOB doesn't even claim to support hardening (and you really shouldn't use it), the straightforward solution is to just make the usercopy hardening code depend on the allocator supporting it. Reported-by: kernel test robot <xiaolong.ye@intel.com> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'i2c/for-current' of ↵Linus Torvalds2016-08-198-22/+34
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "I2C has some pretty standard driver bugfixes and one minor cleanup" * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: meson: Use complete() instead of complete_all() i2c: brcmstb: Use complete() instead of complete_all() i2c: bcm-kona: Use complete() instead of complete_all() i2c: bcm-iproc: Use complete() instead of complete_all() i2c: at91: fix support of the "alternative command" feature i2c: ocores: add missed clk_disable_unprepare() on failure paths i2c: cros-ec-tunnel: Fix usage of cros_ec_cmd_xfer() i2c: mux: demux-pinctrl: properly roll back when adding adapter fails
| * i2c: meson: Use complete() instead of complete_all()Daniel Wagner2016-08-151-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is only one waiter for the completion, therefore there is no need to use complete_all(). Let's make that clear by using complete() instead of complete_all(). The usage pattern of the completion is: meson_i2c_xfer_msg() reinit_completion() ... /* Start the transfer */ ... wait_for_completion_timeout() Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
| * i2c: brcmstb: Use complete() instead of complete_all()Daniel Wagner2016-08-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is only one waiter for the completion, therefore there is no need to use complete_all(). Let's make that clear by using complete() instead of complete_all(). The usage pattern of the completion is: brcmstb_send_i2c_cmd() reinit_completion() ... /* initiate transfer by setting iic_enable */ ... brcmstb_i2c_wait_for_completion() Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de> Reviewed-by: Kamal Dasu <kdasu.kdev@gmail.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
| * i2c: bcm-kona: Use complete() instead of complete_all()Daniel Wagner2016-08-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is only one waiter for the completion, therefore there is no need to use complete_all(). Let's make that clear by using complete() instead of complete_all(). The usage pattern of the completion is: bcm_kona_send_i2c_cmd() reinit_completion() ... bcm_kona_i2c_send_cmd_to_ctrl() ... wait_for_completion_timeout() Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de> Acked-by: Ray Jui <ray.jui@broadcom.com> Reviewed-by: Tim Kryger <tim.kryger@gmail.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
| * i2c: bcm-iproc: Use complete() instead of complete_all()Daniel Wagner2016-08-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is only one waiter for the completion, therefore there is no need to use complete_all(). Let's make that clear by using complete() instead of complete_all(). The usage pattern of the completion is: bcm_iproc_i2c_xfer_single_msg() reinit_completion() ... (activate the transfer) ... wait_for_completion_timeout() Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de> Acked-by: Ray Jui <ray.jui@broadcom.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
| * i2c: at91: fix support of the "alternative command" featureCyrille Pitchen2016-08-151-10/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The "alternative command" feature was introduced with sama5d2 SoCs. Its purpose is to let the hardware i2c controller automatically send the STOP condition on the i2c bus at the end of a data transfer. Without this feature, the i2c driver has to write the 'STOP' bit into the Control Register so the hardware i2c controller is triggered to send the STOP condition on the bus. Using the "alternative command" feature requires to set the transfer data length into the 8bit DATAL field of the Alternative Command Register. Hence only data transfers up to 255 bytes can take advantage of the "alternative command" feature. For greater data transfer sizes, the driver should use the previous implementation, when the "alternative command" support was not implemented yet. Signed-off-by: Cyrille Pitchen <cyrille.pitchen@atmel.com> Signed-off-by: Ludovic Desroches <ludovic.desroches@atmel.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
| * i2c: ocores: add missed clk_disable_unprepare() on failure pathsAlexey Khoroshilov2016-08-151-4/+10
| | | | | | | | | | | | | | | | | | | | clk_disable_unprepare() is missed on failure paths in ocores_i2c_probe(). Found by Linux Driver Verification project (linuxtesting.org). Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Acked-by: Peter Korsgaard <peter@korsgaard.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
| * i2c: cros-ec-tunnel: Fix usage of cros_ec_cmd_xfer()Brian Norris2016-08-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | cros_ec_cmd_xfer returns success status if the command transport completes successfully, but the execution result is incorrectly ignored. In many cases, the execution result is assumed to be successful, leading to ignored errors and operating on uninitialized data. We've recently introduced the cros_ec_cmd_xfer_status() helper to avoid these problems. Let's use it. [Regarding the 'Fixes' tag; there is significant refactoring since the driver's introduction, but the underlying logical error exists throughout I believe] Fixes: 9d230c9e4f4e ("i2c: ChromeOS EC tunnel driver") Cc: <stable@vger.kernel.org> # 9798ac6d32c1 mfd: cros_ec: Add cros_ec_cmd_xfer_status() helper Signed-off-by: Brian Norris <briannorris@chromium.org> Reviewed-by: Javier Martinez Canillas <javier@osg.samsung.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
| * i2c: mux: demux-pinctrl: properly roll back when adding adapter failsWolfram Sang2016-08-151-1/+3
| | | | | | | | | | | | | | | | | | | | We also need to revert the dynamic OF change, so we get a consistent state again. Otherwise, we might have two devices enabled e.g. after pinctrl setup fails. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Cc: stable@kernel.org
* | Merge tag 'dm-4.8-fixes-2' of ↵Linus Torvalds2016-08-193-38/+53
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fixes from Mike Snitzer: - a stable fix for DM round robin multipath path selector to disable preemption before using this_cpu_ptr() - a slight increase in DM crypt's mempool reserves to make swap ontop of DM crypt more performant - a few DM raid fixes to issues found while testing changes that were merged in v4.8-rc1 * tag 'dm-4.8-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm raid: support raid0 with missing metadata devices dm raid: enhance attempt_restore_of_faulty_devices() to support more devices dm raid: fix restoring of failed devices regression dm raid: fix frozen recovery regression dm crypt: increase mempool reserve to better support swapping dm round robin: do not use this_cpu_ptr() without having preemption disabled
| * | dm raid: support raid0 with missing metadata devicesHeinz Mauelshagen2016-08-171-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The raid0 MD personality does not start a raid0 array with any of its data devices missing. dm-raid was removing data/metadata device pairs unconditionally if it failed to read a superblock off the respective metadata device of such pair, resulting in failure to start arrays with the raid0 personality. Avoid removing any data/metadata device pairs in case of raid0 (e.g. lvm2 segment type 'raid0_meta') thus allowing MD to start the array. Also, avoid region size validation for raid0. Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
| * | dm raid: enhance attempt_restore_of_faulty_devices() to support more devicesHeinz Mauelshagen2016-08-161-8/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | attempt_restore_of_faulty_devices() is limited to 64 when it should support the new maximum of 253 when identifying any failed devices. It clears any revivable devices via an MD personality hot remove and add cylce to allow for their recovery. Address by using existing functions to retrieve and update all failed devices' bitfield members in the dm raid superblocks on all RAID devices and check for any devices to clear in it. Whilst on it, don't call attempt_restore_of_faulty_devices() for any MD personality not providing disk hot add/remove methods (i.e. raid0 now), because such personalities don't support reviving of failed disks. Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
| * | dm raid: fix restoring of failed devices regressionHeinz Mauelshagen2016-08-161-22/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 'lvchange --refresh RaidLV' causes a mapped device suspend/resume cycle aiming at device restore and resync after transient device failures. This failed because flag RT_FLAG_RS_RESUMED was always cleared in the suspend path, thus the device restore wasn't performed in the resume path. Solve by removing RT_FLAG_RS_RESUMED from the suspend path and resume unconditionally. Also, remove superfluous comment from raid_resume(). Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
| * | dm raid: fix frozen recovery regressionHeinz Mauelshagen2016-08-161-6/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On LVM2 conversions via lvconvert(8), the target keeps mapped devices in frozen state when requesting RAID devices be resynchronized. This applies to e.g. adding legs to a raid1 device or taking over from raid0 to raid4 when the rebuild flag's set on the new raid1 legs or the added dedicated parity stripe. Also, fix frozen recovery for reshaping as well. Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
| * | dm crypt: increase mempool reserve to better support swappingMikulas Patocka2016-08-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Increase mempool size from 16 to 64 entries. This increase improves swap on dm-crypt performance. When swapping to dm-crypt, all available memory is temporarily exhausted and dm-crypt can only use the mempool reserve. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
| * | dm round robin: do not use this_cpu_ptr() without having preemption disabledMike Snitzer2016-08-151-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use local_irq_save() to disable preemption before calling this_cpu_ptr(). Reported-by: Benjamin Block <bblock@linux.vnet.ibm.com> Fixes: b0b477c7e0dd ("dm round robin: use percpu 'repeat_count' and 'current_path'") Cc: stable@vger.kernel.org # 4.6+ Suggested-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
* | | Merge tag 'scsi-fixes' of ↵Linus Torvalds2016-08-196-19/+29
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Six fairly small fixes. The ipr, mpt3sas and ses ones all trigger oopses. The megaraid one fixes an attach failure on io mapped only cards, the fcoe one is an obvious problem in the error path and the aacraid one is a theoretical security issue (ability to trick the kernel into a buffer overrun)" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: ses: Fix racy cleanup of /sys in remove_dev() mpt3sas: Fix resume on WarpDrive flash cards ipr: Fix sync scsi scan megaraid_sas: Fix probing cards without io port aacraid: Check size values after double-fetch from user fcoe: Use kfree_skb() instead of kfree()
| * \ \ Merge remote-tracking branch 'mkp-scsi/4.8/scsi-fixes' into fixesJames Bottomley2016-08-137-24/+35
| |\ \ \ | | |_|/ | |/| |
| | * | ses: Fix racy cleanup of /sys in remove_dev()Calvin Owens2016-08-121-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently we free the resources backing the enclosure device before we call device_unregister(). This is racy: during rmmod of low-level SCSI drivers that hook into enclosure, we end up with a small window of time during which writing to /sys can OOPS. Example trace with mpt3sas: general protection fault: 0000 [#1] SMP KASAN Modules linked in: mpt3sas(-) <...> RIP: [<ffffffffa0388a98>] ses_get_page2_descriptor.isra.6+0x38/0x220 [ses] Call Trace: [<ffffffffa0389d14>] ses_set_fault+0xf4/0x400 [ses] [<ffffffffa0361069>] set_component_fault+0xa9/0xf0 [enclosure] [<ffffffff8205bffc>] dev_attr_store+0x3c/0x70 [<ffffffff81677df5>] sysfs_kf_write+0x115/0x180 [<ffffffff81675725>] kernfs_fop_write+0x275/0x3a0 [<ffffffff8151f810>] __vfs_write+0xe0/0x3e0 [<ffffffff8152281f>] vfs_write+0x13f/0x4a0 [<ffffffff81526731>] SyS_write+0x111/0x230 [<ffffffff828b401b>] entry_SYSCALL_64_fastpath+0x13/0x94 Fortunately the solution is extremely simple: call device_unregister() before we free the resources, and the race no longer exists. The driver core holds a reference over ->remove_dev(), so AFAICT this is safe. Signed-off-by: Calvin Owens <calvinowens@fb.com> Reviewed-by: James Bottomley <jejb@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| | * | mpt3sas: Fix resume on WarpDrive flash cardsGreg Edwards2016-08-121-11/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | mpt3sas crashes on resume after suspend with WarpDrive flash cards. The reply_post_host_index array is not set back up after the resume, and we deference a stale pointer in _base_interrupt(). [ 47.309711] BUG: unable to handle kernel paging request at ffffc90001f8006c [ 47.318289] IP: [<ffffffffc00863ef>] _base_interrupt+0x49f/0xa30 [mpt3sas] [ 47.326749] PGD 41ccaa067 PUD 41ccab067 PMD 3466c067 PTE 0 [ 47.333848] Oops: 0002 [#1] SMP ... [ 47.452708] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.7.0 #6 [ 47.460506] Hardware name: Dell Inc. OptiPlex 990/06D7TR, BIOS A18 09/24/2013 [ 47.469629] task: ffffffff81c0d500 ti: ffffffff81c00000 task.ti: ffffffff81c00000 [ 47.479112] RIP: 0010:[<ffffffffc00863ef>] [<ffffffffc00863ef>] _base_interrupt+0x49f/0xa30 [mpt3sas] [ 47.490466] RSP: 0018:ffff88041d203e30 EFLAGS: 00010002 [ 47.497801] RAX: 0000000000000001 RBX: ffff880033f4c000 RCX: 0000000000000001 [ 47.506973] RDX: ffffc90001f8006c RSI: 0000000000000082 RDI: 0000000000000082 [ 47.516141] RBP: ffff88041d203eb0 R08: ffff8804118e2820 R09: 0000000000000001 [ 47.525300] R10: 0000000000000001 R11: 00000000100c0000 R12: 0000000000000000 [ 47.534457] R13: ffff880412c487e0 R14: ffff88041a8987d8 R15: 0000000000000001 [ 47.543632] FS: 0000000000000000(0000) GS:ffff88041d200000(0000) knlGS:0000000000000000 [ 47.553796] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 47.561632] CR2: ffffc90001f8006c CR3: 0000000001c06000 CR4: 00000000000406f0 [ 47.570883] Stack: [ 47.575015] 000000001d211228 ffff88041d2100c0 ffff8800c47d8130 0000000000000100 [ 47.584625] ffff8804100c0000 100c000000000000 ffff88041a8992a0 ffff88041a8987f8 [ 47.594230] ffff88041d203e00 ffffffff81111e55 000000000000038c ffff880414ad4280 [ 47.603862] Call Trace: [ 47.608474] <IRQ> [ 47.610413] [<ffffffff81111e55>] ? call_timer_fn+0x35/0x120 [ 47.620539] [<ffffffff81100a1f>] handle_irq_event_percpu+0x7f/0x1c0 [ 47.629061] [<ffffffff81100b8c>] handle_irq_event+0x2c/0x50 [ 47.636859] [<ffffffff81103fff>] handle_edge_irq+0x6f/0x130 [ 47.644654] [<ffffffff8102fbf3>] handle_irq+0x73/0x120 [ 47.652011] [<ffffffff810c6ada>] ? atomic_notifier_call_chain+0x1a/0x20 [ 47.660854] [<ffffffff817e374b>] do_IRQ+0x4b/0xd0 [ 47.667777] [<ffffffff817e160c>] common_interrupt+0x8c/0x8c [ 47.675635] <EOI> Move the reply_post_host_index array setup into mpt3sas_base_map_resources(), which is also in the resume path. Cc: stable@vger.kernel.org Signed-off-by: Greg Edwards <gedwards@fireweed.org> Acked-by: Chaitra P B <chaitra.basappa@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| | * | ipr: Fix sync scsi scanBrian King2016-08-111-5/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit b195d5e2bffd ("ipr: Wait to do async scan until scsi host is initialized") fixed async scan for ipr, but broke sync scan. This fixes sync scan back up. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Tested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| | * | megaraid_sas: Fix probing cards without io portYinghai Lu2016-08-112-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Found one megaraid_sas HBA probe fails, [ 187.235190] scsi host2: Avago SAS based MegaRAID driver [ 191.112365] megaraid_sas 0000:89:00.0: BAR 0: can't reserve [io 0x0000-0x00ff] [ 191.120548] megaraid_sas 0000:89:00.0: IO memory region busy! and the card has resource like, [ 125.097714] pci 0000:89:00.0: [1000:005d] type 00 class 0x010400 [ 125.104446] pci 0000:89:00.0: reg 0x10: [io 0x0000-0x00ff] [ 125.110686] pci 0000:89:00.0: reg 0x14: [mem 0xce400000-0xce40ffff 64bit] [ 125.118286] pci 0000:89:00.0: reg 0x1c: [mem 0xce300000-0xce3fffff 64bit] [ 125.125891] pci 0000:89:00.0: reg 0x30: [mem 0xce200000-0xce2fffff pref] that does not io port resource allocated from BIOS, and kernel can not assign one as io port shortage. The driver is only looking for MEM, and should not fail. It turns out megasas_init_fw() etc are using bar index as mask. index 1 is used as mask 1, so that pci_request_selected_regions() is trying to request BAR0 instead of BAR1. Fix all related reference. Fixes: b6d5d8808b4c ("megaraid_sas: Use lowest memory bar for SR-IOV VF support") Signed-off-by: Yinghai Lu <yinghai@kernel.org> Acked-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| | * | aacraid: Check size values after double-fetch from userDave Carroll2016-08-091-2/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In aacraid's ioctl_send_fib() we do two fetches from userspace, one the get the fib header's size and one for the fib itself. Later we use the size field from the second fetch to further process the fib. If for some reason the size from the second fetch is different than from the first fix, we may encounter an out-of- bounds access in aac_fib_send(). We also check the sender size to insure it is not out of bounds. This was reported in https://bugzilla.kernel.org/show_bug.cgi?id=116751 and was assigned CVE-2016-6480. Reported-by: Pengfei Wang <wpengfeinudt@gmail.com> Fixes: 7c00ffa31 '[SCSI] 2.6 aacraid: Variable FIB size (updated patch)' Cc: stable@vger.kernel.org Signed-off-by: Dave Carroll <david.carroll@microsemi.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>