summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'irqchip-4.17' of ↵Thomas Gleixner2018-03-2916-142/+956
|\ | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core Pull irqchip updates for 4.17 from Marc Zyngier: - New Qualcomm PDC irqchip - New Microsemi Ocelot irqchip - Suspend/resume support for some oddball GICv3 irqchip - Better GIC/GICv3 support for kexec - Various cleanups and fixes
| * irqchip/gic: Take lock when updating irq typeAniruddha Banerjee2018-03-291-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Most MMIO GIC register accesses use a 1-hot bit scheme that avoids requiring any form of locking. This isn't true for the GICD_ICFGRn registers, which require a RMW sequence. Unfortunately, we seem to be missing a lock for these particular accesses, which could result in a race condition if changing the trigger type on any two interrupts within the same set of 16 interrupts (and thus controlled by the same CFGR register). Introduce a private lock in the GIC common comde for this particular case, making it cover both GIC implementations in one go. Cc: stable@vger.kernel.org Signed-off-by: Aniruddha Banerjee <aniruddhab@nvidia.com> [maz: updated changelog] Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| * irqchip/gic: Update supports_deactivate static key to modern apiDavidlohr Bueso2018-03-282-21/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | No changes in semantics -- key init is true; replace static_key_slow_dec with static_branch_disable static_key_true with static_branch_likely The first is because we never actually do any couterpart incs, thus there is really no reference counting semantics going on. Use the more proper static_branch_disable() construct. Also added a '_key' suffix to supports_deactivate, for better self documentation. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jason Cooper <jason@lakedaemon.net> Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| * irqchip/gic-v3: Ensure GICR_CTLR.EnableLPI=0 is observed before enablingShanker Donthineni2018-03-232-14/+61
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Booting with GICR_CTLR.EnableLPI=1 is usually a bad idea, and may result in subtle memory corruption. Detecting this is thus pretty important. On detecting that LPIs are still enabled, we taint the kernel (because we're not sure of anything anymore), and try to disable LPIs. This can fail, as implementations are allowed to implement GICR_CTLR.EnableLPI as a one-way enable, meaning the redistributors cannot be reprogrammed with new tables. Should this happen, we fail probing the redistributor and warn the user that things are pretty dire. Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org> [maz: reworded changelog, minor comment and message changes] Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| * irqchip: Add a driver for the Microsemi Ocelot controllerAlexandre Belloni2018-03-223-0/+124
| | | | | | | | | | | | | | | | The Microsemi Ocelot SoC has a pretty simple IRQ controller in its ICPU block. Add a driver for it. Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| * dt-bindings: interrupt-controller: Add binding for the Microsemi Ocelot ↵Alexandre Belloni2018-03-221-0/+22
| | | | | | | | | | | | | | | | | | | | | | | | interrupt controller Add the Device Tree binding documentation for the Microsemi Ocelot interrupt controller that is part of the ICPU. It is connected directly to the MIPS core interrupt controller. Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| * irqchip/gic-v3: Probe for SCR_EL3 being clear before resetting AP0RnMarc Zyngier2018-03-223-16/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We would like to reset the Group-0 Active Priority Registers at boot time if they are available to us. They would be available if SCR_EL3.FIQ was not set, but we cannot directly probe this bit, and short of checking, we may end-up trapping to EL3, and the firmware may not be please to get such an exception. Yes, this is dumb. Instead, let's use PMR to find out if its value gets affected by SCR_EL3.FIQ being set. We use the fact that when SCR_EL3.FIQ is set, the LSB of the priority is lost due to the shifting back and forth of the actual priority. If we read back a 0, we know that Group0 is unavailable. In case we read a non-zero value, we can safely reset the AP0Rn register. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| * irqchip/gic-v3: Don't try to reset AP0RnMarc Zyngier2018-03-201-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Clearing AP0Rn has created a number of regressions, due to systems that have SCR_EL3.FIQ set. Even when addressing some obvious bugs, GIC500 platforms seem to act bizarrely (we are supposed to have 5 bits of priority, but PMR seems to behave as if we had 6...). Drop the AP0Rn reset for the time being, it is unlikely to have any effect if kexec-ing. Fixes: d6062a6d62c6 irqchip/gic-v3: Reset APgRn registers at boot time Reported-by: Shawn Lin <shawn.lin@rock-chips.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| * irqchip/gic-v3: Do not check trigger configuration of partitionned LPIsMarc Zyngier2018-03-201-3/+10
| | | | | | | | | | | | | | | | | | We cannot know the trigger of partitionned PPIs ahead of time (when we instanciate the partitions), so let's not check them early. Reported-by: JeffyChen <jeffy.chen@rock-chips.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| * irqchip/gic-v3: Loudly complain about the use of IRQ_TYPE_NONEMarc Zyngier2018-03-161-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | There is a huge number of broken device trees out there. Just grepping through the tree for the use of IRQ_TYPE_NONE in conjunction with the GIC is scary. People just don't realise that IRQ_TYPE_NONE just doesn't exist, and you just get whatever junk was there before. So let's make them aware of the issue. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| * irqchip/gic: Loudly complain about the use of IRQ_TYPE_NONEMarc Zyngier2018-03-161-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | There is a huge number of broken device trees out there. Just grepping through the tree for the use of IRQ_TYPE_NONE in conjunction with the GIC is scary. People just don't realise that IRQ_TYPE_NONE just doesn't exist, and you just get whatever junk was there before. So let's make them aware of the issue. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| * irqchip/gic-v3-its: Add ability to resend MAPC on resumeDerek Basehore2018-03-141-38/+48
| | | | | | | | | | | | | | | | | | | | | | This adds functionality to resend the MAPC command to an ITS node on resume. If the ITS is powered down during suspend and the collections are not backed by memory, the ITS will lose that state. This just sets up the known state for the collections after the ITS is restored. Signed-off-by: Derek Basehore <dbasehore@chromium.org> Reviewed-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| * irqchip/gic-v3-its: Add ability to save/restore ITS stateDerek Basehore2018-03-142-1/+109
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some platforms power off GIC logic in suspend, so we need to save/restore state. The distributor and redistributor registers need to be handled in firmware code due to access permissions on those registers, but the ITS registers can be restored in the kernel. We limit this to systems where the ITS collections are implemented in HW (as opposed to being backed by memory tables), as they are the only ones that cannot be dealt with by the firmware. Signed-off-by: Derek Basehore <dbasehore@chromium.org> [maz: fixed changelog, dropped DT property, limited to HCC being >0] Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| * irqchip/gic-v3: Allow LPIs to be disabled from the command lineMarc Zyngier2018-03-142-1/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For most GICv3 implementations, enabling LPIs is a one way switch. Once they're on, there is no turning back, which completely kills kexec (pending tables will always be live, and we can't tell the secondary kernel where they are). This is really annoying if you plan to use Linux as a bootloader, as it pretty much guarantees that the secondary kernel won't be able to use MSIs, and may even see some memory corruption. Bad. A workaround for this unfortunate situation is to allow the kernel not to enable LPIs, even if the feature is present in the HW. This would allow Linux-as-a-bootloader to leave LPIs alone, and let the secondary kernel to do whatever it wants with them. Let's introduce a boolean "irqchip.gicv3_nolpi" command line option that serves that purpose. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| * irqchip/gic-v3: Reset APgRn registers at boot timeMarc Zyngier2018-03-142-10/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Booting a crash kernel while in an interrupt handler is likely to leave the Active Priority Registers with some state that is not relevant to the new kernel, and is likely to lead to erratic behaviours such as interrupts not firing as their priority is already active. As a sanity measure, wipe the APRs clean on startup. We make sure to wipe both group 0 and 1 registers in order to avoid any surprise. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| * irqchip/gic-v2: Reset APRn registers at boot timeMarc Zyngier2018-03-141-6/+11
| | | | | | | | | | | | | | | | | | | | | | | | Booting a crash kernel while in an interrupt handler is likely to leave the Active Priority Registers with some state that is not relevant to the new kernel, and is likely to lead to erratic behaviours such as interrupts not firing as their priority is already active. As a sanity measure, wipe the APRs clean on startup. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| * dt-bindings/interrupt-controller: pdc: Describe PDC device bindingArchana Sathyakumar2018-03-141-0/+78
| | | | | | | | | | | | | | | | | | | | | | | | | | Add device binding documentation for the PDC Interrupt controller on QCOM SoC's like the SDM845. The interrupt-controller can be used to sense edge low interrupts and wakeup interrupts when the GIC is non-operational. Cc: devicetree@vger.kernel.org Signed-off-by: Archana Sathyakumar <asathyak@codeaurora.org> Signed-off-by: Lina Iyer <ilina@codeaurora.org> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| * irqchip/pdc: Add PDC interrupt controller for QCOM SoCsArchana Sathyakumar2018-03-143-0/+321
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The Power Domain Controller (PDC) on QTI SoCs like SDM845 houses an interrupt controller along with other domain control functions to handle interrupt related functions like handle falling edge or active low which are not detected at the GIC and handle wakeup interrupts. The interrupt controller is on an always-on domain for the purpose of waking up the processor. Only a subset of the processor's interrupts are routed through the PDC to the GIC. The PDC powers on the processors' domain, when in low power mode and replays pending interrupts so the GIC may wake up the processor. Signed-off-by: Archana Sathyakumar <asathyak@codeaurora.org> Signed-off-by: Lina Iyer <ilina@codeaurora.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| * irqchip/renesas-irqc: Use wakeup_path i.s.o. explicit clock handlingGeert Uytterhoeven2018-03-141-14/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since commit 6f46aedb9c85873b ("irqchip: renesas-irqc: Add wake-up support"), when an IRQ is used for wakeup, the INTC block's module clock is manually kept running during system suspend, to make sure the device stays active. However, this explicit clock handling is merely a workaround for a failure to properly communicate wakeup information to the device core. Instead, set the device's power.wakeup_path field, to indicate this device is part of the wakeup path. Depending on the PM Domain's active_wakeup configuration, the genpd core code will keep the device enabled (and the clock running) during system suspend when needed. This allows for the removal of all explicit clock handling code from the driver. Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| * irqchip/renesas-intc-irqpin: Use wakeup_path i.s.o. explicit clock handlingGeert Uytterhoeven2018-03-141-24/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since commit 705bc96c2c15313c ("irqchip: renesas-intc-irqpin: Add minimal runtime PM support"), when an IRQ is used for wakeup, the INTC block's module clock (if exists) is manually kept running during system suspend, to make sure the device stays active. However, this explicit clock handling is merely a workaround for a failure to properly communicate wakeup information to the device core. Instead, set the device's power.wakeup_path field, to indicate this device is part of the wakeup path. Depending on the PM Domain's active_wakeup configuration, the genpd core code will keep the device enabled (and the clock running) during system suspend when needed. This allows for the removal of all explicit clock handling code from the driver. Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
* | genirq: Remove license boilerplate/referencesThomas Gleixner2018-03-202-13/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that SPDX identifiers are in place, remove the boilerplate or references. The change in timings.c has been acked by the author. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Kate Stewart <kstewart@linuxfoundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Philippe Ombredanne <pombredanne@nexb.com> Link: https://lkml.kernel.org/r/20180314212030.668321222@linutronix.de
* | genirq: Add missing SPDX identifiersThomas Gleixner2018-03-2015-0/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add SPDX identifiers to files - which contain an explicit license boiler plate or reference - which do not contain a license reference and were not updated in the initial SPDX conversion because the license was deduced by the scanners via EXPORT_SYMBOL_GPL as GPL2.0 only. [ tglx: Moved adding identifiers from the patch which removes the references/boilerplate ] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Kate Stewart <kstewart@linuxfoundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Philippe Ombredanne <pombredanne@nexb.com> Link: https://lkml.kernel.org/r/20180314212030.668321222@linutronix.de
* | genirq/matrix: Cleanup SPDX identifierThomas Gleixner2018-03-201-5/+3
| | | | | | | | | | | | | | | | | | | | | | | | Use the proper SPDX-Identifier format. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Cc: Kate Stewart <kstewart@linuxfoundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Philippe Ombredanne <pombredanne@nexb.com> Link: https://lkml.kernel.org/r/20180314212030.492674761@linutronix.de
* | genirq: Cleanup top of file commentsThomas Gleixner2018-03-2012-32/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | Remove pointless references to the file name itself and condense the information so it wastes less space. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Cc: Kate Stewart <kstewart@linuxfoundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Philippe Ombredanne <pombredanne@nexb.com> Link: https://lkml.kernel.org/r/20180314212030.412095827@linutronix.de
* | genirq: Pass desc to __irq_free instead of irq numberUwe Kleine König2018-03-201-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | Given that irq_to_desc() is a radix_tree_lookup and the reverse operation is only a pointer dereference and that all callers of __free_irq already have the desc, pass the desc instead of the irq number. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: kernel@pengutronix.de Link: https://lkml.kernel.org/r/20180319105202.9794-1-u.kleine-koenig@pengutronix.de
* | RISC-V: Move to the new GENERIC_IRQ_MULTI_HANDLER handlerPalmer Dabbelt2018-03-144-17/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The existing mechanism for handling IRQs on RISC-V is pretty ugly: the irq entry code selects the handler via Kconfig dependencies. Use the new generic IRQ handling infastructure, which allows boot time registration of the low level entry handler. This does add an additional load to the interrupt latency, but there's a lot of tuning left to be done there on RISC-V so it's OK for now. Signed-off-by: Palmer Dabbelt <palmer@sifive.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Stafford Horne <shorne@gmail.com> Cc: jonas@southpole.se Cc: catalin.marinas@arm.com Cc: Will Deacon <will.deacon@arm.com> Cc: linux@armlinux.org.uk Cc: stefan.kristiansson@saunalahti.fi Cc: openrisc@lists.librecores.org Cc: linux-riscv@lists.infradead.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lkml.kernel.org/r/20180307235731.22627-3-palmer@sifive.com
* | genirq: Add CONFIG_GENERIC_IRQ_MULTI_HANDLERPalmer Dabbelt2018-03-143-0/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The arm multi irq handler registration mechanism has been copied into a handful of architectures, including arm64 and openrisc. RISC-V needs the same mechanism. Instead of adding yet another copy for RISC-V copy the arm implementation into the core code depending on a new Kconfig symbol: CONFIG_GENERIC_MULTI_IRQ_HANDLER. Subsequent patches will convert the various architectures. Signed-off-by: Palmer Dabbelt <palmer@sifive.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: jonas@southpole.se Cc: catalin.marinas@arm.com Cc: Will Deacon <will.deacon@arm.com> Cc: linux@armlinux.org.uk Cc: stefan.kristiansson@saunalahti.fi Cc: openrisc@lists.librecores.org Cc: shorne@gmail.com Cc: linux-riscv@lists.infradead.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lkml.kernel.org/r/20180307235731.22627-2-palmer@sifive.com
* | Merge branch 'linus' into irq/core to pick up dependencies.Thomas Gleixner2018-03-141245-8458/+13515
|\ \
| * \ Merge tag 'nfs-for-4.16-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds2018-03-123-44/+54
| |\ \ | | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull NFS client bugfixes from Trond Myklebust: "Hightlights include the following stable fixes: - NFS: Fix an incorrect type in struct nfs_direct_req - pNFS: Prevent the layout header refcount going to zero in pnfs_roc() - NFS: Fix unstable write completion" * tag 'nfs-for-4.16-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFS: Fix unstable write completion pNFS: Prevent the layout header refcount going to zero in pnfs_roc() NFS: Fix an incorrect type in struct nfs_direct_req
| | * NFS: Fix unstable write completionTrond Myklebust2018-03-081-40/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We do want to respect the FLUSH_SYNC argument to nfs_commit_inode() to ensure that all outstanding COMMIT requests to the inode in question are complete. Currently we may exit early from both nfs_commit_inode() and nfs_write_inode() even if there are COMMIT requests in flight, or unstable writes on the commit list. In order to get the right semantics w.r.t. sync_inode(), we don't need to have nfs_commit_inode() reset the inode dirty flags when called from nfs_wb_page() and/or nfs_wb_all(). We just need to ensure that nfs_write_inode() leaves them in the right state if there are outstanding commits, or stable pages. Reported-by: Scott Mayhew <smayhew@redhat.com> Fixes: dc4fd9ab01ab ("nfs: don't wait on commit in nfs_commit_inode()...") Cc: stable@vger.kernel.org # v4.14+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| | * pNFS: Prevent the layout header refcount going to zero in pnfs_roc()Trond Myklebust2018-03-081-3/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Ensure that we hold a reference to the layout header when processing the pNFS return-on-close so that the refcount value does not inadvertently go to zero. Reported-by: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Cc: stable@vger.kernel.org # v4.10+ Tested-by: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de>
| | * NFS: Fix an incorrect type in struct nfs_direct_reqTrond Myklebust2018-03-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | The start offset needs to be of type loff_t. Fixed: 5fadeb47dcc5c ("nfs: count DIO good bytes correctly with mirroring") Cc: stable@vger.kernel.org # v4.0+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | Linux 4.16-rc5v4.16-rc5Linus Torvalds2018-03-121-1/+1
| | |
| * | Merge branch 'x86-pti-for-linus' of ↵Linus Torvalds2018-03-1117-182/+291
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/pti updates from Thomas Gleixner: "Yet another pile of melted spectrum related updates: - Drop native vsyscall support finally as it causes more trouble than benefit. - Make microcode loading more robust. There were a few issues especially related to late loading which are now surfacing because late loading of the IB* microcodes addressing spectre issues has become more widely used. - Simplify and robustify the syscall handling in the entry code - Prevent kprobes on the entry trampoline code which lead to kernel crashes when the probe hits before CR3 is updated - Don't check microcode versions when running on hypervisors as they are considered as lying anyway. - Fix the 32bit objtool build and a coment typo" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/kprobes: Fix kernel crash when probing .entry_trampoline code x86/pti: Fix a comment typo x86/microcode: Synchronize late microcode loading x86/microcode: Request microcode on the BSP x86/microcode/intel: Look into the patch cache first x86/microcode: Do not upload microcode if CPUs are offline x86/microcode/intel: Writeback and invalidate caches before updating microcode x86/microcode/intel: Check microcode revision before updating sibling threads x86/microcode: Get rid of struct apply_microcode_ctx x86/spectre_v2: Don't check microcode versions when running under hypervisors x86/vsyscall/64: Drop "native" vsyscalls x86/entry/64/compat: Save one instruction in entry_INT80_compat() x86/entry: Do not special-case clone(2) in compat entry x86/syscalls: Use COMPAT_SYSCALL_DEFINEx() macros for x86-only compat syscalls x86/syscalls: Use proper syscall definition for sys_ioperm() x86/entry: Remove stale syscall prototype x86/syscalls/32: Simplify $entry == $compat entries objtool: Fix 32-bit build
| | * | x86/kprobes: Fix kernel crash when probing .entry_trampoline codeFrancis Deslauriers2018-03-093-1/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Disable the kprobe probing of the entry trampoline: .entry_trampoline is a code area that is used to ensure page table isolation between userspace and kernelspace. At the beginning of the execution of the trampoline, we load the kernel's CR3 register. This has the effect of enabling the translation of the kernel virtual addresses to physical addresses. Before this happens most kernel addresses can not be translated because the running process' CR3 is still used. If a kprobe is placed on the trampoline code before that change of the CR3 register happens the kernel crashes because int3 handling pages are not accessible. To fix this, add the .entry_trampoline section to the kprobe blacklist to prohibit the probing of code before all the kernel pages are accessible. Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: mathieu.desnoyers@efficios.com Cc: mhiramat@kernel.org Link: http://lkml.kernel.org/r/1520565492-4637-2-git-send-email-francis.deslauriers@efficios.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
| | * | x86/pti: Fix a comment typoSeunghun Han2018-03-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | s/visinble/visible/ Signed-off-by: Seunghun Han <kkamagui@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/1520397135-132809-1-git-send-email-kkamagui@gmail.com
| | * | x86/microcode: Synchronize late microcode loadingAshok Raj2018-03-081-26/+92
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Original idea by Ashok, completely rewritten by Borislav. Before you read any further: the early loading method is still the preferred one and you should always do that. The following patch is improving the late loading mechanism for long running jobs and cloud use cases. Gather all cores and serialize the microcode update on them by doing it one-by-one to make the late update process as reliable as possible and avoid potential issues caused by the microcode update. [ Borislav: Rewrite completely. ] Co-developed-by: Borislav Petkov <bp@suse.de> Signed-off-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Tom Lendacky <thomas.lendacky@amd.com> Tested-by: Ashok Raj <ashok.raj@intel.com> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com> Link: https://lkml.kernel.org/r/20180228102846.13447-8-bp@alien8.de
| | * | x86/microcode: Request microcode on the BSPBorislav Petkov2018-03-081-6/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ... so that any newer version can land in the cache and can later be fished out by the application functions. Do that before grabbing the hotplug lock. Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Tom Lendacky <thomas.lendacky@amd.com> Tested-by: Ashok Raj <ashok.raj@intel.com> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com> Link: https://lkml.kernel.org/r/20180228102846.13447-7-bp@alien8.de
| | * | x86/microcode/intel: Look into the patch cache firstBorislav Petkov2018-03-081-6/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The cache might contain a newer patch - look in there first. A follow-on change will make sure newest patches are loaded into the cache of microcode patches. Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Tom Lendacky <thomas.lendacky@amd.com> Tested-by: Ashok Raj <ashok.raj@intel.com> Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com> Link: https://lkml.kernel.org/r/20180228102846.13447-6-bp@alien8.de
| | * | x86/microcode: Do not upload microcode if CPUs are offlineAshok Raj2018-03-081-0/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Avoid loading microcode if any of the CPUs are offline, and issue a warning. Having different microcode revisions on the system at any time is outright dangerous. [ Borislav: Massage changelog. ] Signed-off-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Tom Lendacky <thomas.lendacky@amd.com> Tested-by: Ashok Raj <ashok.raj@intel.com> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com> Link: http://lkml.kernel.org/r/1519352533-15992-4-git-send-email-ashok.raj@intel.com Link: https://lkml.kernel.org/r/20180228102846.13447-5-bp@alien8.de
| | * | x86/microcode/intel: Writeback and invalidate caches before updating microcodeAshok Raj2018-03-081-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Updating microcode is less error prone when caches have been flushed and depending on what exactly the microcode is updating. For example, some of the issues around certain Broadwell parts can be addressed by doing a full cache flush. [ Borislav: Massage it and use native_wbinvd() in both cases. ] Signed-off-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Tom Lendacky <thomas.lendacky@amd.com> Tested-by: Ashok Raj <ashok.raj@intel.com> Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com> Link: http://lkml.kernel.org/r/1519352533-15992-3-git-send-email-ashok.raj@intel.com Link: https://lkml.kernel.org/r/20180228102846.13447-4-bp@alien8.de
| | * | x86/microcode/intel: Check microcode revision before updating sibling threadsAshok Raj2018-03-081-3/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After updating microcode on one of the threads of a core, the other thread sibling automatically gets the update since the microcode resources on a hyperthreaded core are shared between the two threads. Check the microcode revision on the CPU before performing a microcode update and thus save us the WRMSR 0x79 because it is a particularly expensive operation. [ Borislav: Massage changelog and coding style. ] Signed-off-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Tom Lendacky <thomas.lendacky@amd.com> Tested-by: Ashok Raj <ashok.raj@intel.com> Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com> Link: http://lkml.kernel.org/r/1519352533-15992-2-git-send-email-ashok.raj@intel.com Link: https://lkml.kernel.org/r/20180228102846.13447-3-bp@alien8.de
| | * | x86/microcode: Get rid of struct apply_microcode_ctxBorislav Petkov2018-03-081-11/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is a useless remnant from earlier times. Use the ucode_state enum directly. No functional change. Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Tom Lendacky <thomas.lendacky@amd.com> Tested-by: Ashok Raj <ashok.raj@intel.com> Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com> Link: https://lkml.kernel.org/r/20180228102846.13447-2-bp@alien8.de
| | * | x86/spectre_v2: Don't check microcode versions when running under hypervisorsKonrad Rzeszutek Wilk2018-03-081-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As: 1) It's known that hypervisors lie about the environment anyhow (host mismatch) 2) Even if the hypervisor (Xen, KVM, VMWare, etc) provided a valid "correct" value, it all gets to be very murky when migration happens (do you provide the "new" microcode of the machine?). And in reality the cloud vendors are the ones that should make sure that the microcode that is running is correct and we should just sing lalalala and trust them. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Cc: Wanpeng Li <kernellwp@gmail.com> Cc: kvm <kvm@vger.kernel.org> Cc: Krčmář <rkrcmar@redhat.com> Cc: Borislav Petkov <bp@alien8.de> CC: "H. Peter Anvin" <hpa@zytor.com> CC: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180226213019.GE9497@char.us.oracle.com
| | * | x86/vsyscall/64: Drop "native" vsyscallsAndy Lutomirski2018-03-084-30/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since Linux v3.2, vsyscalls have been deprecated and slow. From v3.2 on, Linux had three vsyscall modes: "native", "emulate", and "none". "emulate" is the default. All known user programs work correctly in emulate mode, but vsyscalls turn into page faults and are emulated. This is very slow. In "native" mode, the vsyscall page is easily usable as an exploit gadget, but vsyscalls are a bit faster -- they turn into normal syscalls. (This is in contrast to vDSO functions, which can be much faster than syscalls.) In "none" mode, there are no vsyscalls. For all practical purposes, "native" was really just a chicken bit in case something went wrong with the emulation. It's been over six years, and nothing has gone wrong. Delete it. Signed-off-by: Andy Lutomirski <luto@kernel.org> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Cc: Kernel Hardening <kernel-hardening@lists.openwall.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/519fee5268faea09ae550776ce969fa6e88668b0.1520449896.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
| | * | x86/entry/64/compat: Save one instruction in entry_INT80_compat()Dominik Brodowski2018-03-071-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As %rdi is never user except in the following push, there is no need to restore %rdi to the original value. Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: luto@amacapital.net Cc: viro@zeniv.linux.org.uk Signed-off-by: Ingo Molnar <mingo@kernel.org>
| | * | x86/entry: Do not special-case clone(2) in compat entryDominik Brodowski2018-03-074-13/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With the CPU renaming registers on its own, and all the overhead of the syscall entry/exit, it is doubtful whether the compiled output of mov %r8, %rax mov %rcx, %r8 mov %rax, %rcx jmpq sys_clone is measurably slower than the hand-crafted version of xchg %r8, %rcx So get rid of this special case. Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: luto@amacapital.net Cc: viro@zeniv.linux.org.uk Signed-off-by: Ingo Molnar <mingo@kernel.org>
| | * | x86/syscalls: Use COMPAT_SYSCALL_DEFINEx() macros for x86-only compat syscallsDominik Brodowski2018-03-073-62/+76
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While at it, convert declarations of type "unsigned" to "unsigned int". Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: luto@amacapital.net Cc: viro@zeniv.linux.org.uk Signed-off-by: Ingo Molnar <mingo@kernel.org>
| | * | x86/syscalls: Use proper syscall definition for sys_ioperm()Dominik Brodowski2018-03-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Using SYSCALL_DEFINEx() is recommended, so use it also here. Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: luto@amacapital.net Cc: viro@zeniv.linux.org.uk Signed-off-by: Ingo Molnar <mingo@kernel.org>
| | * | x86/entry: Remove stale syscall prototypeDominik Brodowski2018-03-071-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | sys32_vm86_warning() is long gone. Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: luto@amacapital.net Cc: viro@zeniv.linux.org.uk Signed-off-by: Ingo Molnar <mingo@kernel.org>