summaryrefslogtreecommitdiffstats
path: root/arch (follow)
Commit message (Collapse)AuthorAgeFilesLines
* arm64: Disable SCS for hypervisor codeSami Tolvanen2020-05-151-1/+1
| | | | | | | | | | | Disable SCS for code that runs at a different exception level by adding __noscs to __hyp_text. Suggested-by: James Morse <james.morse@arm.com> Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Will Deacon <will@kernel.org>
* arm64: vdso: Disable Shadow Call StackSami Tolvanen2020-05-151-1/+1
| | | | | | | | | | | | Shadow stacks are only available in the kernel, so disable SCS instrumentation for the vDSO. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Will Deacon <will@kernel.org>
* arm64: efi: Restore register x18 if it was corruptedSami Tolvanen2020-05-151-1/+10
| | | | | | | | | | | | If we detect a corrupted x18, restore the register before jumping back to potentially SCS instrumented code. This is safe, because the wrapper is called with preemption disabled and a separate shadow stack is used for interrupt handling. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Will Deacon <will@kernel.org>
* arm64: Preserve register x18 when CPU is suspendedSami Tolvanen2020-05-152-1/+15
| | | | | | | | | | | Don't lose the current task's shadow stack when the CPU is suspended. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Will Deacon <will@kernel.org>
* arm64: Reserve register x18 from general allocation with SCSSami Tolvanen2020-05-151-0/+4
| | | | | | | | | | | | | Reserve the x18 register from general allocation when SCS is enabled, because the compiler uses the register to store the current task's shadow stack pointer. Note that all external kernel modules must also be compiled with -ffixed-x18 if the kernel has SCS enabled. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Will Deacon <will@kernel.org>
* scs: Disable when function graph tracing is enabledSami Tolvanen2020-05-151-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The graph tracer hooks returns by modifying frame records on the (regular) stack, but with SCS the return address is taken from the shadow stack, and the value in the frame record has no effect. As we don't currently have a mechanism to determine the corresponding slot on the shadow stack (and to pass this through the ftrace infrastructure), for now let's disable SCS when the graph tracer is enabled. With SCS the return address is taken from the shadow stack and the value in the frame record has no effect. The mcount based graph tracer hooks returns by modifying frame records on the (regular) stack, and thus is not compatible. The patchable-function-entry graph tracer used for DYNAMIC_FTRACE_WITH_REGS modifies the LR before it is saved to the shadow stack, and is compatible. Modifying the mcount based graph tracer to work with SCS would require a mechanism to determine the corresponding slot on the shadow stack (and to pass this through the ftrace infrastructure), and we expect that everyone will eventually move to the patchable-function-entry based graph tracer anyway, so for now let's disable SCS when the mcount-based graph tracer is enabled. SCS and patchable-function-entry are both supported from LLVM 10.x. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
* scs: Add support for Clang's Shadow Call Stack (SCS)Sami Tolvanen2020-05-151-0/+24
| | | | | | | | | | | | | | | | | | | | This change adds generic support for Clang's Shadow Call Stack, which uses a shadow stack to protect return addresses from being overwritten by an attacker. Details are available here: https://clang.llvm.org/docs/ShadowCallStack.html Note that security guarantees in the kernel differ from the ones documented for user space. The kernel must store addresses of shadow stacks in memory, which means an attacker capable reading and writing arbitrary memory may be able to locate them and hijack control flow by modifying the stacks. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> [will: Numerous cosmetic changes] Signed-off-by: Will Deacon <will@kernel.org>
* Merge tag 's390-5.7-3' of ↵Linus Torvalds2020-04-266-9/+9
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Vasily Gorbik: - Add a few notrace annotations to avoid potential crashes when switching ftrace tracers. - Avoid setting affinity for floating irqs in pci code. - Fix build issue found by kbuild test robot. * tag 's390-5.7-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/protvirt: fix compilation issue s390/pci: do not set affinity for floating irqs s390/ftrace: fix potential crashes when switching tracers
| * s390/protvirt: fix compilation issueClaudio Imbrenda2020-04-252-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The kernel fails to compile with CONFIG_PROTECTED_VIRTUALIZATION_GUEST set but CONFIG_KVM unset. This patch fixes the issue by making the needed variable always available. Link: https://lkml.kernel.org/r/20200423120114.2027410-1-imbrenda@linux.ibm.com Fixes: a0f60f843199 ("s390/protvirt: Add sysfs firmware interface for Ultravisor information") Reported-by: kbuild test robot <lkp@intel.com> Reported-by: Philipp Rudo <prudo@linux.ibm.com> Suggested-by: Philipp Rudo <prudo@linux.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
| * s390/pci: do not set affinity for floating irqsNiklas Schnelle2020-04-221-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | with the introduction of CPU directed interrupts the kernel parameter pci=force_floating was introduced to fall back to the previous behavior using floating irqs. However we were still setting the affinity in that case, both in __irq_alloc_descs() and via the irq_set_affinity callback in struct irq_chip. For the former only set the affinity in the directed case. The latter is explicitly set in zpci_directed_irq_init() so we can just leave it unset for the floating case. Fixes: e979ce7bced2 ("s390/pci: provide support for CPU directed interrupts") Co-developed-by: Alexander Schmidt <alexs@linux.ibm.com> Signed-off-by: Alexander Schmidt <alexs@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
| * s390/ftrace: fix potential crashes when switching tracersPhilipp Rudo2020-04-223-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Switching tracers include instruction patching. To prevent that a instruction is patched while it's read the instruction patching is done in stop_machine 'context'. This also means that any function called during stop_machine must not be traced. Thus add 'notrace' to all functions called within stop_machine. Fixes: 1ec2772e0c3c ("s390/diag: add a statistic for diagnose calls") Fixes: 38f2c691a4b3 ("s390: improve wait logic of stop_machine") Fixes: 4ecf0a43e729 ("processor: get rid of cpu_relax_yield") Signed-off-by: Philipp Rudo <prudo@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
* | Merge tag 'powerpc-5.7-3' of ↵Linus Torvalds2020-04-264-2/+7
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - One important fix for a bug in the way we find the cache-line size from the device tree, which was leading to the wrong size being reported to userspace on some platforms. - A fix for 8xx STRICT_KERNEL_RWX which was leaving TLB entries around leading to a window at boot when the strict mapping wasn't enforced. - A fix to enable our KUAP (kernel user access prevention) debugging on PPC32. - A build fix for clang in lib/mpi. Thanks to: Chris Packham, Christophe Leroy, Nathan Chancellor, Qian Cai. * tag 'powerpc-5.7-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: lib/mpi: Fix building for powerpc with clang powerpc/mm: Fix CONFIG_PPC_KUAP_DEBUG on PPC32 powerpc/8xx: Fix STRICT_KERNEL_RWX startup test failure powerpc/setup_64: Set cache-line-size based on cache-block-size
| * | powerpc/mm: Fix CONFIG_PPC_KUAP_DEBUG on PPC32Christophe Leroy2020-04-222-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | CONFIG_PPC_KUAP_DEBUG is not selectable because it depends on PPC_32 which doesn't exists. Fixing it leads to a deadlock due to a vital register getting clobbered in _switch(). Change dependency to PPC32 and use r0 instead of r4 in _switch() Fixes: e2fb9f544431 ("powerpc/32: Prepare for Kernel Userspace Access Protection") Cc: stable@vger.kernel.org # v5.2+ Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/540242f7d4573f7cdf1b3bf46bb35f743b2cd68f.1587124651.git.christophe.leroy@c-s.fr
| * | powerpc/8xx: Fix STRICT_KERNEL_RWX startup test failureChristophe Leroy2020-04-221-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | WRITE_RO lkdtm test works. But when selecting CONFIG_DEBUG_RODATA_TEST, the kernel reports rodata_test: test data was not read only This is because when rodata test runs, there are still old entries in TLB. Flush TLB after setting kernel pages RO or NX. Fixes: d5f17ee96447 ("powerpc/8xx: don't disable large TLBs with CONFIG_STRICT_KERNEL_RWX") Cc: stable@vger.kernel.org # v5.1+ Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/485caac75f195f18c11eb077b0031fdd2bb7fb9e.1587361039.git.christophe.leroy@c-s.fr
| * | powerpc/setup_64: Set cache-line-size based on cache-block-sizeChris Packham2020-04-211-0/+2
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If {i,d}-cache-block-size is set and {i,d}-cache-line-size is not, use the block-size value for both. Per the devicetree spec cache-line-size is only needed if it differs from the block size. Originally the code would fallback from block size to line size. An error message was printed if both properties were missing. Later the code was refactored to use clearer names and logic but it inadvertently made line size a required property, meaning on systems without a line size property we fall back to the default from the cputable. On powernv (OPAL) platforms, since the introduction of device tree CPU features (5a61ef74f269 ("powerpc/64s: Support new device tree binding for discovering CPU features")), that has led to the wrong value being used, as the fallback value is incorrect for Power8/Power9 CPUs. The incorrect values flow through to the VDSO and also to the sysconf values, SC_LEVEL1_ICACHE_LINESIZE etc. Fixes: bd067f83b084 ("powerpc/64: Fix naming of cache block vs. cache line") Cc: stable@vger.kernel.org # v4.11+ Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Reported-by: Qian Cai <cai@lca.pw> [mpe: Add even more detail to change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200416221908.7886-1-chris.packham@alliedtelesis.co.nz
* | Merge tag 'sched-urgent-2020-04-25' of ↵Linus Torvalds2020-04-251-14/+33
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: "Misc fixes: - an uclamp accounting fix - three frequency invariance fixes and a readability improvement" * tag 'sched-urgent-2020-04-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/core: Fix reset-on-fork from RT with uclamp x86, sched: Move check for CPU type to caller function x86, sched: Don't enable static key when starting secondary CPUs x86, sched: Account for CPUs with less than 4 cores in freq. invariance x86, sched: Bail out of frequency invariance if base frequency is unknown
| * | x86, sched: Move check for CPU type to caller functionGiovanni Gherdovich2020-04-221-4/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Improve readability of the function intel_set_max_freq_ratio() by moving the check for KNL CPUs there, together with checks for GLM and SKX. Signed-off-by: Giovanni Gherdovich <ggherdovich@suse.cz> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lkml.kernel.org/r/20200416054745.740-5-ggherdovich@suse.cz
| * | x86, sched: Don't enable static key when starting secondary CPUsPeter Zijlstra (Intel)2020-04-221-7/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The static key arch_scale_freq_key only needs to be enabled once (at boot). This change fixes a bug by which the key was enabled every time cpu0 is started, even as a secondary CPU during cpu hotplug. Secondary CPUs are started from the idle thread: setting a static key from there means acquiring a lock and may result in sleeping in the idle task, causing CPU lockup. Another consequence of this change is that init_counter_refs() is now called on each CPU correctly; previously the function on_each_cpu() was used, but it was called at boot when the only online cpu is cpu0. [ggherdovich@suse.cz: Tested and wrote changelog] Fixes: 1567c3e3467c ("x86, sched: Add support for frequency invariance") Reported-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Giovanni Gherdovich <ggherdovich@suse.cz> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lkml.kernel.org/r/20200416054745.740-4-ggherdovich@suse.cz
| * | x86, sched: Account for CPUs with less than 4 cores in freq. invarianceGiovanni Gherdovich2020-04-221-3/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a CPU has less than 4 physical cores, MSR_TURBO_RATIO_LIMIT will rightfully report that the 4C turbo ratio is zero. In such cases, use the 1C turbo ratio instead for frequency invariance calculations. Fixes: 1567c3e3467c ("x86, sched: Add support for frequency invariance") Reported-by: Like Xu <like.xu@linux.intel.com> Reported-by: Neil Rickert <nwr10cst-oslnx@yahoo.com> Signed-off-by: Giovanni Gherdovich <ggherdovich@suse.cz> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Dave Kleikamp <dave.kleikamp@oracle.com> Link: https://lkml.kernel.org/r/20200416054745.740-3-ggherdovich@suse.cz
| * | x86, sched: Bail out of frequency invariance if base frequency is unknownGiovanni Gherdovich2020-04-221-0/+9
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some hypervisors such as VMWare ESXi 5.5 advertise support for X86_FEATURE_APERFMPERF but then fill all MSR's with zeroes. In particular, MSR_PLATFORM_INFO set to zero tricks the code that wants to know the base clock frequency of the CPU (highest non-turbo frequency), producing a division by zero when computing the ratio turbo_freq/base_freq necessary for frequency invariant accounting. It is to be noted that even if MSR_PLATFORM_INFO contained the appropriate data, APERF and MPERF are constantly zero on ESXi 5.5, thus freq-invariance couldn't be done in principle (not that it would make a lot of sense in a VM anyway). The real problem is advertising X86_FEATURE_APERFMPERF. This appears to be fixed in more recent versions: ESXi 6.7 doesn't advertise that feature. Fixes: 1567c3e3467c ("x86, sched: Add support for frequency invariance") Signed-off-by: Giovanni Gherdovich <ggherdovich@suse.cz> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lkml.kernel.org/r/20200416054745.740-2-ggherdovich@suse.cz
* | Merge tag 'perf-urgent-2020-04-25' of ↵Linus Torvalds2020-04-251-0/+1
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Two changes: - fix exit event records - extend x86 PMU driver enumeration to add Intel Jasper Lake CPU support" * tag 'perf-urgent-2020-04-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/core: fix parent pid/tid in task exit events perf/x86/cstate: Add Jasper Lake CPU support
| * | perf/x86/cstate: Add Jasper Lake CPU supportHarry Pan2020-04-221-0/+1
| |/ | | | | | | | | | | | | | | | | | | The Jasper Lake processor is Tremont microarchitecture, reuse the glm_cstates table of Goldmont and Goldmont Plus to enable the C-states residency profiling. Signed-off-by: Harry Pan <harry.pan@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200402190658.1.Ic02e891daac41303aed1f2fc6c64f6110edd27bd@changeid
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds2020-04-252-9/+37
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking fixes from David Miller: 1) Fix memory leak in netfilter flowtable, from Roi Dayan. 2) Ref-count leaks in netrom and tipc, from Xiyu Yang. 3) Fix warning when mptcp socket is never accepted before close, from Florian Westphal. 4) Missed locking in ovs_ct_exit(), from Tonghao Zhang. 5) Fix large delays during PTP synchornization in cxgb4, from Rahul Lakkireddy. 6) team_mode_get() can hang, from Taehee Yoo. 7) Need to use kvzalloc() when allocating fw tracer in mlx5 driver, from Niklas Schnelle. 8) Fix handling of bpf XADD on BTF memory, from Jann Horn. 9) Fix BPF_STX/BPF_B encoding in x86 bpf jit, from Luke Nelson. 10) Missing queue memory release in iwlwifi pcie code, from Johannes Berg. 11) Fix NULL deref in macvlan device event, from Taehee Yoo. 12) Initialize lan87xx phy correctly, from Yuiko Oshino. 13) Fix looping between VRF and XFRM lookups, from David Ahern. 14) etf packet scheduler assumes all sockets are full sockets, which is not necessarily true. From Eric Dumazet. 15) Fix mptcp data_fin handling in RX path, from Paolo Abeni. 16) fib_select_default() needs to handle nexthop objects, from David Ahern. 17) Use GFP_ATOMIC under spinlock in mac80211_hwsim, from Wei Yongjun. 18) vxlan and geneve use wrong nlattr array, from Sabrina Dubroca. 19) Correct rx/tx stats in bcmgenet driver, from Doug Berger. 20) BPF_LDX zero-extension is encoded improperly in x86_32 bpf jit, fix from Luke Nelson. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (100 commits) selftests/bpf: Fix a couple of broken test_btf cases tools/runqslower: Ensure own vmlinux.h is picked up first bpf: Make bpf_link_fops static bpftool: Respect the -d option in struct_ops cmd selftests/bpf: Add test for freplace program with expected_attach_type bpf: Propagate expected_attach_type when verifying freplace programs bpf: Fix leak in LINK_UPDATE and enforce empty old_prog_fd bpf, x86_32: Fix logic error in BPF_LDX zero-extension bpf, x86_32: Fix clobbering of dst for BPF_JSET bpf, x86_32: Fix incorrect encoding in BPF_LDX zero-extension bpf: Fix reStructuredText markup net: systemport: suppress warnings on failed Rx SKB allocations net: bcmgenet: suppress warnings on failed Rx SKB allocations macsec: avoid to set wrong mtu mac80211: sta_info: Add lockdep condition for RCU list usage mac80211: populate debugfs only after cfg80211 init net: bcmgenet: correct per TX/RX ring statistics net: meth: remove spurious copyright text net: phy: bcm84881: clear settings on link down chcr: Fix CPU hard lockup ...
| * | bpf, x86_32: Fix logic error in BPF_LDX zero-extensionWang YanQing2020-04-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When verifier_zext is true, we don't need to emit code for zero-extension. Fixes: 836256bf5f37 ("x32: bpf: eliminate zero extension code-gen") Signed-off-by: Wang YanQing <udknight@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200423050637.GA4029@udknight
| * | bpf, x86_32: Fix clobbering of dst for BPF_JSETLuke Nelson2020-04-251-4/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current JIT clobbers the destination register for BPF_JSET BPF_X and BPF_K by using "and" and "or" instructions. This is fine when the destination register is a temporary loaded from a register stored on the stack but not otherwise. This patch fixes the problem (for both BPF_K and BPF_X) by always loading the destination register into temporaries since BPF_JSET should not modify the destination register. This bug may not be currently triggerable as BPF_REG_AX is the only register not stored on the stack and the verifier uses it in a limited way. Fixes: 03f5781be2c7b ("bpf, x86_32: add eBPF JIT compiler for ia32") Signed-off-by: Xi Wang <xi.wang@gmail.com> Signed-off-by: Luke Nelson <luke.r.nels@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Wang YanQing <udknight@gmail.com> Link: https://lore.kernel.org/bpf/20200422173630.8351-2-luke.r.nels@gmail.com
| * | bpf, x86_32: Fix incorrect encoding in BPF_LDX zero-extensionLuke Nelson2020-04-251-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current JIT uses the following sequence to zero-extend into the upper 32 bits of the destination register for BPF_LDX BPF_{B,H,W}, when the destination register is not on the stack: EMIT3(0xC7, add_1reg(0xC0, dst_hi), 0); The problem is that C7 /0 encodes a MOV instruction that requires a 4-byte immediate; the current code emits only 1 byte of the immediate. This means that the first 3 bytes of the next instruction will be treated as the rest of the immediate, breaking the stream of instructions. This patch fixes the problem by instead emitting "xor dst_hi,dst_hi" to clear the upper 32 bits. This fixes the problem and is more efficient than using MOV to load a zero immediate. This bug may not be currently triggerable as BPF_REG_AX is the only register not stored on the stack and the verifier uses it in a limited way, and the verifier implements a zero-extension optimization. But the JIT should avoid emitting incorrect encodings regardless. Fixes: 03f5781be2c7b ("bpf, x86_32: add eBPF JIT compiler for ia32") Signed-off-by: Xi Wang <xi.wang@gmail.com> Signed-off-by: Luke Nelson <luke.r.nels@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: H. Peter Anvin (Intel) <hpa@zytor.com> Acked-by: Wang YanQing <udknight@gmail.com> Link: https://lore.kernel.org/bpf/20200422173630.8351-1-luke.r.nels@gmail.com
| * | bpf, x86: Fix encoding for lower 8-bit registers in BPF_STX BPF_BLuke Nelson2020-04-211-3/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes an encoding bug in emit_stx for BPF_B when the source register is BPF_REG_FP. The current implementation for BPF_STX BPF_B in emit_stx saves one REX byte when the operands can be encoded using Mod-R/M alone. The lower 8 bits of registers %rax, %rbx, %rcx, and %rdx can be accessed without using a REX prefix via %al, %bl, %cl, and %dl, respectively. Other registers, (e.g., %rsi, %rdi, %rbp, %rsp) require a REX prefix to use their 8-bit equivalents (%sil, %dil, %bpl, %spl). The current code checks if the source for BPF_STX BPF_B is BPF_REG_1 or BPF_REG_2 (which map to %rdi and %rsi), in which case it emits the required REX prefix. However, it misses the case when the source is BPF_REG_FP (mapped to %rbp). The result is that BPF_STX BPF_B with BPF_REG_FP as the source operand will read from register %ch instead of the correct %bpl. This patch fixes the problem by fixing and refactoring the check on which registers need the extra REX byte. Since no BPF registers map to %rsp, there is no need to handle %spl. Fixes: 622582786c9e0 ("net: filter: x86: internal BPF JIT") Signed-off-by: Xi Wang <xi.wang@gmail.com> Signed-off-by: Luke Nelson <luke.r.nels@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200418232655.23870-1-luke.r.nels@gmail.com
* | | Merge tag 'arm64-fixes' of ↵Linus Torvalds2020-04-241-3/+6
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Catalin Marinas: - Ensure context synchronisation after a write to APIAKey. - Fix bullet list formatting in Documentation/arm64/amu.rst to eliminate doc warnings. * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: Documentation: arm64: fix amu.rst doc warnings arm64: sync kernel APIAKey when installing
| * | | arm64: sync kernel APIAKey when installingMark Rutland2020-04-211-3/+6
| | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A direct write to a APxxKey_EL1 register requires a context synchronization event to ensure that indirect reads made by subsequent instructions (e.g. AUTIASP, PACIASP) observe the new value. When we initialize the boot task's APIAKey in boot_init_stack_canary() via ptrauth_keys_switch_kernel() we miss the necessary ISB, and so there is a window where instructions are not guaranteed to use the new APIAKey value. This has been observed to result in boot-time crashes where PACIASP and AUTIASP within a function used a mixture of the old and new key values. Fix this by having ptrauth_keys_switch_kernel() synchronize the new key value with an ISB. At the same time, __ptrauth_key_install() is renamed to __ptrauth_key_install_nosync() so that it is obvious that this performs no synchronization itself. Fixes: 28321582334c261c ("arm64: initialize ptrauth keys for kernel booting task") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reported-by: Will Deacon <will@kernel.org> Cc: Amit Daniel Kachhap <amit.kachhap@arm.com> Cc: Marc Zyngier <maz@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Will Deacon <will@kernel.org>
* | | Merge tag 'kbuild-fixes-v5.7' of ↵Linus Torvalds2020-04-2423-222/+277
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: - fix scripts/config to properly handle ':' in string type CONFIG options - fix unneeded rebuilds of DT schema check rule - git rid of ordering dependency between <linux/vermagic.h> and <linux/module.h> to fix build errors in some network drivers - clean up generated headers of host arch with 'make ARCH=um mrproper' * tag 'kbuild-fixes-v5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: h8300: ignore vmlinux.lds Documentation: kbuild: fix the section title format um: ensure `make ARCH=um mrproper` removes arch/$(SUBARCH)/include/generated/ arch: split MODULE_ARCH_VERMAGIC definitions out to <asm/vermagic.h> kbuild: fix DT binding schema rule again to avoid needless rebuilds scripts/config: allow colons in option strings for sed
| * | | h8300: ignore vmlinux.ldsMasahiro Yamada2020-04-231-0/+2
| | | | | | | | | | | | | | | | Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
| * | | um: ensure `make ARCH=um mrproper` removes arch/$(SUBARCH)/include/generated/Vitor Massaru Iha2020-04-231-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In this workflow: $ make ARCH=um defconfig && make ARCH=um -j8 [snip] $ make ARCH=um mrproper [snip] $ make ARCH=um defconfig O=./build_um && make ARCH=um -j8 O=./build_um [snip] CC scripts/mod/empty.o In file included from ../include/linux/types.h:6, from ../include/linux/mod_devicetable.h:12, from ../scripts/mod/devicetable-offsets.c:3: ../include/uapi/linux/types.h:5:10: fatal error: asm/types.h: No such file or directory 5 | #include <asm/types.h> | ^~~~~~~~~~~~~ compilation terminated. make[2]: *** [../scripts/Makefile.build:100: scripts/mod/devicetable-offsets.s] Error 1 make[2]: *** Waiting for unfinished jobs.... make[1]: *** [/home/iha/sdb/opensource/lkmp/linux-kselftest.git/Makefile:1140: prepare0] Error 2 make[1]: Leaving directory '/home/iha/sdb/opensource/lkmp/linux-kselftest.git/build_um' make: *** [Makefile:180: sub-make] Error 2 The cause of the error was because arch/$(SUBARCH)/include/generated files weren't properly cleaned by `make ARCH=um mrproper`. Fixes: a788b2ed81ab ("kbuild: check arch/$(SRCARCH)/include/generated before out-of-tree build") Reported-by: Theodore Ts'o <tytso@mit.edu> Suggested-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Vitor Massaru Iha <vitor@massaru.org> Reviewed-by: Brendan Higgins <brendanhiggins@google.com> Tested-by: Brendan Higgins <brendanhiggins@google.com> Link: https://groups.google.com/forum/#!msg/kunit-dev/QmA27YEgEgI/hvS1kiz2CwAJ Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
| * | | arch: split MODULE_ARCH_VERMAGIC definitions out to <asm/vermagic.h>Masahiro Yamada2020-04-2321-222/+274
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As the bug report [1] pointed out, <linux/vermagic.h> must be included after <linux/module.h>. I believe we should not impose any include order restriction. We often sort include directives alphabetically, but it is just coding style convention. Technically, we can include header files in any order by making every header self-contained. Currently, arch-specific MODULE_ARCH_VERMAGIC is defined in <asm/module.h>, which is not included from <linux/vermagic.h>. Hence, the straight-forward fix-up would be as follows: |--- a/include/linux/vermagic.h |+++ b/include/linux/vermagic.h |@@ -1,5 +1,6 @@ | /* SPDX-License-Identifier: GPL-2.0 */ | #include <generated/utsrelease.h> |+#include <linux/module.h> | | /* Simply sanity version stamp for modules. */ | #ifdef CONFIG_SMP This works enough, but for further cleanups, I split MODULE_ARCH_VERMAGIC definitions into <asm/vermagic.h>. With this, <linux/module.h> and <linux/vermagic.h> will be orthogonal, and the location of MODULE_ARCH_VERMAGIC definitions will be consistent. For arc and ia64, MODULE_PROC_FAMILY is only used for defining MODULE_ARCH_VERMAGIC. I squashed it. For hexagon, nds32, and xtensa, I removed <asm/modules.h> entirely because they contained nothing but MODULE_ARCH_VERMAGIC definition. Kbuild will automatically generate <asm/modules.h> at build-time, wrapping <asm-generic/module.h>. [1] https://lore.kernel.org/lkml/20200411155623.GA22175@zn.tnic Reported-by: Borislav Petkov <bp@suse.de> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Acked-by: Jessica Yu <jeyu@kernel.org>
* | | Merge tag 'armsoc-fixes' of ↵Linus Torvalds2020-04-234-0/+11
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC fixes from Arnd Bergmann: "A few smaller fixes for v5.7-rc3: The majority are fixes for bugs I found after restarting my randconfig build testing that had been dormant for a while. On the Nokia N950/N9 phone, a DT fix is required to address a boot regression. For the bcm283x (Raspberry Pi), two DT fixes address minor issues" * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: soc: imx8: select SOC_BUS soc: tegra: fix tegra_pmc_get_suspend_mode definition soc: fsl: dpio: avoid stack usage warning soc: fsl: dpio: fix incorrect pointer conversions ARM: imx: provide v7_cpu_resume() only on ARM_CPU_SUSPEND=y ARM: dts: bcm283x: Disable dsi0 node firmware: xilinx: make firmware_debugfs_root static drivers: soc: xilinx: fix firmware driver Kconfig dependency ARM: dts: bcm283x: Add cells encoding format to firmware bus ARM: dts: OMAP3: disable RNG on N950/N9
| * \ \ Merge tag 'arm-soc/for-5.7/devicetree-fixes' of ↵Arnd Bergmann2020-04-232-0/+4
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | https://github.com/Broadcom/stblinux into arm/fixes This pull request contains Broadcom ARM-based SoCs Device Tree fixes for 5.7, please pull the following: - Nicolas provides a fix for 55c7c0621078 ("ARM: dts: bcm283x: Fix vc4's firmware bus DMA limitations") which missed adding proper #address-cells and #size-cells properties and he also disables the DSI node which should have been disabled by default but was not. * tag 'arm-soc/for-5.7/devicetree-fixes' of https://github.com/Broadcom/stblinux: ARM: dts: bcm283x: Disable dsi0 node ARM: dts: bcm283x: Add cells encoding format to firmware bus Link: https://lore.kernel.org/r/20200417171725.1084-1-f.fainelli@gmail.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
| | * | | ARM: dts: bcm283x: Disable dsi0 nodeNicolas Saenz Julienne2020-04-161-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since its inception the module was meant to be disabled by default, but the original commit failed to add the relevant property. Fixes: 4aba4cf82054 ("ARM: dts: bcm2835: Add the DSI module nodes and clocks") Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
| | * | | Merge tag 'tags/bcm2835-dt-fixes-2020-03-27' into devicetree/fixesFlorian Fainelli2020-04-141-0/+3
| | |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch is to be squashed into 55c7c0621078 ("ARM: dts: bcm283x: Fix vc4's firmware bus DMA limitations") as it turned out to be faulty Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
| | | * | | ARM: dts: bcm283x: Add cells encoding format to firmware busNicolas Saenz Julienne2020-03-271-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With the introduction of 55c7c0621078 ("ARM: dts: bcm283x: Fix vc4's firmware bus DMA limitations") the firmware bus has to comply with /soc's DMA limitations. Ultimately linking both buses to a same dma-ranges property. The patch (and author) missed the fact that a bus' #address-cells and #size-cells properties are not inherited, but set to a fixed value which, in this case, doesn't match /soc's. This, although not breaking Linux's DMA mapping functionality, generates ugly dtc warnings. Fix the issue by adding the correct address and size cells properties under the firmware bus. Fixes: 55c7c0621078 ("ARM: dts: bcm283x: Fix vc4's firmware bus DMA limitations") Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de> Link: https://lore.kernel.org/r/20200326134413.12298-1-nsaenzjulienne@suse.de
| * | | | | Merge tag 'omap-for-v5.6/fixes-rc7-signed' of ↵Arnd Bergmann2020-04-201-0/+5
| |\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into arm/fixes Boot regression fix for N950/N9 We need to tag RNG as disabled for N950/N9 as it blocked by the secure mode. We have a similar change done for N900, but I missed adding it for N950/N9 with the recent RNG changes. * tag 'omap-for-v5.6/fixes-rc7-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: ARM: dts: OMAP3: disable RNG on N950/N9 Link: https://lore.kernel.org/r/pull-1585340588-558327@atomide.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
| | * | | | | ARM: dts: OMAP3: disable RNG on N950/N9Aaro Koskinen2020-03-261-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Like on N900, we cannot access RNG directly on N950/N9. Mark it disabled in the DTS to allow kernel to boot. Fixes: 308607e5545f ("ARM: dts: Configure omap3 rng") Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi> Signed-off-by: Tony Lindgren <tony@atomide.com>
| * | | | | | ARM: imx: provide v7_cpu_resume() only on ARM_CPU_SUSPEND=yAhmad Fatoum2020-04-171-0/+2
| | |/ / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 512a928affd5 ("ARM: imx: build v7_cpu_resume() unconditionally") introduced an unintended linker error for i.MX6 configurations that have ARM_CPU_SUSPEND=n which can happen if neither CONFIG_PM, CONFIG_CPU_IDLE, nor ARM_PSCI_FW are selected. Fix this by having v7_cpu_resume() compiled only when cpu_resume() it calls is available as well. The C declaration for the function remains unguarded to avoid future code inadvertently using a stub and introducing a regression to the bug the original commit fixed. Cc: <stable@vger.kernel.org> Fixes: 512a928affd5 ("ARM: imx: build v7_cpu_resume() unconditionally") Reported-by: Clemens Gruber <clemens.gruber@pqgruber.com> Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de> Tested-by: Roland Hieber <rhi@pengutronix.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
* | | | | | Merge branch 'akpm' (patches from Andrew)Linus Torvalds2020-04-212-2/+2
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge misc fixes from Andrew Morton: "15 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: tools/vm: fix cross-compile build coredump: fix null pointer dereference on coredump mm: shmem: disable interrupt when acquiring info->lock in userfaultfd_copy path shmem: fix possible deadlocks on shmlock_user_lock vmalloc: fix remap_vmalloc_range() bounds checks mm/shmem: fix build without THP mm/ksm: fix NULL pointer dereference when KSM zero page is enabled tools/build: tweak unused value workaround checkpatch: fix a typo in the regex for $allocFunctions mm, gup: return EINTR when gup is interrupted by fatal signals mm/hugetlb: fix a addressing exception caused by huge_pte_offset MAINTAINERS: add an entry for kfifo mm/userfaultfd: disable userfaultfd-wp on x86_32 slub: avoid redzone when choosing freepointer location sh: fix build error in mm/init.c
| * | | | | | mm/userfaultfd: disable userfaultfd-wp on x86_32Peter Xu2020-04-211-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Userfaultfd-wp is not yet working on 32bit hosts, but it's accidentally enabled previously. Disable it. Fixes: 5a281062af1d ("userfaultfd: wp: add WP pagetable tracking to x86") Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org> Reviewed-by: Andrea Arcangeli <aarcange@redhat.com> Cc: Hillf Danton <hdanton@sina.com> Link: http://lkml.kernel.org/r/20200413141608.109211-1-peterx@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | | | | sh: fix build error in mm/init.cMasahiro Yamada2020-04-211-1/+1
| | |_|_|/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The closing parenthesis is missing. Fixes: bfeb022f8fe4 ("mm/memory_hotplug: add pgprot_t to mhp_params") Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Rich Felker <dalias@libc.org> Link: http://lkml.kernel.org/r/20200413014743.16353-1-masahiroy@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | Merge tag 'kvm-ppc-fixes-5.7-1' of ↵Paolo Bonzini2020-04-212-8/+10
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into kvm-master PPC KVM fix for 5.7 - Fix a regression introduced in the last merge window, which results in guests in HPT mode dying randomly.
| * | | | | | KVM: PPC: Book3S HV: Handle non-present PTEs in page fault functionsPaul Mackerras2020-04-212-8/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since cd758a9b57ee "KVM: PPC: Book3S HV: Use __gfn_to_pfn_memslot in HPT page fault handler", it's been possible in fairly rare circumstances to load a non-present PTE in kvmppc_book3s_hv_page_fault() when running a guest on a POWER8 host. Because that case wasn't checked for, we could misinterpret the non-present PTE as being a cache-inhibited PTE. That could mismatch with the corresponding hash PTE, which would cause the function to fail with -EFAULT a little further down. That would propagate up to the KVM_RUN ioctl() generally causing the KVM userspace (usually qemu) to fall over. This addresses the problem by catching that case and returning to the guest instead. For completeness, this fixes the radix page fault handler in the same way. For radix this didn't cause any obvious misbehaviour, because we ended up putting the non-present PTE into the guest's partition-scoped page tables, leading immediately to another hypervisor data/instruction storage interrupt, which would go through the page fault path again and fix things up. Fixes: cd758a9b57ee "KVM: PPC: Book3S HV: Use __gfn_to_pfn_memslot in HPT page fault handler" Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1820402 Reported-by: David Gibson <david@gibson.dropbear.id.au> Tested-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
* | | | | | | Merge tag 'kvm-s390-master-5.7-2' of ↵Paolo Bonzini2020-04-211224-15645/+34962
|\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvm-master KVM: s390: Fix for 5.7 and maintainer update - Silence false positive lockdep warning - add Claudio as reviewer
| * | | | | | | KVM: s390: Fix PV check in deliverable_irqs()Eric Farman2020-04-201-1/+1
| | |/ / / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The diag 0x44 handler, which handles a directed yield, goes into a a codepath that does a kvm_for_each_vcpu() and ultimately deliverable_irqs(). The new check for kvm_s390_pv_cpu_is_protected() contains an assertion that the vcpu->mutex is held, which isn't going to be the case in this scenario. The result is a plethora of these messages if the lock debugging is enabled, and thus an implication that we have a problem. WARNING: CPU: 9 PID: 16167 at arch/s390/kvm/kvm-s390.h:239 deliverable_irqs+0x1c6/0x1d0 [kvm] ...snip... Call Trace: [<000003ff80429bf2>] deliverable_irqs+0x1ca/0x1d0 [kvm] ([<000003ff80429b34>] deliverable_irqs+0x10c/0x1d0 [kvm]) [<000003ff8042ba82>] kvm_s390_vcpu_has_irq+0x2a/0xa8 [kvm] [<000003ff804101e2>] kvm_arch_dy_runnable+0x22/0x38 [kvm] [<000003ff80410284>] kvm_vcpu_on_spin+0x8c/0x1d0 [kvm] [<000003ff80436888>] kvm_s390_handle_diag+0x3b0/0x768 [kvm] [<000003ff80425af4>] kvm_handle_sie_intercept+0x1cc/0xcd0 [kvm] [<000003ff80422bb0>] __vcpu_run+0x7b8/0xfd0 [kvm] [<000003ff80423de6>] kvm_arch_vcpu_ioctl_run+0xee/0x3e0 [kvm] [<000003ff8040ccd8>] kvm_vcpu_ioctl+0x2c8/0x8d0 [kvm] [<00000001504ced06>] ksys_ioctl+0xae/0xe8 [<00000001504cedaa>] __s390x_sys_ioctl+0x2a/0x38 [<0000000150cb9034>] system_call+0xd8/0x2d8 2 locks held by CPU 2/KVM/16167: #0: 00000001951980c0 (&vcpu->mutex){+.+.}, at: kvm_vcpu_ioctl+0x90/0x8d0 [kvm] #1: 000000019599c0f0 (&kvm->srcu){....}, at: __vcpu_run+0x4bc/0xfd0 [kvm] Last Breaking-Event-Address: [<000003ff80429b34>] deliverable_irqs+0x10c/0x1d0 [kvm] irq event stamp: 11967 hardirqs last enabled at (11975): [<00000001502992f2>] console_unlock+0x4ca/0x650 hardirqs last disabled at (11982): [<0000000150298ee8>] console_unlock+0xc0/0x650 softirqs last enabled at (7940): [<0000000150cba6ca>] __do_softirq+0x422/0x4d8 softirqs last disabled at (7929): [<00000001501cd688>] do_softirq_own_stack+0x70/0x80 Considering what's being done here, let's fix this by removing the mutex assertion rather than acquiring the mutex for every other vcpu. Fixes: 201ae986ead7 ("KVM: s390: protvirt: Implement interrupt injection") Signed-off-by: Eric Farman <farman@linux.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Link: https://lore.kernel.org/r/20200415190353.63625-1-farman@linux.ibm.com Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * | | | | | Merge tag 'x86-urgent-2020-04-19' of ↵Linus Torvalds2020-04-196-21/+56
| |\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 and objtool fixes from Thomas Gleixner: "A set of fixes for x86 and objtool: objtool: - Ignore the double UD2 which is emitted in BUG() when CONFIG_UBSAN_TRAP is enabled. - Support clang non-section symbols in objtool ORC dump - Fix switch table detection in .text.unlikely - Make the BP scratch register warning more robust. x86: - Increase microcode maximum patch size for AMD to cope with new CPUs which have a larger patch size. - Fix a crash in the resource control filesystem when the removal of the default resource group is attempted. - Preserve Code and Data Prioritization enabled state accross CPU hotplug. - Update split lock cpu matching to use the new X86_MATCH macros. - Change the split lock enumeration as Intel finaly decided that the IA32_CORE_CAPABILITIES bits are not architectural contrary to what the SDM claims. !@#%$^! - Add Tremont CPU models to the split lock detection cpu match. - Add a missing static attribute to make sparse happy" * tag 'x86-urgent-2020-04-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/split_lock: Add Tremont family CPU models x86/split_lock: Bits in IA32_CORE_CAPABILITIES are not architectural x86/resctrl: Preserve CDP enable over CPU hotplug x86/resctrl: Fix invalid attempt at removing the default resource group x86/split_lock: Update to use X86_MATCH_INTEL_FAM6_MODEL() x86/umip: Make umip_insns static x86/microcode/AMD: Increase microcode PATCH_MAX_SIZE objtool: Make BP scratch register warning more robust objtool: Fix switch table detection in .text.unlikely objtool: Support Clang non-section symbols in ORC generation objtool: Support Clang non-section symbols in ORC dump objtool: Fix CONFIG_UBSAN_TRAP unreachable warnings
| | * | | | | | x86/split_lock: Add Tremont family CPU modelsTony Luck2020-04-181-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Tremont CPUs support IA32_CORE_CAPABILITIES bits to indicate whether specific SKUs have support for split lock detection. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20200416205754.21177-4-tony.luck@intel.com