summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* scripts/kallsyms: fix wrong kallsyms_relative_baseMikhail Petrov2020-03-191-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is the code in the read_symbol function in 'scripts/kallsyms.c': if (is_ignored_symbol(name, type)) return NULL; /* Ignore most absolute/undefined (?) symbols. */ if (strcmp(name, "_text") == 0) _text = addr; But the is_ignored_symbol function returns true for name="_text" and type='A'. So the next condition is not executed and the _text variable is always zero. It makes the wrong kallsyms_relative_base symbol as a result of the code (CONFIG_KALLSYMS_BASE_RELATIVE is defined): if (base_relative) { output_label("kallsyms_relative_base"); output_address(relative_base); printf("\n"); } Because the output_address function uses the _text variable. So the kallsyms_lookup function and all related functions in the kernel do not work properly. For example, the stack trace in oops: Call Trace: [aa095e58] [809feab8] kobj_ns_ops_tbl+0x7ff09ac8/0x7ff1c1c4 (unreliable) [aa095e98] [80002b64] kobj_ns_ops_tbl+0x7f50db74/0x80000010 [aa095ef8] [809c3d24] kobj_ns_ops_tbl+0x7feced34/0x7ff1c1c4 [aa095f28] [80002ed0] kobj_ns_ops_tbl+0x7f50dee0/0x80000010 [aa095f38] [8000f238] kobj_ns_ops_tbl+0x7f51a248/0x80000010 The right stack trace: Call Trace: [aa095e58] [809feab8] module_vdu_video_init+0x2fc/0x3bc (unreliable) [aa095e98] [80002b64] do_one_initcall+0x40/0x1f0 [aa095ef8] [809c3d24] kernel_init_freeable+0x164/0x1d8 [aa095f28] [80002ed0] kernel_init+0x14/0x124 [aa095f38] [8000f238] ret_from_kernel_thread+0x14/0x1c [masahiroy@kernel.org: This issue happens on binutils <= 2.22 The following commit fixed it: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=d2667025dd30611514810c28bee9709e4623012a The symbol type of _text is 'T' on binutils >= 2.23 The minimal supported binutils version for the kernel build is 2.21 ] Signed-off-by: Mikhail Petrov <Mikhail.Petrov@mir.dev> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
* modpost: Get proper section index by get_secindex() instead of st_shndxXiao Yang2020-03-181-1/+2
| | | | | | | | | | | | | | | (uint16_t) st_shndx is limited to 65535(i.e. SHN_XINDEX) so sym_get_data() gets wrong section index by st_shndx if requested symbol contains extended section index that is more than 65535. In this case, we need to get proper section index by .symtab_shndx section. Module.symvers generated by building kernel with "-ffunction-sections -fdata-sections" shows the issue. Fixes: 56067812d5b0 ("kbuild: modversions: add infrastructure for emitting relative CRCs") Fixes: e84f9fbbece1 ("modpost: refactor namespace_from_kstrtabns() to not hard-code section name") Signed-off-by: Xiao Yang <yangx.jy@cn.fujitsu.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
* initramfs: restore default compression behaviorEugeniy Paltsev2020-03-171-11/+11
| | | | | | | | | | | | | | | | | | | | | | | | | Even though INITRAMFS_SOURCE kconfig option isn't set in most of defconfigs it is used (set) extensively by various build systems. Commit f26661e12765 ("initramfs: make initramfs compression choice non-optional") has changed default compression mode. Previously we compress initramfs using available compression algorithm. Now we don't use any compression at all by default. It significantly increases the image size in case of build system chooses embedded initramfs. Initially I faced with this issue while using buildroot. As of today it's not possible to set preferred compression mode in target defconfig as this option depends on INITRAMFS_SOURCE being set. Modification of all build systems either doesn't look like good option. Let's instead rewrite initramfs compression mode choices list the way that "INITRAMFS_COMPRESSION_NONE" will be the last option in the list. In that case it will be chosen only if all other options (which implements any compression) are not available. Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
* modpost: move the namespace field in Module.symvers lastJessica Yu2020-03-173-15/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to preserve backwards compatability with kmod tools, we have to move the namespace field in Module.symvers last, as the depmod -e -E option looks at the first three fields in Module.symvers to check symbol versions (and it's expected they stay in the original order of crc, symbol, module). In addition, update an ancient comment above read_dump() in modpost that suggested that the export type field in Module.symvers was optional. I suspect that there were historical reasons behind that comment that are no longer accurate. We have been unconditionally printing the export type since 2.6.18 (commit bd5cbcedf44), which is over a decade ago now. Fix up read_dump() to treat each field as non-optional. I suspect the original read_dump() code treated the export field as optional in order to support pre <= 2.6.18 Module.symvers (which did not have the export type field). Note that although symbol namespaces are optional, the field will not be omitted from Module.symvers if a symbol does not have a namespace. In this case, the field will simply be empty and the next delimiter or end of line will follow. Cc: stable@vger.kernel.org Fixes: cb9b55d21fe0 ("modpost: add support for symbol namespaces") Tested-by: Matthias Maennich <maennich@google.com> Reviewed-by: Matthias Maennich <maennich@google.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Jessica Yu <jeyu@kernel.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
* kbuild: Disable -Wpointer-to-enum-castNathan Chancellor2020-03-141-0/+1
| | | | | | | | | | | | | | | | | | | | | | | Clang's -Wpointer-to-int-cast deviates from GCC in that it warns when casting to enums. The kernel does this in certain places, such as device tree matches to set the version of the device being used, which allows the kernel to avoid using a gigantic union. https://elixir.bootlin.com/linux/v5.5.8/source/drivers/ata/ahci_brcm.c#L428 https://elixir.bootlin.com/linux/v5.5.8/source/drivers/ata/ahci_brcm.c#L402 https://elixir.bootlin.com/linux/v5.5.8/source/include/linux/mod_devicetable.h#L264 To avoid a ton of false positive warnings, disable this particular part of the warning, which has been split off into a separate diagnostic so that the entire warning does not need to be turned off for clang. It will be visible under W=1 in case people want to go about fixing these easily and enabling the warning treewide. Cc: stable@vger.kernel.org Link: https://github.com/ClangBuiltLinux/linux/issues/887 Link: https://github.com/llvm/llvm-project/commit/2a41b31fcdfcb67ab7038fc2ffb606fd50b83a84 Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
* kbuild: doc: fix references to other documentsMasahiro Yamada2020-03-133-5/+5
| | | | | | All the files in Documentation/kbuild/ were converted to reST. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
* int128: fix __uint128_t compiler test in KconfigMasahiro Yamada2020-03-111-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | The support for __uint128_t is dependent on the target bit size. GCC that defaults to the 32-bit can still build the 64-bit kernel with -m64 flag passed. However, $(cc-option,-D__SIZEOF_INT128__=0) is evaluated against the default machine bit, which may not match to the kernel it is building. Theoretically, this could be evaluated separately for 64BIT/32BIT. config CC_HAS_INT128 bool default !$(cc-option,$(m64-flag) -D__SIZEOF_INT128__=0) if 64BIT default !$(cc-option,$(m32-flag) -D__SIZEOF_INT128__=0) I simplified it more because the 32-bit compiler is unlikely to support __uint128_t. Fixes: c12d3362a74b ("int128: move __uint128_t compiler test to Kconfig") Reported-by: George Spelvin <lkml@sdf.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Tested-by: George Spelvin <lkml@sdf.org>
* kconfig: introduce m32-flag and m64-flagMasahiro Yamada2020-03-111-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a compiler supports multiple architectures, some compiler features can be dependent on the target architecture. This is typical for Clang, which supports multiple LLVM backends. Even for GCC, we need to take care of biarch compiler cases. It is not a problem when we evaluate cc-option in Makefiles because cc-option is tested against the flag in question + $(KBUILD_CFLAGS). The cc-option in Kconfig, on the other hand, does not accumulate tested flags. Due to this simplification, it could potentially test cc-option against a different target. At first, Kconfig always evaluated cc-option against the host architecture. Since commit e8de12fb7cde ("kbuild: Check for unknown options with cc-option usage in Kconfig and clang"), in case of cross-compiling with Clang, the target triple is correctly passed to Kconfig. The case with biarch GCC (and native build with Clang) is still not handled properly. We need to pass some flags to specify the target machine bit. Due to the design, all the macros in Kconfig are expanded in the parse stage, where we do not know the target bit size yet. For example, arch/x86/Kconfig allows a user to toggle CONFIG_64BIT. If a compiler flag -foo depends on the machine bit, it must be tested twice, one with -m32 and the other with -m64. However, -m32/-m64 are not always recognized. So, this commits adds m64-flag and m32-flag macros. They expand to -m32, -m64, respectively if supported. Or, they expand to an empty string if unsupported. The typical usage is like this: config FOO bool default $(cc-option,$(m64-flag) -foo) if 64BIT default $(cc-option,$(m32-flag) -foo) This is clumsy, but there is no elegant way to handle this in the current static macro expansion. There was discussion for static functions vs dynamic functions. The consensus was to go as far as possible with the static functions. (https://lkml.org/lkml/2018/3/2/22) Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Tested-by: George Spelvin <lkml@sdf.org> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
* kbuild: Fix inconsistent commentSZ Lin (林上智)2020-03-111-1/+1
| | | | | | | | | | The commit 2042b5486bd3 ("kbuild: unset variables in top Makefile instead of setting 0") renamed the variable from "config-targets" to "config-build", the comment should be consistent accordingly. Signed-off-by: Kaiden PK Yu (余泊鎧) <KaidenPK.Yu@moxa.com> Signed-off-by: SZ Lin (林上智) <sz.lin@moxa.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
* Linux 5.6-rc4v5.6-rc4Linus Torvalds2020-03-011-1/+1
|
* Merge tag 'ext4_for_linus_stable' of ↵Linus Torvalds2020-03-012-7/+7
|\ | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 fixes from Ted Ts'o: "Two more bug fixes (including a regression) for 5.6" * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: potential crash on allocation error in ext4_alloc_flex_bg_array() jbd2: fix data races at struct journal_head
| * ext4: potential crash on allocation error in ext4_alloc_flex_bg_array()Dan Carpenter2020-02-291-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If sbi->s_flex_groups_allocated is zero and the first allocation fails then this code will crash. The problem is that "i--" will set "i" to -1 but when we compare "i >= sbi->s_flex_groups_allocated" then the -1 is type promoted to unsigned and becomes UINT_MAX. Since UINT_MAX is more than zero, the condition is true so we call kvfree(new_groups[-1]). The loop will carry on freeing invalid memory until it crashes. Fixes: 7c990728b99e ("ext4: fix potential race between s_flex_groups online resizing and access") Reviewed-by: Suraj Jitindar Singh <surajjs@amazon.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: stable@kernel.org Link: https://lore.kernel.org/r/20200228092142.7irbc44yaz3by7nb@kili.mountain Signed-off-by: Theodore Ts'o <tytso@mit.edu>
| * jbd2: fix data races at struct journal_headQian Cai2020-02-291-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | journal_head::b_transaction and journal_head::b_next_transaction could be accessed concurrently as noticed by KCSAN, LTP: starting fsync04 /dev/zero: Can't open blockdev EXT4-fs (loop0): mounting ext3 file system using the ext4 subsystem EXT4-fs (loop0): mounted filesystem with ordered data mode. Opts: (null) ================================================================== BUG: KCSAN: data-race in __jbd2_journal_refile_buffer [jbd2] / jbd2_write_access_granted [jbd2] write to 0xffff99f9b1bd0e30 of 8 bytes by task 25721 on cpu 70: __jbd2_journal_refile_buffer+0xdd/0x210 [jbd2] __jbd2_journal_refile_buffer at fs/jbd2/transaction.c:2569 jbd2_journal_commit_transaction+0x2d15/0x3f20 [jbd2] (inlined by) jbd2_journal_commit_transaction at fs/jbd2/commit.c:1034 kjournald2+0x13b/0x450 [jbd2] kthread+0x1cd/0x1f0 ret_from_fork+0x27/0x50 read to 0xffff99f9b1bd0e30 of 8 bytes by task 25724 on cpu 68: jbd2_write_access_granted+0x1b2/0x250 [jbd2] jbd2_write_access_granted at fs/jbd2/transaction.c:1155 jbd2_journal_get_write_access+0x2c/0x60 [jbd2] __ext4_journal_get_write_access+0x50/0x90 [ext4] ext4_mb_mark_diskspace_used+0x158/0x620 [ext4] ext4_mb_new_blocks+0x54f/0xca0 [ext4] ext4_ind_map_blocks+0xc79/0x1b40 [ext4] ext4_map_blocks+0x3b4/0x950 [ext4] _ext4_get_block+0xfc/0x270 [ext4] ext4_get_block+0x3b/0x50 [ext4] __block_write_begin_int+0x22e/0xae0 __block_write_begin+0x39/0x50 ext4_write_begin+0x388/0xb50 [ext4] generic_perform_write+0x15d/0x290 ext4_buffered_write_iter+0x11f/0x210 [ext4] ext4_file_write_iter+0xce/0x9e0 [ext4] new_sync_write+0x29c/0x3b0 __vfs_write+0x92/0xa0 vfs_write+0x103/0x260 ksys_write+0x9d/0x130 __x64_sys_write+0x4c/0x60 do_syscall_64+0x91/0xb05 entry_SYSCALL_64_after_hwframe+0x49/0xbe 5 locks held by fsync04/25724: #0: ffff99f9911093f8 (sb_writers#13){.+.+}, at: vfs_write+0x21c/0x260 #1: ffff99f9db4c0348 (&sb->s_type->i_mutex_key#15){+.+.}, at: ext4_buffered_write_iter+0x65/0x210 [ext4] #2: ffff99f5e7dfcf58 (jbd2_handle){++++}, at: start_this_handle+0x1c1/0x9d0 [jbd2] #3: ffff99f9db4c0168 (&ei->i_data_sem){++++}, at: ext4_map_blocks+0x176/0x950 [ext4] #4: ffffffff99086b40 (rcu_read_lock){....}, at: jbd2_write_access_granted+0x4e/0x250 [jbd2] irq event stamp: 1407125 hardirqs last enabled at (1407125): [<ffffffff980da9b7>] __find_get_block+0x107/0x790 hardirqs last disabled at (1407124): [<ffffffff980da8f9>] __find_get_block+0x49/0x790 softirqs last enabled at (1405528): [<ffffffff98a0034c>] __do_softirq+0x34c/0x57c softirqs last disabled at (1405521): [<ffffffff97cc67a2>] irq_exit+0xa2/0xc0 Reported by Kernel Concurrency Sanitizer on: CPU: 68 PID: 25724 Comm: fsync04 Tainted: G L 5.6.0-rc2-next-20200221+ #7 Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019 The plain reads are outside of jh->b_state_lock critical section which result in data races. Fix them by adding pairs of READ|WRITE_ONCE(). Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Qian Cai <cai@lca.pw> Link: https://lore.kernel.org/r/20200222043111.2227-1-cai@lca.pw Signed-off-by: Theodore Ts'o <tytso@mit.edu>
* | Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2020-03-0122-107/+171
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull KVM fixes from Paolo Bonzini: "More bugfixes, including a few remaining "make W=1" issues such as too large frame sizes on some configurations. On the ARM side, the compiler was messing up shadow stacks between EL1 and EL2 code, which is easily fixed with __always_inline" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: VMX: check descriptor table exits on instruction emulation kvm: x86: Limit the number of "kvm: disabled by bios" messages KVM: x86: avoid useless copy of cpufreq policy KVM: allow disabling -Werror KVM: x86: allow compiling as non-module with W=1 KVM: Pre-allocate 1 cpumask variable per cpu for both pv tlb and pv ipis KVM: Introduce pv check helpers KVM: let declaration of kvm_get_running_vcpus match implementation KVM: SVM: allocate AVIC data structures based on kvm_amd module parameter arm64: Ask the compiler to __always_inline functions used by KVM at HYP KVM: arm64: Define our own swab32() to avoid a uapi static inline KVM: arm64: Ask the compiler to __always_inline functions used at HYP kvm: arm/arm64: Fold VHE entry/exit work into kvm_vcpu_run_vhe() KVM: arm/arm64: Fix up includes for trace.h
| * | KVM: VMX: check descriptor table exits on instruction emulationOliver Upton2020-03-011-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | KVM emulates UMIP on hardware that doesn't support it by setting the 'descriptor table exiting' VM-execution control and performing instruction emulation. When running nested, this emulation is broken as KVM refuses to emulate L2 instructions by default. Correct this regression by allowing the emulation of descriptor table instructions if L1 hasn't requested 'descriptor table exiting'. Fixes: 07721feee46b ("KVM: nVMX: Don't emulate instructions in guest mode") Reported-by: Jan Kiszka <jan.kiszka@web.de> Cc: stable@vger.kernel.org Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Jim Mattson <jmattson@google.com> Signed-off-by: Oliver Upton <oupton@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | Merge tag 'kvmarm-fixes-5.6-1' of ↵Paolo Bonzini2020-02-2815-77/+84
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm fixes for 5.6, take #1 - Fix compilation on 32bit - Move VHE guest entry/exit into the VHE-specific entry code - Make sure all functions called by the non-VHE HYP code is tagged as __always_inline
| | * | arm64: Ask the compiler to __always_inline functions used by KVM at HYPJames Morse2020-02-224-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | KVM uses some of the static-inline helpers like icache_is_vipt() from its HYP code. This assumes the function is inlined so that the code is mapped to EL2. The compiler may decide not to inline these, and the out-of-line version may not be in the __hyp_text section. Add the additional __always_ hint to these static-inlines that are used by KVM. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20200220165839.256881-4-james.morse@arm.com
| | * | KVM: arm64: Define our own swab32() to avoid a uapi static inlineJames Morse2020-02-222-2/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | KVM uses swab32() when mediating GIC MMIO accesses if the GICV is badly aligned, and the host and guest differ in endianness. arm64 doesn't provide a __arch_swab32(), so __fswab32() is always backed by the macro implementation that the compiler reduces to a single instruction. But the static-inline causes problems for KVM if the compiler chooses not to inline this function, it may not be located in the __hyp_text where __vgic_v2_perform_cpuif_access() needs it. Create our own __kvm_swab32() macro that calls ___constant_swab32() directly. This way we know it will always be inlined. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200220165839.256881-3-james.morse@arm.com
| | * | KVM: arm64: Ask the compiler to __always_inline functions used at HYPJames Morse2020-02-225-28/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On non VHE CPUs, KVM's __hyp_text contains code run at EL2 while the rest of the kernel runs at EL1. This code lives in its own section with start and end markers so we can map it to EL2. The compiler may decide not to inline static-inline functions from the header file. It may also decide not to put these out-of-line functions in the same section, meaning they aren't mapped when called at EL2. Clang-9 does exactly this with __kern_hyp_va() and a few others when x18 is reserved for the shadow call stack. Add the additional __always_ hint to all the static-inlines that are called from a hyp file. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200220165839.256881-2-james.morse@arm.com ---- kvm_get_hyp_vector() pulls in all the regular per-cpu accessors and this_cpu_has_cap(), fortunately its only called for VHE.
| | * | kvm: arm/arm64: Fold VHE entry/exit work into kvm_vcpu_run_vhe()Mark Rutland2020-02-174-39/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With VHE, running a vCPU always requires the sequence: 1. kvm_arm_vhe_guest_enter(); 2. kvm_vcpu_run_vhe(); 3. kvm_arm_vhe_guest_exit() ... and as we invoke this from the shared arm/arm64 KVM code, 32-bit arm has to provide stubs for all three functions. To simplify the common code, and make it easier to make further modifications to the arm64-specific portions in the near future, let's fold kvm_arm_vhe_guest_enter() and kvm_arm_vhe_guest_exit() into kvm_vcpu_run_vhe(). The 32-bit stubs for kvm_arm_vhe_guest_enter() and kvm_arm_vhe_guest_exit() are removed, as they are no longer used. The 32-bit stub for kvm_vcpu_run_vhe() is left as-is. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200210114757.2889-1-mark.rutland@arm.com
| | * | KVM: arm/arm64: Fix up includes for trace.hJeremy Cline2020-02-051-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fedora kernel builds on armv7hl began failing recently because kvm_arm_exception_type and kvm_arm_exception_class were undeclared in trace.h. Add the missing include. Fixes: 0e20f5e25556 ("KVM: arm/arm64: Cleanup MMIO handling") Signed-off-by: Jeremy Cline <jcline@redhat.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200205134146.82678-1-jcline@redhat.com
| * | | kvm: x86: Limit the number of "kvm: disabled by bios" messagesErwan Velu2020-02-281-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In older version of systemd(219), at boot time, udevadm is called with : /usr/bin/udevadm trigger --type=devices --action=add" This program generates an echo "add" in /sys/devices/system/cpu/cpu<x>/uevent, leading to the "kvm: disabled by bios" message in case of your Bios disabled the virtualization extensions. On a modern system running up to 256 CPU threads, this pollutes the Kernel logs. This patch offers to ratelimit this message to avoid any userspace program triggering this uevent printing this message too often. This patch is only a workaround but greatly reduce the pollution without breaking the current behavior of printing a message if some try to instantiate KVM on a system that doesn't support it. Note that recent versions of systemd (>239) do not have trigger this behavior. This patch will be useful at least for some using older systemd with recent Kernels. Signed-off-by: Erwan Velu <e.velu@criteo.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | | KVM: x86: avoid useless copy of cpufreq policyPaolo Bonzini2020-02-281-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | struct cpufreq_policy is quite big and it is not a good idea to allocate one on the stack. Just use cpufreq_cpu_get and cpufreq_cpu_put which is even simpler. Reported-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | | KVM: allow disabling -WerrorPaolo Bonzini2020-02-282-1/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Restrict -Werror to well-tested configurations and allow disabling it via Kconfig. Reported-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | | KVM: x86: allow compiling as non-module with W=1Valdis Klētnieks2020-02-282-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Compile error with CONFIG_KVM_INTEL=y and W=1: CC arch/x86/kvm/vmx/vmx.o arch/x86/kvm/vmx/vmx.c:68:32: error: 'vmx_cpu_id' defined but not used [-Werror=unused-const-variable=] 68 | static const struct x86_cpu_id vmx_cpu_id[] = { | ^~~~~~~~~~ cc1: all warnings being treated as errors When building with =y, the MODULE_DEVICE_TABLE macro doesn't generate a reference to the structure (or any code at all). This makes W=1 compiles unhappy. Wrap both in a #ifdef to avoid the issue. Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu> [Do the same for CONFIG_KVM_AMD. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | | KVM: Pre-allocate 1 cpumask variable per cpu for both pv tlb and pv ipisWanpeng Li2020-02-281-12/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Nick Desaulniers Reported: When building with: $ make CC=clang arch/x86/ CFLAGS=-Wframe-larger-than=1000 The following warning is observed: arch/x86/kernel/kvm.c:494:13: warning: stack frame size of 1064 bytes in function 'kvm_send_ipi_mask_allbutself' [-Wframe-larger-than=] static void kvm_send_ipi_mask_allbutself(const struct cpumask *mask, int vector) ^ Debugging with: https://github.com/ClangBuiltLinux/frame-larger-than via: $ python3 frame_larger_than.py arch/x86/kernel/kvm.o \ kvm_send_ipi_mask_allbutself points to the stack allocated `struct cpumask newmask` in `kvm_send_ipi_mask_allbutself`. The size of a `struct cpumask` is potentially large, as it's CONFIG_NR_CPUS divided by BITS_PER_LONG for the target architecture. CONFIG_NR_CPUS for X86_64 can be as high as 8192, making a single instance of a `struct cpumask` 1024 B. This patch fixes it by pre-allocate 1 cpumask variable per cpu and use it for both pv tlb and pv ipis.. Reported-by: Nick Desaulniers <ndesaulniers@google.com> Acked-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | | KVM: Introduce pv check helpersWanpeng Li2020-02-281-10/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce some pv check helpers for consistency. Suggested-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | | KVM: let declaration of kvm_get_running_vcpus match implementationChristian Borntraeger2020-02-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Sparse notices that declaration and implementation do not match: arch/s390/kvm/../../../virt/kvm/kvm_main.c:4435:17: warning: incorrect type in return expression (different address spaces) arch/s390/kvm/../../../virt/kvm/kvm_main.c:4435:17: expected struct kvm_vcpu [noderef] <asn:3> ** arch/s390/kvm/../../../virt/kvm/kvm_main.c:4435:17: got struct kvm_vcpu *[noderef] <asn:3> * Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | | KVM: SVM: allocate AVIC data structures based on kvm_amd module parameterPaolo Bonzini2020-02-281-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Even if APICv is disabled at startup, the backing page and ir_list need to be initialized in case they are needed later. The only case in which this can be skipped is for userspace irqchip, and that must be done because avic_init_backing_page dereferences vcpu->arch.apic (which is NULL for userspace irqchip). Tested-by: rmuncrief@humanavance.com Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=206579 Reviewed-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | | | Merge branch 'i2c/for-current-fixed' of ↵Linus Torvalds2020-03-013-56/+34
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "I2C has three driver bugfixes for you. We agreed on the Mac regression to go in via I2C" * 'i2c/for-current-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: macintosh: therm_windtunnel: fix regression when instantiating devices i2c: altera: Fix potential integer overflow i2c: jz4780: silence log flood on txabrt
| * | | | macintosh: therm_windtunnel: fix regression when instantiating devicesWolfram Sang2020-02-291-21/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Removing attach_adapter from this driver caused a regression for at least some machines. Those machines had the sensors described in their DT, too, so they didn't need manual creation of the sensor devices. The old code worked, though, because manual creation came first. Creation of DT devices then failed later and caused error logs, but the sensors worked nonetheless because of the manually created devices. When removing attach_adaper, manual creation now comes later and loses the race. The sensor devices were already registered via DT, yet with another binding, so the driver could not be bound to it. This fix refactors the code to remove the race and only manually creates devices if there are no DT nodes present. Also, the DT binding is updated to match both, the DT and manually created devices. Because we don't know which device creation will be used at runtime, the code to start the kthread is moved to do_probe() which will be called by both methods. Fixes: 3e7bed52719d ("macintosh: therm_windtunnel: drop using attach_adapter") Link: https://bugzilla.kernel.org/show_bug.cgi?id=201723 Reported-by: Erhard Furtner <erhard_f@mailbox.org> Tested-by: Erhard Furtner <erhard_f@mailbox.org> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Cc: stable@kernel.org # v4.19+
| * | | | i2c: altera: Fix potential integer overflowGustavo A. R. Silva2020-02-131-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Factor out 100 from the equation and do 32-bit arithmetic (3 * clk_mhz / 10) instead of 64-bit. Notice that clk_mhz is MHz, so the multiplication will never wrap 32 bits and there is no need for div_u64(). Addresses-Coverity: 1458369 ("Unintentional integer overflow") Fixes: 0560ad576268 ("i2c: altera: Add Altera I2C Controller driver") Suggested-by: David Laight <David.Laight@ACULAB.COM> Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Reviewed-by: Thor Thayer <thor.thayer@linux.intel.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
| * | | | i2c: jz4780: silence log flood on txabrtWolfram Sang2020-02-131-34/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The printout for txabrt is way too talkative and is highly annoying with scanning programs like 'i2cdetect'. Reduce it to the minimum, the rest can be gained by I2C core debugging and datasheet information. Also, make it a debug printout, it won't help the regular user. Fixes: ba92222ed63a ("i2c: jz4780: Add i2c bus controller driver for Ingenic JZ4780") Reported-by: H. Nikolaus Schaller <hns@goldelico.com> Tested-by: H. Nikolaus Schaller <hns@goldelico.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
* | | | | Merge tag 'scsi-fixes' of ↵Linus Torvalds2020-02-298-7/+14
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Four small fixes. Three are in drivers for fairly obvious bugs. The fourth is a set of regressions introduced by the compat_ioctl changes because some of the compat updates wrongly replaced .ioctl instead of .compat_ioctl" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: compat_ioctl: cdrom: Replace .ioctl with .compat_ioctl in four appropriate places scsi: zfcp: fix wrong data and display format of SFP+ temperature scsi: sd_sbc: Fix sd_zbc_report_zones() scsi: libfc: free response frame from GPN_ID
| * | | | | scsi: compat_ioctl: cdrom: Replace .ioctl with .compat_ioctl in four ↵Adam Williamson2020-02-244-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | appropriate places Arnd Bergmann inadvertently typoed these in d320a9551e394 and 64cbfa96551a; they seem to be the cause of https://bugzilla.redhat.com/show_bug.cgi?id=1801353 , invalid SCSI commands when udev tries to query a DVD drive. [arnd] Found another instance of the same bug, also introduced in my compat_ioctl series. Link: https://bugzilla.redhat.com/show_bug.cgi?id=1801353 Link: https://lore.kernel.org/r/20200219165139.3467320-1-arnd@arndb.de Fixes: c103d6ee69f9 ("compat_ioctl: ide: floppy: add handler") Fixes: 64cbfa96551a ("compat_ioctl: move cdrom commands into cdrom.c") Fixes: d320a9551e39 ("compat_ioctl: scsi: move ioctl handling into drivers") Bisected-by: Chris Murphy <bugzilla@colorremedies.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Adam Williamson <awilliam@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * | | | | scsi: zfcp: fix wrong data and display format of SFP+ temperatureBenjamin Block2020-02-242-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When implementing support for retrieval of local diagnostic data from the FCP channel, the wrong data format was assumed for the temperature of the local SFP+ connector. The Fibre Channel Link Services (FC-LS-3) specification is not clear on the format of the stored integer, and only after consulting the SNIA specification SFF-8472 did we realize it is stored as two's complement. Thus, the used data and display format is wrong, and highly misleading for users when the temperature should drop below 0°C (however unlikely that may be). To fix this, change the data format in `struct fsf_qtcb_bottom_port` from unsigned to signed, and change the printf format string used to generate `zfcp_sysfs_adapter_diag_sfp_temperature_show()` from `%hu` to `%hd`. Link: https://lore.kernel.org/r/d6e3be5428da5c9490cfff4df7cae868bc9f1a7e.1582039501.git.bblock@linux.ibm.com Fixes: a10a61e807b0 ("scsi: zfcp: support retrieval of SFP Data via Exchange Port Data") Fixes: 6028f7c4cd87 ("scsi: zfcp: introduce sysfs interface for diagnostics of local SFP transceiver") Cc: <stable@vger.kernel.org> # 5.5+ Reviewed-by: Jens Remus <jremus@linux.ibm.com> Reviewed-by: Fedor Loshakov <loshakov@linux.ibm.com> Reviewed-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * | | | | scsi: sd_sbc: Fix sd_zbc_report_zones()Damien Le Moal2020-02-241-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The block layer generic blk_revalidate_disk_zones() checks the validity of zone descriptors reported by a disk using the blk_revalidate_zone_cb() callback function executed for each zone descriptor. If a ZBC disk reports invalid zone descriptors, blk_revalidate_disk_zones() returns an error and sd_zbc_read_zones() changes the disk capacity to 0, which in turn results in the gendisk structure capacity to be set to 0. This all works well for the first revalidate pass on a disk and the block layer detects the capactiy change. On the second revalidate pass, blk_revalidate_disk_zones() is called again and sd_zbc_report_zones() executed to check the zones a second time. However, for this second pass, the gendisk capacity is now 0, which results in sd_zbc_report_zones() to do nothing and to report success and no zones. blk_revalidate_disk_zones() in turn returns success and sets the disk queue chunk_sectors limit with zero as no zones were checked, causing a oops to trigger on the BUG_ON(!is_power_of_2(chunk_sectors)) in blk_queue_chunk_sectors(). Fix this by using the sdkp capacity field rather than the gendisk capacity for the report zones loop in sd_zbc_report_zones(). Also add a check to return immediately an error if the sdkp capacity is 0. With this fix, invalid/buggy ZBC disk scan does not trigger a oops and are exposed with a 0 capacity. This change also preserve the chance for the disk to be correctly revalidated on the second revalidate pass as the scsi disk structure capacity field is always set to the disk reported value when sd_zbc_report_zones() is called. Link: https://lore.kernel.org/r/20200219063800.880834-1-damien.lemoal@wdc.com Fixes: d41003513e61 ("block: rework zone reporting") Cc: Cc: <stable@vger.kernel.org> # v5.5 Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * | | | | scsi: libfc: free response frame from GPN_IDIgor Druzhinin2020-02-211-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | fc_disc_gpn_id_resp() should be the last function using it so free it here to avoid memory leak. Link: https://lore.kernel.org/r/1579013000-14570-2-git-send-email-igor.druzhinin@citrix.com Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* | | | | | Merge tag 'pci-v5.6-fixes-2' of ↵Linus Torvalds2020-02-282-2/+2
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fixes from Bjorn Helgaas: - Fix build issue on 32-bit ARM with old compilers (Marek Szyprowski) - Update MAINTAINERS for recent Cadence driver file move (Lukas Bulwahn) * tag 'pci-v5.6-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: MAINTAINERS: Correct Cadence PCI driver path PCI: brcmstb: Fix build on 32bit ARM platforms with older compilers
| * | | | | | MAINTAINERS: Correct Cadence PCI driver pathLukas Bulwahn2020-02-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | de80f95ccb9c ("PCI: cadence: Move all files to per-device cadence directory") moved files of the PCI cadence drivers, but did not update the MAINTAINERS entry. Since then, ./scripts/get_maintainer.pl --self-test complains: warning: no file matches F: drivers/pci/controller/pcie-cadence* Repair the MAINTAINERS entry. Link: https://lore.kernel.org/r/20200221185402.4703-1-lukas.bulwahn@gmail.com Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
| * | | | | | PCI: brcmstb: Fix build on 32bit ARM platforms with older compilersMarek Szyprowski2020-02-271-1/+1
| | |/ / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some older compilers have no implementation for the helper for 64-bit unsigned division/modulo, so linking pcie-brcmstb driver causes the "undefined reference to `__aeabi_uldivmod'" error. *rc_bar2_size is always a power of two, because it is calculated as: "1ULL << fls64(entry->res->end - entry->res->start)", so the modulo operation in the subsequent check can be replaced by a simple logical AND with a proper mask. Link: https://lore.kernel.org/r/20200227115146.24515-1-m.szyprowski@samsung.com Fixes: c0452137034b ("PCI: brcmstb: Add Broadcom STB PCIe host controller driver") Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de> Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
* | | | | | Merge tag 'block-5.6-2020-02-28' of git://git.kernel.dk/linux-blockLinus Torvalds2020-02-2812-70/+136
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull block fixes from Jens Axboe: - Passthrough insertion fix (Ming) - Kill off some unused arguments (John) - blktrace RCU fix (Jan) - Dead fields removal for null_blk (Dongli) - NVMe polled IO fix (Bijan) * tag 'block-5.6-2020-02-28' of git://git.kernel.dk/linux-block: nvme-pci: Hold cq_poll_lock while completing CQEs blk-mq: Remove some unused function arguments null_blk: remove unused fields in 'nullb_cmd' blktrace: Protect q->blk_trace with RCU blk-mq: insert passthrough request into hctx->dispatch directly
| * \ \ \ \ \ Merge branch 'nvme-5.6-rc4' of git://git.infradead.org/nvme into block-5.6Jens Axboe2020-02-281-1/+1
| |\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull NVMe fix from Keith. * 'nvme-5.6-rc4' of git://git.infradead.org/nvme: nvme-pci: Hold cq_poll_lock while completing CQEs
| | * | | | | | nvme-pci: Hold cq_poll_lock while completing CQEsBijan Mottahedeh2020-02-271-1/+1
| |/ / / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Completions need to consumed in the same order the controller submitted them, otherwise future completion entries may overwrite ones we haven't handled yet. Hold the nvme queue's poll lock while completing new CQEs to prevent another thread from freeing command tags for reuse out-of-order. Fixes: dabcefab45d3 ("nvme: provide optimized poll function for separate poll queues") Signed-off-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Keith Busch <kbusch@kernel.org>
| * | | | | | blk-mq: Remove some unused function argumentsJohn Garry2020-02-264-11/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The struct blk_mq_hw_ctx pointer argument in blk_mq_put_tag(), blk_mq_poll_nsecs(), and blk_mq_poll_hybrid_sleep() is unused, so remove it. Overall obj code size shows a minor reduction, before: text data bss dec hex filename 27306 1312 0 28618 6fca block/blk-mq.o 4303 272 0 4575 11df block/blk-mq-tag.o after: 27282 1312 0 28594 6fb2 block/blk-mq.o 4311 272 0 4583 11e7 block/blk-mq-tag.o Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: John Garry <john.garry@huawei.com> -- This minor patch had been carried as part of the blk-mq shared tags RFC, I'd rather not carry it anymore as it required rebasing, so now or never.. Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | | | | | null_blk: remove unused fields in 'nullb_cmd'Dongli Zhang2020-02-252-5/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 'list', 'll_list' and 'csd' are no longer used. The 'list' is not used since it was introduced by commit f2298c0403b0 ("null_blk: multi queue aware block test driver"). The 'll_list' is no longer used since commit 3c395a969acc ("null_blk: set a separate timer for each command"). The 'csd' is no longer used since commit ce2c350b2cfe ("null_blk: use blk_complete_request and blk_mq_complete_request"). Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | | | | | blktrace: Protect q->blk_trace with RCUJan Kara2020-02-253-37/+97
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | KASAN is reporting that __blk_add_trace() has a use-after-free issue when accessing q->blk_trace. Indeed the switching of block tracing (and thus eventual freeing of q->blk_trace) is completely unsynchronized with the currently running tracing and thus it can happen that the blk_trace structure is being freed just while __blk_add_trace() works on it. Protect accesses to q->blk_trace by RCU during tracing and make sure we wait for the end of RCU grace period when shutting down tracing. Luckily that is rare enough event that we can afford that. Note that postponing the freeing of blk_trace to an RCU callback should better be avoided as it could have unexpected user visible side-effects as debugfs files would be still existing for a short while block tracing has been shut down. Link: https://bugzilla.kernel.org/show_bug.cgi?id=205711 CC: stable@vger.kernel.org Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Tested-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reported-by: Tristan Madani <tristmd@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | | | | | blk-mq: insert passthrough request into hctx->dispatch directlyMing Lei2020-02-254-16/+29
| | |_|_|_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For some reason, device may be in one situation which can't handle FS request, so STS_RESOURCE is always returned and the FS request will be added to hctx->dispatch. However passthrough request may be required at that time for fixing the problem. If passthrough request is added to scheduler queue, there isn't any chance for blk-mq to dispatch it given we prioritize requests in hctx->dispatch. Then the FS IO request may never be completed, and IO hang is caused. So passthrough request has to be added to hctx->dispatch directly for fixing the IO hang. Fix this issue by inserting passthrough request into hctx->dispatch directly together withing adding FS request to the tail of hctx->dispatch in blk_mq_dispatch_rq_list(). Actually we add FS request to tail of hctx->dispatch at default, see blk_mq_request_bypass_insert(). Then it becomes consistent with original legacy IO request path, in which passthrough request is always added to q->queue_head. Cc: Dongli Zhang <dongli.zhang@oracle.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* | | | | | Merge tag 'io_uring-5.6-2020-02-28' of git://git.kernel.dk/linux-blockLinus Torvalds2020-02-283-91/+74
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull io_uring fixes from Jens Axboe: - Fix for a race with IOPOLL used with SQPOLL (Xiaoguang) - Only show ->fdinfo if procfs is enabled (Tobias) - Fix for a chain with multiple personalities in the SQEs - Fix for a missing free of personality idr on exit - Removal of the spin-for-work optimization - Fix for next work lookup on request completion - Fix for non-vec read/write result progation in case of links - Fix for a fileset references on switch - Fix for a recvmsg/sendmsg 32-bit compatability mode * tag 'io_uring-5.6-2020-02-28' of git://git.kernel.dk/linux-block: io_uring: fix 32-bit compatability with sendmsg/recvmsg io_uring: define and set show_fdinfo only if procfs is enabled io_uring: drop file set ref put/get on switch io_uring: import_single_range() returns 0/-ERROR io_uring: pick up link work on submit reference drop io-wq: ensure work->task_pid is cleared on init io-wq: remove spin-for-work optimization io_uring: fix poll_list race for SETUP_IOPOLL|SETUP_SQPOLL io_uring: fix personality idr leak io_uring: handle multiple personalities in link chains
| * | | | | | io_uring: fix 32-bit compatability with sendmsg/recvmsgJens Axboe2020-02-271-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We must set MSG_CMSG_COMPAT if we're in compatability mode, otherwise the iovec import for these commands will not do the right thing and fail the command with -EINVAL. Found by running the test suite compiled as 32-bit. Cc: stable@vger.kernel.org Fixes: aa1fa28fc73e ("io_uring: add support for recvmsg()") Fixes: 0fa03c624d8f ("io_uring: add support for sendmsg()") Signed-off-by: Jens Axboe <axboe@kernel.dk>