linux - linux

	Commit message (Collapse)	Author	Files	Lines
2020-12-14	ceph: remove redundant assignment to variable i	Colin Ian King	1	-1/+1
	The variable i is being initialized with a value that is never read and it is being updated later with a new value in a for-loop. The initialization is redundant and can be removed. Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-14	ceph: add ceph.caps vxattr	Luis Henriques	1	-0/+27
	Add a new vxattr that allows userspace to list the caps for a specific directory or file. [ jlayton: change format delimiter to '/' ] Signed-off-by: Luis Henriques <lhenriques@suse.de> Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-14	ceph: when filling trace, call ceph_get_inode outside of mutexes	Jeff Layton	2	-9/+21
	Geng Jichao reported a rather complex deadlock involving several moving parts: 1) readahead is issued against an inode and some of its pages are locked while the read is in flight 2) the same inode is evicted from the cache, and this task gets stuck waiting for the page lock because of the above readahead 3) another task is processing a reply trace, and looks up the inode being evicted while holding the s_mutex. That ends up waiting for the eviction to complete 4) a write reply for an unrelated inode is then processed in the ceph_con_workfn job. It calls ceph_check_caps after putting wrbuffer caps, and that gets stuck waiting on the s_mutex held by 3. The reply to "1" is stuck behind the write reply in "4", so we deadlock at that point. This patch changes the trace processing to call ceph_get_inode outside of the s_mutex and snap_rwsem, which should break the cycle above. [ idryomov: break unnecessarily long lines ] URL: https://tracker.ceph.com/issues/47998 Reported-by: Geng Jichao <gengjichao@jd.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Luis Henriques <lhenriques@suse.de> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-14	Revert "ceph: allow rename operation under different quota realms"	Luis Henriques	3	-64/+6
	This reverts commit dffdcd71458e699e839f0bf47c3d42d64210b939. When doing a rename across quota realms, there's a corner case that isn't handled correctly. Here's a testcase: mkdir files limit truncate files/file -s 10G setfattr limit -n ceph.quota.max_bytes -v 1000000 mv files limit/ The above will succeed because ftruncate(2) won't immediately notify the MDSs with the new file size, and thus the quota realms stats won't be updated. Since the possible fixes for this issue would have a huge performance impact, the solution for now is to simply revert to returning -EXDEV when doing a cross quota realms rename. URL: https://tracker.ceph.com/issues/48203 Signed-off-by: Luis Henriques <lhenriques@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-14	ceph: fix inode refcount leak when ceph_fill_inode on non-I_NEW inode fails	Jeff Layton	1	-0/+2
	Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-14	ceph: downgrade warning from mdsmap decode to debug	Luis Henriques	1	-2/+2
	While the MDS cluster is unstable and changing state the client may get mdsmap updates that will trigger warnings: [144692.478400] ceph: mdsmap_decode got incorrect state(up:standby-replay) [144697.489552] ceph: mdsmap_decode got incorrect state(up:standby-replay) [144697.489580] ceph: mdsmap_decode got incorrect state(up:standby-replay) This patch downgrades these warnings to debug, as they may flood the logs if the cluster is unstable for a while. Signed-off-by: Luis Henriques <lhenriques@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-14	ceph: fix race in concurrent __ceph_remove_cap invocations	Luis Henriques	1	-2/+9
	A NULL pointer dereference may occur in __ceph_remove_cap with some of the callbacks used in ceph_iterate_session_caps, namely trim_caps_cb and remove_session_caps_cb. Those callers hold the session->s_mutex, so they are prevented from concurrent execution, but ceph_evict_inode does not. Since the callers of this function hold the i_ceph_lock, the fix is simply a matter of returning immediately if caps->ci is NULL. Cc: stable@vger.kernel.org URL: https://tracker.ceph.com/issues/43272 Suggested-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Luis Henriques <lhenriques@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-14	ceph: pass down the flags to grab_cache_page_write_begin	Jeff Layton	1	-1/+1
	write_begin operations are passed a flags parameter that we need to mirror here, so that we don't (e.g.) recurse back into filesystem code inappropriately. Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-14	ceph: add ceph.{cluster_fsid/client_id} vxattrs	Xiubo Li	1	-0/+42
	These two vxattrs will only exist in local client side, with which we can easily know which mountpoint the file belongs to and also they can help locate the debugfs path quickly. URL: https://tracker.ceph.com/issues/48057 Signed-off-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-14	ceph: add status debugfs file	Xiubo Li	2	-0/+21
	This will help list some useful client side info, like the client entity address/name and blocklisted status, etc. URL: https://tracker.ceph.com/issues/48057 Signed-off-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-14	libceph: remove unused port macros	Liu, Changcheng	1	-9/+0
	1. monitor's default port is defined by CEPH_MON_PORT 2. CEPH_PORT_START and CEPH_PORT_LAST are not needed. Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-14	ceph: ensure we have Fs caps when fetching dir link count	Jeff Layton	1	-5/+14
	The link count for a directory is defined as inode->i_subdirs + 2, (for "." and ".."). i_subdirs is only populated when Fs caps are held. Ensure we grab Fs caps when fetching the link count for a directory. [ idryomov: break unnecessarily long line ] URL: https://tracker.ceph.com/issues/48125 Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-14	ceph: send dentry lease metrics to MDS daemon	Xiubo Li	2	-3/+29
	For the old ceph version, if it received this one metric message containing the dentry lease metric info, it will just ignore it. URL: https://tracker.ceph.com/issues/43423 Signed-off-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-14	ceph: acquire Fs caps when getting dir stats	Jeff Layton	1	-3/+6
	We only update the inode's dirstats when we have Fs caps from the MDS. Declare a new VXATTR_FLAG_DIRSTAT that we set on all dirstats, and have the vxattr handling code acquire those caps when it's set. URL: https://tracker.ceph.com/issues/48104 Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Patrick Donnelly <pdonnell@redhat.com> Reviewed-by: Xiubo Li <xiubli@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-14	ceph: fix up some warnings on W=1 builds	Jeff Layton	3	-30/+14
	Convert some decodes into unused variables into skips, and fix up some non-kerneldoc comment headers to not start with "/**". Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-14	ceph: queue MDS requests to REJECTED sessions when CLEANRECOVER is set	Jeff Layton	1	-4/+14
	Ilya noticed that the first access to a blacklisted mount would often get back -EACCES, but then subsequent calls would be OK. The problem is in __do_request. If the session is marked as REJECTED, a hard error is returned instead of waiting for a new session to come into being. When the session is REJECTED and the mount was done with recover_session=clean, queue the request to the waiting_for_map queue, which will be awoken after tearing down the old session. We can only do this for sync requests though, so check for async ones first and just let the callers redrive a sync request. URL: https://tracker.ceph.com/issues/47385 Reported-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-14	ceph: remove timeout on allowing reconnect after blocklisting	Jeff Layton	2	-6/+0
	30 minutes is a long time to wait, and this makes it difficult to test the feature by manually blocklisting clients. Remove the timeout infrastructure and just allow the client to reconnect at will. Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-14	ceph: add new RECOVER mount_state when recovering session	Jeff Layton	6	-10/+17
	When recovering a session (a'la recover_session=clean), we want to do all of the operations that we do on a forced umount, but changing the mount state to SHUTDOWN is can cause queued MDS requests to fail when the session comes back. Most of those can idle until the session is recovered in this situation. Reserve SHUTDOWN state for forced umount, and make a new RECOVER state for the forced reconnect situation. Change several tests for equality with SHUTDOWN to test for that or RECOVER. Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-14	ceph: make fsc->mount_state an int	Jeff Layton	1	-1/+1
	This field is an unsigned long currently, which is a bit of a waste on most arches since this just holds an enum. Make it (signed) int instead. Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-14	ceph: don't WARN when removing caps due to blocklisting	Jeff Layton	1	-1/+2
	We expect to remove dirty caps when the client is blocklisted. Don't throw a warning in that case. [ idryomov: break unnecessarily long line ] Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-13	Linux 5.10v5.10	Linus Torvalds	1	-1/+1

2020-12-12	md: change mddev 'chunk_sectors' from int to unsigned	Mike Snitzer	1	-2/+2
	Commit e2782f560c29 ("Revert "dm raid: remove unnecessary discard limits for raid10"") exposed compiler warnings introduced by commit e0910c8e4f87 ("dm raid: fix discard limits for raid1 and raid10"): In file included from ./include/linux/kernel.h:14, from ./include/asm-generic/bug.h:20, from ./arch/x86/include/asm/bug.h:93, from ./include/linux/bug.h:5, from ./include/linux/mmdebug.h:5, from ./include/linux/gfp.h:5, from ./include/linux/slab.h:15, from drivers/md/dm-raid.c:8: drivers/md/dm-raid.c: In function ‘raid_io_hints’: ./include/linux/minmax.h:18:28: warning: comparison of distinct pointer types lacks a cast (!!(sizeof((typeof(x) )1 == (typeof(y) )1))) ^~ ./include/linux/minmax.h:32:4: note: in expansion of macro ‘__typecheck’ (__typecheck(x, y) && __no_side_effects(x, y)) ^~~~~~~~~~~ ./include/linux/minmax.h:42:24: note: in expansion of macro ‘__safe_cmp’ __builtin_choose_expr(__safe_cmp(x, y), \ ^~~~~~~~~~ ./include/linux/minmax.h:51:19: note: in expansion of macro ‘__careful_cmp’ #define min(x, y) __careful_cmp(x, y, <) ^~~~~~~~~~~~~ ./include/linux/minmax.h:84:39: note: in expansion of macro ‘min’ __x == 0 ? __y : ((__y == 0) ? __x : min(__x, __y)); }) ^~~ drivers/md/dm-raid.c:3739:33: note: in expansion of macro ‘min_not_zero’ limits->max_discard_sectors = min_not_zero(rs->md.chunk_sectors, ^~~~~~~~~~~~ Fix this by changing the chunk_sectors member of 'struct mddev' from int to 'unsigned int' to match the type used for the 'chunk_sectors' member of 'struct queue_limits'. Various MD code still uses 'int' but none of it appears to ever make use of signed int; and storing positive signed int in unsigned is perfectly safe. Reported-by: Song Liu <songliubraving@fb.com> Fixes: e2782f560c29 ("Revert "dm raid: remove unnecessary discard limits for raid10"") Fixes: e0910c8e4f87 ("dm raid: fix discard limits for raid1 and raid10") Cc: stable@vger,kernel.org # e0910c8e4f87 was marked for stable@ Signed-off-by: Mike Snitzer <snitzer@redhat.com> Reviewed-by: Song Liu <song@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-12-12	x86/kprobes: Fix optprobe to detect INT3 padding correctly	Masami Hiramatsu	1	-2/+20
	Commit 7705dc855797 ("x86/vmlinux: Use INT3 instead of NOP for linker fill bytes") changed the padding bytes between functions from NOP to INT3. However, when optprobe decodes a target function it finds INT3 and gives up the jump optimization. Instead of giving up any INT3 detection, check whether the rest of the bytes to the end of the function are INT3. If all of them are INT3, those come from the linker. In that case, continue the optprobe jump optimization. [ bp: Massage commit message. ] Fixes: 7705dc855797 ("x86/vmlinux: Use INT3 instead of NOP for linker fill bytes") Reported-by: Adam Zabrocki <pi3@pi3.com.pl> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/160767025681.3880685.16021570341428835411.stgit@devnote2
2020-12-12	Input: goodix - add upside-down quirk for Teclast X98 Pro tablet	Simon Beginn	1	-0/+12
	The touchscreen on the Teclast x98 Pro is also mounted upside-down in relation to the display orientation. Signed-off-by: Simon Beginn <linux@simonmicro.de> Signed-off-by: Bastien Nocera <hadess@hadess.net> Link: https://lore.kernel.org/r/20201117004253.27A5A27EFD@localhost Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2020-12-12	tools/kvm_stat: Exempt time-based counters	Stefan Raspl	1	-1/+5
	The new counters halt_poll_success_ns and halt_poll_fail_ns do not count events. Instead they provide a time, and mess up our statistics. Therefore, we should exclude them. Removal is currently implemented with an exempt list. If more counters like these appear, we can think about a more general rule like excluding all fields name "*_ns", in case that's a standing convention. Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Tested-and-reported-by: Christian Borntraeger <borntraeger@de.ibm.com> Message-Id: <20201208210829.101324-1-raspl@linux.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-12-12	KVM: mmu: Fix SPTE encoding of MMIO generation upper half	Maciej S. Szmigiero	3	-10/+21
	Commit cae7ed3c2cb0 ("KVM: x86: Refactor the MMIO SPTE generation handling") cleaned up the computation of MMIO generation SPTE masks, however it introduced a bug how the upper part was encoded: SPTE bits 52-61 were supposed to contain bits 10-19 of the current generation number, however a missing shift encoded bits 1-10 there instead (mostly duplicating the lower part of the encoded generation number that then consisted of bits 1-9). In the meantime, the upper part was shrunk by one bit and moved by subsequent commits to become an upper half of the encoded generation number (bits 9-17 of bits 0-17 encoded in a SPTE). In addition to the above, commit 56871d444bc4 ("KVM: x86: fix overlap between SPTE_MMIO_MASK and generation") has changed the SPTE bit range assigned to encode the generation number and the total number of bits encoded but did not update them in the comment attached to their defines, nor in the KVM MMU doc. Let's do it here, too, since it is too trivial thing to warrant a separate commit. Fixes: cae7ed3c2cb0 ("KVM: x86: Refactor the MMIO SPTE generation handling") Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Message-Id: <156700708db2a5296c5ed7a8b9ac71f1e9765c85.1607129096.git.maciej.szmigiero@oracle.com> Cc: stable@vger.kernel.org [Reorganize macros so that everything is computed from the bit ranges. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-12-11	bpf: Fix enum names for bpf_this_cpu_ptr() and bpf_per_cpu_ptr() helpers	Andrii Nakryiko	4	-8/+8
	Remove bpf_ prefix, which causes these helpers to be reported in verifier dump as bpf_bpf_this_cpu_ptr() and bpf_bpf_per_cpu_ptr(), respectively. Lets fix it as long as it is still possible before UAPI freezes on these helpers. Fixes: eaa6bcb71ef6 ("bpf: Introduce bpf_per_cpu_ptr()") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-11	mm/hugetlb: clear compound_nr before freeing gigantic pages	Gerald Schaefer	1	-0/+1
	Commit 1378a5ee451a ("mm: store compound_nr as well as compound_order") added compound_nr counter to first tail struct page, overlaying with page->mapping. The overlay itself is fine, but while freeing gigantic hugepages via free_contig_range(), a "bad page" check will trigger for non-NULL page->mapping on the first tail page: BUG: Bad page state in process bash pfn:380001 page:00000000c35f0856 refcount:0 mapcount:0 mapping:00000000126b68aa index:0x0 pfn:0x380001 aops:0x0 flags: 0x3ffff00000000000() raw: 3ffff00000000000 0000000000000100 0000000000000122 0000000100000000 raw: 0000000000000000 0000000000000000 ffffffff00000000 0000000000000000 page dumped because: non-NULL mapping Modules linked in: CPU: 6 PID: 616 Comm: bash Not tainted 5.10.0-rc7-next-20201208 #1 Hardware name: IBM 3906 M03 703 (LPAR) Call Trace: show_stack+0x6e/0xe8 dump_stack+0x90/0xc8 bad_page+0xd6/0x130 free_pcppages_bulk+0x26a/0x800 free_unref_page+0x6e/0x90 free_contig_range+0x94/0xe8 update_and_free_page+0x1c4/0x2c8 free_pool_huge_page+0x11e/0x138 set_max_huge_pages+0x228/0x300 nr_hugepages_store_common+0xb8/0x130 kernfs_fop_write+0xd2/0x218 vfs_write+0xb0/0x2b8 ksys_write+0xac/0xe0 system_call+0xe6/0x288 Disabling lock debugging due to kernel taint This is because only the compound_order is cleared in destroy_compound_gigantic_page(), and compound_nr is set to 1U << order == 1 for order 0 in set_compound_order(page, 0). Fix this by explicitly clearing compound_nr for first tail page after calling set_compound_order(page, 0). Link: https://lkml.kernel.org/r/20201208182813.66391-2-gerald.schaefer@linux.ibm.com Fixes: 1378a5ee451a ("mm: store compound_nr as well as compound_order") Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: <stable@vger.kernel.org> [5.9+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-11	kasan: fix object remaining in offline per-cpu quarantine	Kuan-Ying Lee	1	-0/+39
	We hit this issue in our internal test. When enabling generic kasan, a kfree()'d object is put into per-cpu quarantine first. If the cpu goes offline, object still remains in the per-cpu quarantine. If we call kmem_cache_destroy() now, slub will report "Objects remaining" error. ============================================================================= BUG test_module_slab (Not tainted): Objects remaining in test_module_slab on __kmem_cache_shutdown() ----------------------------------------------------------------------------- Disabling lock debugging due to kernel taint INFO: Slab 0x(____ptrval____) objects=34 used=1 fp=0x(____ptrval____) flags=0x2ffff00000010200 CPU: 3 PID: 176 Comm: cat Tainted: G B 5.10.0-rc1-00007-g4525c8781ec0-dirty #10 Hardware name: linux,dummy-virt (DT) Call trace: dump_backtrace+0x0/0x2b0 show_stack+0x18/0x68 dump_stack+0xfc/0x168 slab_err+0xac/0xd4 __kmem_cache_shutdown+0x1e4/0x3c8 kmem_cache_destroy+0x68/0x130 test_version_show+0x84/0xf0 module_attr_show+0x40/0x60 sysfs_kf_seq_show+0x128/0x1c0 kernfs_seq_show+0xa0/0xb8 seq_read+0x1f0/0x7e8 kernfs_fop_read+0x70/0x338 vfs_read+0xe4/0x250 ksys_read+0xc8/0x180 __arm64_sys_read+0x44/0x58 el0_svc_common.constprop.0+0xac/0x228 do_el0_svc+0x38/0xa0 el0_sync_handler+0x170/0x178 el0_sync+0x174/0x180 INFO: Object 0x(____ptrval____) @offset=15848 INFO: Allocated in test_version_show+0x98/0xf0 age=8188 cpu=6 pid=172 stack_trace_save+0x9c/0xd0 set_track+0x64/0xf0 alloc_debug_processing+0x104/0x1a0 ___slab_alloc+0x628/0x648 __slab_alloc.isra.0+0x2c/0x58 kmem_cache_alloc+0x560/0x588 test_version_show+0x98/0xf0 module_attr_show+0x40/0x60 sysfs_kf_seq_show+0x128/0x1c0 kernfs_seq_show+0xa0/0xb8 seq_read+0x1f0/0x7e8 kernfs_fop_read+0x70/0x338 vfs_read+0xe4/0x250 ksys_read+0xc8/0x180 __arm64_sys_read+0x44/0x58 el0_svc_common.constprop.0+0xac/0x228 kmem_cache_destroy test_module_slab: Slab cache still has objects Register a cpu hotplug function to remove all objects in the offline per-cpu quarantine when cpu is going offline. Set a per-cpu variable to indicate this cpu is offline. [qiang.zhang@windriver.com: fix slab double free when cpu-hotplug] Link: https://lkml.kernel.org/r/20201204102206.20237-1-qiang.zhang@windriver.com Link: https://lkml.kernel.org/r/1606895585-17382-2-git-send-email-Kuan-Ying.Lee@mediatek.com Signed-off-by: Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com> Signed-off-by: Zqiang <qiang.zhang@windriver.com> Suggested-by: Dmitry Vyukov <dvyukov@google.com> Reported-by: Guangye Yang <guangye.yang@mediatek.com> Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Alexander Potapenko <glider@google.com> Cc: Matthias Brugger <matthias.bgg@gmail.com> Cc: Nicholas Tang <nicholas.tang@mediatek.com> Cc: Miles Chen <miles.chen@mediatek.com> Cc: Qian Cai <qcai@redhat.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-11	elfcore: fix building with clang	Arnd Bergmann	3	-27/+22
	kernel/elfcore.c only contains weak symbols, which triggers a bug with clang in combination with recordmcount: Cannot find symbol for section 2: .text. kernel/elfcore.o: failed Move the empty stubs into linux/elfcore.h as inline functions. As only two architectures use these, just use the architecture specific Kconfig symbols to key off the declaration. Link: https://lkml.kernel.org/r/20201204165742.3815221-2-arnd@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Nathan Chancellor <natechancellor@gmail.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Barret Rhoden <brho@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-11	initramfs: fix clang build failure	Arnd Bergmann	1	-1/+1
	There is only one function in init/initramfs.c that is in the .text section, and it is marked __weak. When building with clang-12 and the integrated assembler, this leads to a bug with recordmcount: ./scripts/recordmcount "init/initramfs.o" Cannot find symbol for section 2: .text. init/initramfs.o: failed I'm not quite sure what exactly goes wrong, but I notice that this function is only ever called from an __init function, and normally inlined. Marking it __init as well is clearly correct and it leads to recordmcount no longer complaining. Link: https://lkml.kernel.org/r/20201204165742.3815221-1-arnd@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Nathan Chancellor <natechancellor@gmail.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Barret Rhoden <brho@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-11	kbuild: avoid static_assert for genksyms	Arnd Bergmann	1	-0/+5
	genksyms does not know or care about the _Static_assert() built-in, and sometimes falls back to ignoring the later symbols, which causes undefined behavior such as WARNING: modpost: EXPORT symbol "ethtool_set_ethtool_phy_ops" [vmlinux] version generation failed, symbol will not be versioned. ld: net/ethtool/common.o: relocation R_AARCH64_ABS32 against `__crc_ethtool_set_ethtool_phy_ops' can not be used when making a shared object net/ethtool/common.o:(_ftrace_annotated_branch+0x0): dangerous relocation: unsupported relocation Redefine static_assert for genksyms to avoid that. Link: https://lkml.kernel.org/r/20201203230955.1482058-1-arnd@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> Suggested-by: Ard Biesheuvel <ardb@kernel.org> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Michal Marek <michal.lkml@markovi.net> Cc: Kees Cook <keescook@chromium.org> Cc: Rikard Falkeborn <rikard.falkeborn@gmail.com> Cc: Marco Elver <elver@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-11	selftest/fpu: avoid clang warning	Arnd Bergmann	1	-1/+2
	With extra warnings enabled, clang complains about the redundant -mhard-float argument: clang: error: argument unused during compilation: '-mhard-float' [-Werror,-Wunused-command-line-argument] Move this into the gcc-only part of the Makefile. Link: https://lkml.kernel.org/r/20201203223652.1320700-1-arnd@kernel.org Fixes: 4185b3b92792 ("selftests/fpu: Add an FPU selftest") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Nathan Chancellor <natechancellor@gmail.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Petteri Aimonen <jpa@git.mail.kapsi.fi> Cc: Borislav Petkov <bp@suse.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-11	proc: use untagged_addr() for pagemap_read addresses	Miles Chen	1	-2/+6
	When we try to visit the pagemap of a tagged userspace pointer, we find that the start_vaddr is not correct because of the tag. To fix it, we should untag the userspace pointers in pagemap_read(). I tested with 5.10-rc4 and the issue remains. Explanation from Catalin in [1]: "Arguably, that's a user-space bug since tagged file offsets were never supported. In this case it's not even a tag at bit 56 as per the arm64 tagged address ABI but rather down to bit 47. You could say that the problem is caused by the C library (malloc()) or whoever created the tagged vaddr and passed it to this function. It's not a kernel regression as we've never supported it. Now, pagemap is a special case where the offset is usually not generated as a classic file offset but rather derived by shifting a user virtual address. I guess we can make a concession for pagemap (only) and allow such offset with the tag at bit (56 - PAGE_SHIFT + 3)" My test code is based on [2]: A userspace pointer which has been tagged by 0xb4: 0xb400007662f541c8 userspace program: uint64 OsLayer::VirtualToPhysical(void vaddr) { uint64 frame, paddr, pfnmask, pagemask; int pagesize = sysconf(_SC_PAGESIZE); off64_t off = ((uintptr_t)vaddr) / pagesize 8; // off = 0xb400007662f541c8 / pagesize * 8 = 0x5a00003b317aa0 int fd = open(kPagemapPath, O_RDONLY); ... if (lseek64(fd, off, SEEK_SET) != off \|\| read(fd, &frame, 8) != 8) { int err = errno; string errtxt = ErrorString(err); if (fd >= 0) close(fd); return 0; } ... } kernel fs/proc/task_mmu.c: static ssize_t pagemap_read(struct file file, char __user buf, size_t count, loff_t ppos) { ... src = ppos; svpfn = src / PM_ENTRY_BYTES; // svpfn == 0xb400007662f54 start_vaddr = svpfn << PAGE_SHIFT; // start_vaddr == 0xb400007662f54000 end_vaddr = mm->task_size; /* watch out for wraparound */ // svpfn == 0xb400007662f54 // (mm->task_size >> PAGE) == 0x8000000 if (svpfn > mm->task_size >> PAGE_SHIFT) // the condition is true because of the tag 0xb4 start_vaddr = end_vaddr; ret = 0; while (count && (start_vaddr < end_vaddr)) { // we cannot visit correct entry because start_vaddr is set to end_vaddr int len; unsigned long end; ... } ... } [1] https://lore.kernel.org/patchwork/patch/1343258/ [2] https://github.com/stressapptest/stressapptest/blob/master/src/os.cc#L158 Link: https://lkml.kernel.org/r/20201204024347.8295-1-miles.chen@mediatek.com Signed-off-by: Miles Chen <miles.chen@mediatek.com> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Marco Elver <elver@google.com> Cc: Will Deacon <will@kernel.org> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Song Bao Hua (Barry Song) <song.bao.hua@hisilicon.com> Cc: <stable@vger.kernel.org> [5.4-] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-11	revert "mm/filemap: add static for function __add_to_page_cache_locked"	Andrew Morton	1	-1/+1
	Revert commit 3351b16af494 ("mm/filemap: add static for function __add_to_page_cache_locked") due to incompatibility with ALLOW_ERROR_INJECTION which result in build errors. Link: https://lkml.kernel.org/r/CAADnVQJ6tmzBXvtroBuEH6QA0H+q7yaSKxrVvVxhqr3KBZdEXg@mail.gmail.com Tested-by: Justin Forbes <jmforbes@linuxtx.org> Tested-by: Greg Thelen <gthelen@google.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Cc: Michal Kubecek <mkubecek@suse.cz> Cc: Alex Shi <alex.shi@linux.alibaba.com> Cc: Souptick Joarder <jrdr.linux@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Tony Luck <tony.luck@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-11	Input: cm109 - do not stomp on control URB	Dmitry Torokhov	1	-2/+5
	We need to make sure we are not stomping on the control URB that was issued when opening the device when attempting to toggle buzzer. To do that we need to mark it as pending in cm109_open(). Reported-and-tested-by: syzbot+150f793ac5bc18eee150@syzkaller.appspotmail.com Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2020-12-11	mtd: rawnand: xway: Do not force a particular software ECC engine	Miquel Raynal	1	-1/+3
	Originally, commit d7157ff49a5b ("mtd: rawnand: Use the ECC framework user input parsing bits") kind of broke the logic around the initialization of several ECC engines. Unfortunately, the fix (which indeed moved the ECC initialization to the right place) did not take into account the fact that a different ECC algorithm could have been used thanks to a DT property, considering the "Hamming" algorithm entry a configuration while it was only a default. Add the necessary logic to be sure Hamming keeps being only a default. Fixes: d525914b5bd8 ("mtd: rawnand: xway: Move the ECC initialization to ->attach_chip()") Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20201203190340.15522-10-miquel.raynal@bootlin.com
2020-12-11	mtd: rawnand: socrates: Do not force a particular software ECC engine	Miquel Raynal	1	-1/+3
	Originally, commit d7157ff49a5b ("mtd: rawnand: Use the ECC framework user input parsing bits") kind of broke the logic around the initialization of several ECC engines. Unfortunately, the fix (which indeed moved the ECC initialization to the right place) did not take into account the fact that a different ECC algorithm could have been used thanks to a DT property, considering the "Hamming" algorithm entry a configuration while it was only a default. Add the necessary logic to be sure Hamming keeps being only a default. Fixes: b36bf0a0fe5d ("mtd: rawnand: socrates: Move the ECC initialization to ->attach_chip()") Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20201203190340.15522-9-miquel.raynal@bootlin.com
2020-12-11	mtd: rawnand: plat_nand: Do not force a particular software ECC engine	Miquel Raynal	1	-1/+3
	Originally, commit d7157ff49a5b ("mtd: rawnand: Use the ECC framework user input parsing bits") kind of broke the logic around the initialization of several ECC engines. Unfortunately, the fix (which indeed moved the ECC initialization to the right place) did not take into account the fact that a different ECC algorithm could have been used thanks to a DT property, considering the "Hamming" algorithm entry a configuration while it was only a default. Add the necessary logic to be sure Hamming keeps being only a default. Fixes: 612e048e6aab ("mtd: rawnand: plat_nand: Move the ECC initialization to ->attach_chip()") Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20201203190340.15522-8-miquel.raynal@bootlin.com
2020-12-11	mtd: rawnand: pasemi: Do not force a particular software ECC engine	Miquel Raynal	1	-1/+3
	Originally, commit d7157ff49a5b ("mtd: rawnand: Use the ECC framework user input parsing bits") kind of broke the logic around the initialization of several ECC engines. Unfortunately, the fix (which indeed moved the ECC initialization to the right place) did not take into account the fact that a different ECC algorithm could have been used thanks to a DT property, considering the "Hamming" algorithm entry a configuration while it was only a default. Add the necessary logic to be sure Hamming keeps being only a default. Fixes: 8fc6f1f042b2 ("mtd: rawnand: pasemi: Move the ECC initialization to ->attach_chip()") Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20201203190340.15522-7-miquel.raynal@bootlin.com
2020-12-11	mtd: rawnand: orion: Do not force a particular software ECC engine	Miquel Raynal	1	-1/+3
	Originally, commit d7157ff49a5b ("mtd: rawnand: Use the ECC framework user input parsing bits") kind of broke the logic around the initialization of several ECC engines. Unfortunately, the fix (which indeed moved the ECC initialization to the right place) did not take into account the fact that a different ECC algorithm could have been used thanks to a DT property, considering the "Hamming" algorithm entry a configuration while it was only a default. Add the necessary logic to be sure Hamming keeps being only a default. Reported-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Fixes: 553508cec2e8 ("mtd: rawnand: orion: Move the ECC initialization to ->attach_chip()") Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Tested-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Link: https://lore.kernel.org/linux-mtd/20201203190340.15522-6-miquel.raynal@bootlin.com
2020-12-11	mtd: rawnand: mpc5121: Do not force a particular software ECC engine	Miquel Raynal	1	-1/+3
	Originally, commit d7157ff49a5b ("mtd: rawnand: Use the ECC framework user input parsing bits") kind of broke the logic around the initialization of several ECC engines. Unfortunately, the fix (which indeed moved the ECC initialization to the right place) did not take into account the fact that a different ECC algorithm could have been used thanks to a DT property, considering the "Hamming" algorithm entry a configuration while it was only a default. Add the necessary logic to be sure Hamming keeps being only a default. Fixes: 6dd09f775b72 ("mtd: rawnand: mpc5121: Move the ECC initialization to ->attach_chip()") Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20201203190340.15522-5-miquel.raynal@bootlin.com
2020-12-11	mtd: rawnand: gpio: Do not force a particular software ECC engine	Miquel Raynal	1	-1/+3
	Originally, commit d7157ff49a5b ("mtd: rawnand: Use the ECC framework user input parsing bits") kind of broke the logic around the initialization of several ECC engines. Unfortunately, the fix (which indeed moved the ECC initialization to the right place) did not take into account the fact that a different ECC algorithm could have been used thanks to a DT property, considering the "Hamming" algorithm entry a configuration while it was only a default. Add the necessary logic to be sure Hamming keeps being only a default. Fixes: f6341f6448e0 ("mtd: rawnand: gpio: Move the ECC initialization to ->attach_chip()") Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20201203190340.15522-4-miquel.raynal@bootlin.com
2020-12-11	mtd: rawnand: au1550: Do not force a particular software ECC engine	Miquel Raynal	1	-1/+3
	Originally, commit d7157ff49a5b ("mtd: rawnand: Use the ECC framework user input parsing bits") kind of broke the logic around the initialization of several ECC engines. Unfortunately, the fix (which indeed moved the ECC initialization to the right place) did not take into account the fact that a different ECC algorithm could have been used thanks to a DT property, considering the "Hamming" algorithm entry a configuration while it was only a default. Add the necessary logic to be sure Hamming keeps being only a default. Fixes: dbffc8ccdf3a ("mtd: rawnand: au1550: Move the ECC initialization to ->attach_chip()") Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20201203190340.15522-3-miquel.raynal@bootlin.com
2020-12-11	mtd: rawnand: ams-delta: Do not force a particular software ECC engine	Miquel Raynal	1	-1/+3
	Originally, commit d7157ff49a5b ("mtd: rawnand: Use the ECC framework user input parsing bits") kind of broke the logic around the initialization of several ECC engines. Unfortunately, the fix (which indeed moved the ECC initialization to the right place) did not take into account the fact that a different ECC algorithm could have been used thanks to a DT property, considering the "Hamming" algorithm entry a configuration while it was only a default. Add the necessary logic to be sure Hamming keeps being only a default. Fixes: 59d93473323a ("mtd: rawnand: ams-delta: Move the ECC initialization to ->attach_chip()") Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20201203190340.15522-2-miquel.raynal@bootlin.com
2020-12-11	Revert "scsi: storvsc: Validate length of incoming packet in ↵	Andrea Parri (Microsoft)	1	-5/+0
	storvsc_on_channel_callback()" This reverts commit 3b8c72d076c42bf27284cda7b2b2b522810686f8. Dexuan reported a regression where StorVSC fails to probe a device (and where, consequently, the VM may fail to boot). The root-cause analysis led to a long-standing race condition that is exposed by the validation /commit in question. Let's put the new validation aside until a proper solution for that race condition is in place. Link: https://lore.kernel.org/r/20201211131404.21359-1-parri.andrea@gmail.com Fixes: 3b8c72d076c4 ("scsi: storvsc: Validate length of incoming packet in storvsc_on_channel_callback()") Cc: Dexuan Cui <decui@microsoft.com> Cc: "James E.J. Bottomley" <jejb@linux.ibm.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: linux-scsi@vger.kernel.org Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-12-11	RISC-V: Define get_cycles64() regardless of M-mode	Palmer Dabbelt	1	-2/+2
	The timer driver uses get_cycles64() unconditionally to obtain the current time. A recent refactoring lost the common definition for some configs, which is now the only one we need. Fixes: d5be89a8d118 ("RISC-V: Resurrect the MMIO timer implementation for M-mode systems") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2020-12-11	drm/i915/display: Go softly softly on initial modeset failure	Chris Wilson	1	-1/+1
	Reduce the module/device probe error into a mere debug to hide issues where the initial modeset is failing (after lies told by hw probe) and the system hangs with a livelock in cleaning up the failed commit. Reported-by: H.J. Lu <hjl.tools@gmail.com> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=210619 Fixes: b3bf99daaee9 ("drm/i915/display: Defer initial modeset until after GGTT is initialised") Fixes: ccc9e67ab26f ("drm/i915/display: Defer initial modeset until after GGTT is initialised") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: "Ville Syrjälä" <ville.syrjala@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: H.J. Lu <hjl.tools@gmail.com> Cc: Dave Airlie <airlied@redhat.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201210230741.17140-1-chris@chris-wilson.co.uk
2020-12-10	x86/apic/vector: Fix ordering in vector assignment	Thomas Gleixner	1	-10/+14
	Prarit reported that depending on the affinity setting the ' irq $N: Affinity broken due to vector space exhaustion.' message is showing up in dmesg, but the vector space on the CPUs in the affinity mask is definitely not exhausted. Shung-Hsi provided traces and analysis which pinpoints the problem: The ordering of trying to assign an interrupt vector in assign_irq_vector_any_locked() is simply wrong if the interrupt data has a valid node assigned. It does: 1) Try the intersection of affinity mask and node mask 2) Try the node mask 3) Try the full affinity mask 4) Try the full online mask Obviously #2 and #3 are in the wrong order as the requested affinity mask has to take precedence. In the observed cases #1 failed because the affinity mask did not contain CPUs from node 0. That made it allocate a vector from node 0, thereby breaking affinity and emitting the misleading message. Revert the order of #2 and #3 so the full affinity mask without the node intersection is tried before actually affinity is broken. If no node is assigned then only the full affinity mask and if that fails the full online mask is tried. Fixes: d6ffc6ac83b1 ("x86/vector: Respect affinity mask in irq descriptor") Reported-by: Prarit Bhargava <prarit@redhat.com> Reported-by: Shung-Hsi Yu <shung-hsi.yu@suse.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Shung-Hsi Yu <shung-hsi.yu@suse.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/87ft4djtyp.fsf@nanos.tec.linutronix.de
2020-12-10	NFS: Disable READ_PLUS by default	Anna Schumaker	2	-1/+10
	We've been seeing failures with xfstests generic/091 and generic/263 when using READ_PLUS. I've made some progress on these issues, and the tests fail later on but still don't pass. Let's disable READ_PLUS by default until we can work out what is going on. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>