| Commit message (Collapse) | Author | Files | Lines |
|
Commit 3ee34eabac2a ("lib/stackdepot: fix first entry having a 0-handle")
changed the meaning of the pool_index field to mean "the pool index plus
1". This made the code accessing this field less self-documenting, as
well as causing debuggers such as drgn to not be able to easily remain
compatible with both old and new kernels, because they typically do that
by testing for presence of the new field. Because stackdepot is a
debugging tool, we should make sure that it is debugger friendly.
Therefore, give the field a different name to improve readability as well
as enabling debugger backwards compatibility.
This is needed in 6.9, which would otherwise become an odd release with
the new semantics and old name so debuggers wouldn't recognize the new
semantics there.
Fixes: 3ee34eabac2a ("lib/stackdepot: fix first entry having a 0-handle")
Link: https://lkml.kernel.org/r/20240402001500.53533-1-pcc@google.com
Link: https://linux-review.googlesource.com/id/Ib3e70c36c1d230dd0a118dc22649b33e768b9f88
Signed-off-by: Peter Collingbourne <pcc@google.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Alexander Potapenko <glider@google.com>
Acked-by: Marco Elver <elver@google.com>
Acked-by: Oscar Salvador <osalvador@suse.de>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Omar Sandoval <osandov@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
PAT handling won't do the right thing in COW mappings: the first PTE (or,
in fact, all PTEs) can be replaced during write faults to point at anon
folios. Reliably recovering the correct PFN and cachemode using
follow_phys() from PTEs will not work in COW mappings.
Using follow_phys(), we might just get the address+protection of the anon
folio (which is very wrong), or fail on swap/nonswap entries, failing
follow_phys() and triggering a WARN_ON_ONCE() in untrack_pfn() and
track_pfn_copy(), not properly calling free_pfn_range().
In free_pfn_range(), we either wouldn't call memtype_free() or would call
it with the wrong range, possibly leaking memory.
To fix that, let's update follow_phys() to refuse returning anon folios,
and fallback to using the stored PFN inside vma->vm_pgoff for COW mappings
if we run into that.
We will now properly handle untrack_pfn() with COW mappings, where we
don't need the cachemode. We'll have to fail fork()->track_pfn_copy() if
the first page was replaced by an anon folio, though: we'd have to store
the cachemode in the VMA to make this work, likely growing the VMA size.
For now, lets keep it simple and let track_pfn_copy() just fail in that
case: it would have failed in the past with swap/nonswap entries already,
and it would have done the wrong thing with anon folios.
Simple reproducer to trigger the WARN_ON_ONCE() in untrack_pfn():
<--- C reproducer --->
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>
#include <liburing.h>
int main(void)
{
struct io_uring_params p = {};
int ring_fd;
size_t size;
char *map;
ring_fd = io_uring_setup(1, &p);
if (ring_fd < 0) {
perror("io_uring_setup");
return 1;
}
size = p.sq_off.array + p.sq_entries * sizeof(unsigned);
/* Map the submission queue ring MAP_PRIVATE */
map = mmap(0, size, PROT_READ | PROT_WRITE, MAP_PRIVATE,
ring_fd, IORING_OFF_SQ_RING);
if (map == MAP_FAILED) {
perror("mmap");
return 1;
}
/* We have at least one page. Let's COW it. */
*map = 0;
pause();
return 0;
}
<--- C reproducer --->
On a system with 16 GiB RAM and swap configured:
# ./iouring &
# memhog 16G
# killall iouring
[ 301.552930] ------------[ cut here ]------------
[ 301.553285] WARNING: CPU: 7 PID: 1402 at arch/x86/mm/pat/memtype.c:1060 untrack_pfn+0xf4/0x100
[ 301.553989] Modules linked in: binfmt_misc nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_g
[ 301.558232] CPU: 7 PID: 1402 Comm: iouring Not tainted 6.7.5-100.fc38.x86_64 #1
[ 301.558772] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebu4
[ 301.559569] RIP: 0010:untrack_pfn+0xf4/0x100
[ 301.559893] Code: 75 c4 eb cf 48 8b 43 10 8b a8 e8 00 00 00 3b 6b 28 74 b8 48 8b 7b 30 e8 ea 1a f7 000
[ 301.561189] RSP: 0018:ffffba2c0377fab8 EFLAGS: 00010282
[ 301.561590] RAX: 00000000ffffffea RBX: ffff9208c8ce9cc0 RCX: 000000010455e047
[ 301.562105] RDX: 07fffffff0eb1e0a RSI: 0000000000000000 RDI: ffff9208c391d200
[ 301.562628] RBP: 0000000000000000 R08: ffffba2c0377fab8 R09: 0000000000000000
[ 301.563145] R10: ffff9208d2292d50 R11: 0000000000000002 R12: 00007fea890e0000
[ 301.563669] R13: 0000000000000000 R14: ffffba2c0377fc08 R15: 0000000000000000
[ 301.564186] FS: 0000000000000000(0000) GS:ffff920c2fbc0000(0000) knlGS:0000000000000000
[ 301.564773] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 301.565197] CR2: 00007fea88ee8a20 CR3: 00000001033a8000 CR4: 0000000000750ef0
[ 301.565725] PKRU: 55555554
[ 301.565944] Call Trace:
[ 301.566148] <TASK>
[ 301.566325] ? untrack_pfn+0xf4/0x100
[ 301.566618] ? __warn+0x81/0x130
[ 301.566876] ? untrack_pfn+0xf4/0x100
[ 301.567163] ? report_bug+0x171/0x1a0
[ 301.567466] ? handle_bug+0x3c/0x80
[ 301.567743] ? exc_invalid_op+0x17/0x70
[ 301.568038] ? asm_exc_invalid_op+0x1a/0x20
[ 301.568363] ? untrack_pfn+0xf4/0x100
[ 301.568660] ? untrack_pfn+0x65/0x100
[ 301.568947] unmap_single_vma+0xa6/0xe0
[ 301.569247] unmap_vmas+0xb5/0x190
[ 301.569532] exit_mmap+0xec/0x340
[ 301.569801] __mmput+0x3e/0x130
[ 301.570051] do_exit+0x305/0xaf0
...
Link: https://lkml.kernel.org/r/20240403212131.929421-3-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reported-by: Wupeng Ma <mawupeng1@huawei.com>
Closes: https://lkml.kernel.org/r/20240227122814.3781907-1-mawupeng1@huawei.com
Fixes: b1a86e15dc03 ("x86, pat: remove the dependency on 'vm_pgoff' in track/untrack pfn vma routines")
Fixes: 5899329b1910 ("x86: PAT: implement track/untrack of pfnmap regions for x86 - v3")
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Update all remaining vmware.com email addresses to actual broadcom.com.
Add corresponding .mailmap entries for maintainers who contributed in the
past as the vmware.com address will start bouncing soon.
Maintainership update. Jeff Sipek has left VMware, Nick Shi will be
maintaining VMware PTP.
Link: https://lkml.kernel.org/r/20240402232334.33167-1-alexey.makhalov@broadcom.com
Signed-off-by: Alexey Makhalov <alexey.makhalov@broadcom.com>
Acked-by: Florian Fainelli <florian.fainelli@broadcom.com>
Acked-by: Ajay Kaher <ajay.kaher@broadcom.com>
Acked-by: Ronak Doshi <ronak.doshi@broadcom.com>
Acked-by: Nick Shi <nick.shi@broadcom.com>
Acked-by: Bryan Tan <bryan-bt.tan@broadcom.com>
Acked-by: Vishnu Dasa <vishnu.dasa@broadcom.com>
Acked-by: Vishal Bhakta <vishal.bhakta@broadcom.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Got a compilation error on Android for ffsl after 91b80cc5b39f
("selftests: mm: fix map_hugetlb failure on 64K page size systems")
included vm_util.h.
Link: https://lkml.kernel.org/r/20240329185814.16304-1-edliaw@google.com
Fixes: af605d26a8f2 ("selftests/mm: merge util.h into vm_util.h")
Signed-off-by: Edward Liaw <edliaw@google.com>
Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: "Mike Rapoport (IBM)" <rppt@kernel.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
A lockdep reports a possible deadlock in the find_vmap_area_exceed_addr_lock()
function:
============================================
WARNING: possible recursive locking detected
6.9.0-rc1-00060-ged3ccc57b108-dirty #6140 Not tainted
--------------------------------------------
drgn/455 is trying to acquire lock:
ffff0000c00131d0 (&vn->busy.lock/1){+.+.}-{2:2}, at: find_vmap_area_exceed_addr_lock+0x64/0x124
but task is already holding lock:
ffff0000c0011878 (&vn->busy.lock/1){+.+.}-{2:2}, at: find_vmap_area_exceed_addr_lock+0x64/0x124
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&vn->busy.lock/1);
lock(&vn->busy.lock/1);
*** DEADLOCK ***
indeed it can happen if the find_vmap_area_exceed_addr_lock() gets called
concurrently because it tries to acquire two nodes locks. It was done to
prevent removing a lowest VA found on a previous step.
To address this a lowest VA is found first without holding a node lock
where it resides. As a last step we check if a VA still there because it
can go away, if removed, proceed with next lowest.
[akpm@linux-foundation.org: fix comment typos, per Baoquan]
Link: https://lkml.kernel.org/r/20240328140330.4747-1-urezki@gmail.com
Fixes: 53becf32aec1 ("mm: vmalloc: support multiple nodes in vread_iter")
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Tested-by: Jens Axboe <axboe@kernel.dk>
Tested-by: Omar Sandoval <osandov@fb.com>
Reported-by: Jens Axboe <axboe@kernel.dk>
Cc: Baoquan He <bhe@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sony.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
During the boot the s390 system triggers "spinlock bad magic" messages
if the spinlock debugging is enabled:
[ 0.465445] BUG: spinlock bad magic on CPU#0, swapper/0
[ 0.465490] lock: single+0x1860/0x1958, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
[ 0.466067] CPU: 0 PID: 0 Comm: swapper Not tainted 6.8.0-12955-g8e938e398669 #1
[ 0.466188] Hardware name: QEMU 8561 QEMU (KVM/Linux)
[ 0.466270] Call Trace:
[ 0.466470] [<00000000011f26c8>] dump_stack_lvl+0x98/0xd8
[ 0.466516] [<00000000001dcc6a>] do_raw_spin_lock+0x8a/0x108
[ 0.466545] [<000000000042146c>] find_vmap_area+0x6c/0x108
[ 0.466572] [<000000000042175a>] find_vm_area+0x22/0x40
[ 0.466597] [<000000000012f152>] __set_memory+0x132/0x150
[ 0.466624] [<0000000001cc0398>] vmem_map_init+0x40/0x118
[ 0.466651] [<0000000001cc0092>] paging_init+0x22/0x68
[ 0.466677] [<0000000001cbbed2>] setup_arch+0x52a/0x708
[ 0.466702] [<0000000001cb6140>] start_kernel+0x80/0x5c8
[ 0.466727] [<0000000000100036>] startup_continue+0x36/0x40
it happens because such system tries to access some vmap areas
whereas the vmalloc initialization is not even yet done:
[ 0.465490] lock: single+0x1860/0x1958, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
[ 0.466067] CPU: 0 PID: 0 Comm: swapper Not tainted 6.8.0-12955-g8e938e398669 #1
[ 0.466188] Hardware name: QEMU 8561 QEMU (KVM/Linux)
[ 0.466270] Call Trace:
[ 0.466470] dump_stack_lvl (lib/dump_stack.c:117)
[ 0.466516] do_raw_spin_lock (kernel/locking/spinlock_debug.c:87 kernel/locking/spinlock_debug.c:115)
[ 0.466545] find_vmap_area (mm/vmalloc.c:1059 mm/vmalloc.c:2364)
[ 0.466572] find_vm_area (mm/vmalloc.c:3150)
[ 0.466597] __set_memory (arch/s390/mm/pageattr.c:360 arch/s390/mm/pageattr.c:393)
[ 0.466624] vmem_map_init (./arch/s390/include/asm/set_memory.h:55 arch/s390/mm/vmem.c:660)
[ 0.466651] paging_init (arch/s390/mm/init.c:97)
[ 0.466677] setup_arch (arch/s390/kernel/setup.c:972)
[ 0.466702] start_kernel (init/main.c:899)
[ 0.466727] startup_continue (arch/s390/kernel/head64.S:35)
[ 0.466811] INFO: lockdep is turned off.
...
[ 0.718250] vmalloc init - busy lock init 0000000002871860
[ 0.718328] vmalloc init - busy lock init 00000000028731b8
Some background. It worked before because the lock that is in question
was statically defined and initialized. As of now, the locks and data
structures are initialized in the vmalloc_init() function.
To address that issue add the check whether the "vmap_initialized"
variable is set, if not find_vmap_area() bails out on entry returning NULL.
Link: https://lkml.kernel.org/r/20240323141544.4150-1-urezki@gmail.com
Fixes: 72210662c5a2 ("mm: vmalloc: offload free_vmap_area_lock lock")
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Baoquan He <bhe@redhat.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sony.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
If a member of a cpio archive for an initrd or initrams is larger than
2Gb, we'll eventually fail to write to that file when we get to that
limit, unless O_LARGEFILE is set.
The problem can be seen with this recipe, assuming that BLK_DEV_RAM
is not configured:
cd /tmp
dd if=/dev/zero of=BIGFILE bs=1048576 count=2200
echo BIGFILE | cpio -o -H newc -R root:root > initrd.img
kexec -l /boot/vmlinuz-$(uname -r) --initrd=initrd.img --reuse-cmdline
kexec -e
The console will show 'Initramfs unpacking failed: write error'. With
the patch, the error is gone.
Link: https://lkml.kernel.org/r/20240323152934.3307391-1-jsperbeck@google.com
Signed-off-by: John Sperbeck <jsperbeck@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
folio_is_secretmem() currently relies on secretmem folios being LRU
folios, to save some cycles.
However, folios might reside in a folio batch without the LRU flag set, or
temporarily have their LRU flag cleared. Consequently, the LRU flag is
unreliable for this purpose.
In particular, this is the case when secretmem_fault() allocates a fresh
page and calls filemap_add_folio()->folio_add_lru(). The folio might be
added to the per-cpu folio batch and won't get the LRU flag set until the
batch was drained using e.g., lru_add_drain().
Consequently, folio_is_secretmem() might not detect secretmem folios and
GUP-fast can succeed in grabbing a secretmem folio, crashing the kernel
when we would later try reading/writing to the folio, because the folio
has been unmapped from the directmap.
Fix it by removing that unreliable check.
Link: https://lkml.kernel.org/r/20240326143210.291116-2-david@redhat.com
Fixes: 1507f51255c9 ("mm: introduce memfd_secret system call to create "secret" memory areas")
Signed-off-by: David Hildenbrand <david@redhat.com>
Reported-by: xingwei lee <xrivendell7@gmail.com>
Reported-by: yue sun <samsun1006219@gmail.com>
Closes: https://lore.kernel.org/lkml/CABOYnLyevJeravW=QrH0JUPYEcDN160aZFb7kwndm-J2rmz0HQ@mail.gmail.com/
Debugged-by: Miklos Szeredi <miklos@szeredi.hu>
Tested-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
list_del_init_careful() needs to be the last access to the wait queue
entry - it effectively unlocks access.
Previously, finish_wait() would see the empty list head and skip taking
the lock, and then we'd return - but the completion path would still
attempt to do the wakeup after the task_struct pointer had been
overwritten.
Fixes: 71eb6b6b0ba9 ("fs/aio: obey min_nr when doing wakeups")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/linux-fsdevel/CAHTA-ubfwwB51A5Wg5M6H_rPEQK9pNf8FkAGH=vr=FEkyRrtqw@mail.gmail.com/
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Link: https://lore.kernel.org/stable/20240331215212.522544-1-kent.overstreet%40linux.dev
Link: https://lore.kernel.org/r/20240331215212.522544-1-kent.overstreet@linux.dev
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Add CPUID_LNX_5 to track cpufeatures' word 21, and add the appropriate
compile-time assert in KVM to prevent direct lookups on the features in
CPUID_LNX_5. KVM uses X86_FEATURE_* flags to manage guest CPUID, and so
must translate features that are scattered by Linux from the Linux-defined
bit to the hardware-defined bit, i.e. should never try to directly access
scattered features in guest CPUID.
Opportunistically add NR_CPUID_WORDS to enum cpuid_leafs, along with a
compile-time assert in KVM's CPUID infrastructure to ensure that future
additions update cpuid_leafs along with NCAPINTS.
No functional change intended.
Fixes: 7f274e609f3d ("x86/cpufeatures: Add new word for scattered features")
Cc: Sandipan Das <sandipan.das@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
childregs represents the registers which are active for the new thread
in user context. For a kernel thread, childregs->gp is never used since
the kernel gp is not touched by switch_to. For a user mode helper, the
gp value can be observed in user space after execve or possibly by other
means.
[From the email thread]
The /* Kernel thread */ comment is somewhat inaccurate in that it is also used
for user_mode_helper threads, which exec a user process, e.g. /sbin/init or
when /proc/sys/kernel/core_pattern is a pipe. Such threads do not have
PF_KTHREAD set and are valid targets for ptrace etc. even before they exec.
childregs is the *user* context during syscall execution and it is observable
from userspace in at least five ways:
1. kernel_execve does not currently clear integer registers, so the starting
register state for PID 1 and other user processes started by the kernel has
sp = user stack, gp = kernel __global_pointer$, all other integer registers
zeroed by the memset in the patch comment.
This is a bug in its own right, but I'm unwilling to bet that it is the only
way to exploit the issue addressed by this patch.
2. ptrace(PTRACE_GETREGSET): you can PTRACE_ATTACH to a user_mode_helper thread
before it execs, but ptrace requires SIGSTOP to be delivered which can only
happen at user/kernel boundaries.
3. /proc/*/task/*/syscall: this is perfectly happy to read pt_regs for
user_mode_helpers before the exec completes, but gp is not one of the
registers it returns.
4. PERF_SAMPLE_REGS_USER: LOCKDOWN_PERF normally prevents access to kernel
addresses via PERF_SAMPLE_REGS_INTR, but due to this bug kernel addresses
are also exposed via PERF_SAMPLE_REGS_USER which is permitted under
LOCKDOWN_PERF. I have not attempted to write exploit code.
5. Much of the tracing infrastructure allows access to user registers. I have
not attempted to determine which forms of tracing allow access to user
registers without already allowing access to kernel registers.
Fixes: 7db91e57a0ac ("RISC-V: Task implementation")
Cc: stable@vger.kernel.org
Signed-off-by: Stefan O'Rear <sorear@fastmail.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240327061258.2370291-1-sorear@fastmail.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
patch_map() uses fixmap mappings to circumvent the non-writability of
the kernel text mapping.
The __set_fixmap() function only flushes the current cpu tlb, it does
not emit an IPI so we must make sure that while we use a fixmap mapping,
the current task is not migrated on another cpu which could miss the
newly introduced fixmap mapping.
So in order to avoid any task migration, disable the preemption.
Reported-by: Andrea Parri <andrea@rivosinc.com>
Closes: https://lore.kernel.org/all/ZcS+GAaM25LXsBOl@andrea/
Reported-by: Andy Chiu <andy.chiu@sifive.com>
Closes: https://lore.kernel.org/linux-riscv/CABgGipUMz3Sffu-CkmeUB1dKVwVQ73+7=sgC45-m0AE9RCjOZg@mail.gmail.com/
Fixes: cad539baa48f ("riscv: implement a memset like function for text")
Fixes: 0ff7c3b33127 ("riscv: Use text_mutex instead of patch_lock")
Co-developed-by: Andy Chiu <andy.chiu@sifive.com>
Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Acked-by: Puranjay Mohan <puranjay12@gmail.com>
Link: https://lore.kernel.org/r/20240326203017.310422-3-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
The following warning appears when using ftrace:
[89855.443413] RCU not on for: arch_cpu_idle+0x0/0x1c
[89855.445640] WARNING: CPU: 5 PID: 0 at include/linux/trace_recursion.h:162 arch_ftrace_ops_list_func+0x208/0x228
[89855.445824] Modules linked in: xt_conntrack(E) nft_chain_nat(E) xt_MASQUERADE(E) nf_conntrack_netlink(E) xt_addrtype(E) nft_compat(E) nf_tables(E) nfnetlink(E) br_netfilter(E) cfg80211(E) nls_iso8859_1(E) ofpart(E) redboot(E) cmdlinepart(E) cfi_cmdset_0001(E) virtio_net(E) cfi_probe(E) cfi_util(E) 9pnet_virtio(E) gen_probe(E) net_failover(E) virtio_rng(E) failover(E) 9pnet(E) physmap(E) map_funcs(E) chipreg(E) mtd(E) uio_pdrv_genirq(E) uio(E) dm_multipath(E) scsi_dh_rdac(E) scsi_dh_emc(E) scsi_dh_alua(E) drm(E) efi_pstore(E) backlight(E) ip_tables(E) x_tables(E) raid10(E) raid456(E) async_raid6_recov(E) async_memcpy(E) async_pq(E) async_xor(E) xor(E) async_tx(E) raid6_pq(E) raid1(E) raid0(E) virtio_blk(E)
[89855.451563] CPU: 5 PID: 0 Comm: swapper/5 Tainted: G E 6.8.0-rc6ubuntu-defconfig #2
[89855.451726] Hardware name: riscv-virtio,qemu (DT)
[89855.451899] epc : arch_ftrace_ops_list_func+0x208/0x228
[89855.452016] ra : arch_ftrace_ops_list_func+0x208/0x228
[89855.452119] epc : ffffffff8016b216 ra : ffffffff8016b216 sp : ffffaf808090fdb0
[89855.452171] gp : ffffffff827c7680 tp : ffffaf808089ad40 t0 : ffffffff800c0dd8
[89855.452216] t1 : 0000000000000001 t2 : 0000000000000000 s0 : ffffaf808090fe30
[89855.452306] s1 : 0000000000000000 a0 : 0000000000000026 a1 : ffffffff82cd6ac8
[89855.452423] a2 : ffffffff800458c8 a3 : ffffaf80b1870640 a4 : 0000000000000000
[89855.452646] a5 : 0000000000000000 a6 : 00000000ffffffff a7 : ffffffffffffffff
[89855.452698] s2 : ffffffff82766872 s3 : ffffffff80004caa s4 : ffffffff80ebea90
[89855.452743] s5 : ffffaf808089bd40 s6 : 8000000a00006e00 s7 : 0000000000000008
[89855.452787] s8 : 0000000000002000 s9 : 0000000080043700 s10: 0000000000000000
[89855.452831] s11: 0000000000000000 t3 : 0000000000100000 t4 : 0000000000000064
[89855.452874] t5 : 000000000000000c t6 : ffffaf80b182dbfc
[89855.452929] status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003
[89855.453053] [<ffffffff8016b216>] arch_ftrace_ops_list_func+0x208/0x228
[89855.453191] [<ffffffff8000e082>] ftrace_call+0x8/0x22
[89855.453265] [<ffffffff800a149c>] do_idle+0x24c/0x2ca
[89855.453357] [<ffffffff8000da54>] return_to_handler+0x0/0x26
[89855.453429] [<ffffffff8000b716>] smp_callin+0x92/0xb6
[89855.453785] ---[ end trace 0000000000000000 ]---
To fix this, mark arch_cpu_idle() as noinstr, like it is done in commit
a9cbc1b471d2 ("s390/idle: mark arch_cpu_idle() noinstr").
Reported-by: Evgenii Shatokhin <e.shatokhin@yadro.com>
Closes: https://lore.kernel.org/linux-riscv/51f21b87-ebed-4411-afbc-c00d3dea2bab@yadro.com/
Fixes: cfbc4f81c9d0 ("riscv: Select ARCH_WANTS_NO_INSTR")
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Andy Chiu <andy.chiu@sifive.com>
Tested-by: Andy Chiu <andy.chiu@sifive.com>
Acked-by: Puranjay Mohan <puranjay12@gmail.com>
Link: https://lore.kernel.org/r/20240326203017.310422-2-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Print the instruction dump with info instead of emergency level. The
unhandled signal message is only for informational purpose.
Fixes: b8a03a634129 ("riscv: add userland instruction dump to RISC-V splats")
Signed-off-by: Andreas Schwab <schwab@suse.de>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Reviewed-by: Yunhui Cui <cuiyunhui@bytedance.com>
Link: https://lore.kernel.org/r/mvmy1aegrhm.fsf@suse.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
In cases where the sof driver is unable to find the firmware and/or
topology file [1], it exits without releasing the i915 runtime
pm wakeref [2]. This results in dmesg warnings[3] during
suspend/resume or driver unbind. Add remove_late() to the failure path
of sof_init_environment so that i915 wakeref is released appropriately
[1]
[ 8.990366] sof-audio-pci-intel-mtl 0000:00:1f.3: SOF firmware and/or topology file not found.
[ 8.990396] sof-audio-pci-intel-mtl 0000:00:1f.3: Supported default profiles
[ 8.990398] sof-audio-pci-intel-mtl 0000:00:1f.3: - ipc type 1 (Requested):
[ 8.990399] sof-audio-pci-intel-mtl 0000:00:1f.3: Firmware file: intel/sof-ipc4/mtl/sof-mtl.ri
[ 8.990401] sof-audio-pci-intel-mtl 0000:00:1f.3: Topology file: intel/sof-ace-tplg/sof-mtl-rt711-2ch.tplg
[ 8.990402] sof-audio-pci-intel-mtl 0000:00:1f.3: Check if you have 'sof-firmware' package installed.
[ 8.990403] sof-audio-pci-intel-mtl 0000:00:1f.3: Optionally it can be manually downloaded from:
[ 8.990404] sof-audio-pci-intel-mtl 0000:00:1f.3: https://github.com/thesofproject/sof-bin/
[ 8.999088] sof-audio-pci-intel-mtl 0000:00:1f.3: error: sof_probe_work failed err: -2
[2]
ref_tracker: 0000:00:02.0@ffff9b8511b6a378 has 1/5 users at
track_intel_runtime_pm_wakeref.part.0+0x36/0x70 [i915]
__intel_runtime_pm_get+0x51/0xb0 [i915]
intel_runtime_pm_get+0x17/0x20 [i915]
intel_display_power_get+0x2f/0x70 [i915]
i915_audio_component_get_power+0x23/0x120 [i915]
snd_hdac_display_power+0x89/0x130 [snd_hda_core]
hda_codec_i915_init+0x3f/0x50 [snd_sof_intel_hda]
hda_dsp_probe_early+0x170/0x250 [snd_sof_intel_hda_common]
snd_sof_device_probe+0x224/0x320 [snd_sof]
sof_pci_probe+0x15b/0x220 [snd_sof_pci]
hda_pci_intel_probe+0x30/0x70 [snd_sof_intel_hda_common]
local_pci_probe+0x4c/0xb0
pci_device_probe+0xcc/0x250
really_probe+0x18e/0x420
__driver_probe_device+0x7e/0x170
driver_probe_device+0x23/0xa0
[3]
[ 484.105070] ------------[ cut here ]------------
[ 484.108238] thunderbolt 0000:00:0d.2: PM: pci_pm_suspend_late+0x0/0x50 returned 0 after 0 usecs
[ 484.117106] i915 0000:00:02.0: i915 raw-wakerefs=1 wakelocks=1 on cleanup
[ 484.792005] WARNING: CPU: 2 PID: 2405 at drivers/gpu/drm/i915/intel_runtime_pm.c:444 intel_runtime_pm_driver_release+0x6c/0x80
Tested-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com>
Reviewed-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Signed-off-by: Chaitanya Kumar Borah <chaitanya.kumar.borah@intel.com>
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://github.com/thesofproject/linux/pull/4878
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://msgid.link/r/20240404184813.134566-1-pierre-louis.bossart@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Commit 1b600da51073 ("PM: EM: Optimize em_cpu_energy() and remove division")
has added back map_util_perf() in em_cpu_energy() computation which has
been removed with the rework of scheduler/cpufreq interface.
This is wrong because sugov_effective_cpu_perf() already takes care of
mapping the utilization to a performance level.
Fixes: 1b600da51073 ("PM: EM: Optimize em_cpu_energy() and remove division")
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
When adding sanitization of the label, the path through
edge_detector_setup() that leads to debounce_setup() was overlooked.
A request taking this path does not allocate a new label and the
request label is freed twice when the request is released, resulting
in memory corruption.
Add label sanitization to debounce_setup().
Cc: stable@vger.kernel.org
Fixes: b34490879baa ("gpio: cdev: sanitize the label before requesting the interrupt")
Signed-off-by: Kent Gibson <warthog618@gmail.com>
[Bartosz: rebased on top of the fix for empty GPIO labels]
Co-developed-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
|
|
I got multiple syzbot reports showing old bugs exposed
by BPF after commit 20f2505fb436 ("bpf: Try to avoid kzalloc
in cgroup/{s,g}etsockopt")
setsockopt() @optlen argument should be taken into account
before copying data.
BUG: KASAN: slab-out-of-bounds in copy_from_sockptr_offset include/linux/sockptr.h:49 [inline]
BUG: KASAN: slab-out-of-bounds in copy_from_sockptr include/linux/sockptr.h:55 [inline]
BUG: KASAN: slab-out-of-bounds in do_replace net/ipv4/netfilter/ip_tables.c:1111 [inline]
BUG: KASAN: slab-out-of-bounds in do_ipt_set_ctl+0x902/0x3dd0 net/ipv4/netfilter/ip_tables.c:1627
Read of size 96 at addr ffff88802cd73da0 by task syz-executor.4/7238
CPU: 1 PID: 7238 Comm: syz-executor.4 Not tainted 6.9.0-rc2-next-20240403-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
print_address_description mm/kasan/report.c:377 [inline]
print_report+0x169/0x550 mm/kasan/report.c:488
kasan_report+0x143/0x180 mm/kasan/report.c:601
kasan_check_range+0x282/0x290 mm/kasan/generic.c:189
__asan_memcpy+0x29/0x70 mm/kasan/shadow.c:105
copy_from_sockptr_offset include/linux/sockptr.h:49 [inline]
copy_from_sockptr include/linux/sockptr.h:55 [inline]
do_replace net/ipv4/netfilter/ip_tables.c:1111 [inline]
do_ipt_set_ctl+0x902/0x3dd0 net/ipv4/netfilter/ip_tables.c:1627
nf_setsockopt+0x295/0x2c0 net/netfilter/nf_sockopt.c:101
do_sock_setsockopt+0x3af/0x720 net/socket.c:2311
__sys_setsockopt+0x1ae/0x250 net/socket.c:2334
__do_sys_setsockopt net/socket.c:2343 [inline]
__se_sys_setsockopt net/socket.c:2340 [inline]
__x64_sys_setsockopt+0xb5/0xd0 net/socket.c:2340
do_syscall_64+0xfb/0x240
entry_SYSCALL_64_after_hwframe+0x72/0x7a
RIP: 0033:0x7fd22067dde9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 e1 20 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fd21f9ff0c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000036
RAX: ffffffffffffffda RBX: 00007fd2207abf80 RCX: 00007fd22067dde9
RDX: 0000000000000040 RSI: 0000000000000000 RDI: 0000000000000003
RBP: 00007fd2206ca47a R08: 0000000000000001 R09: 0000000000000000
R10: 0000000020000880 R11: 0000000000000246 R12: 0000000000000000
R13: 000000000000000b R14: 00007fd2207abf80 R15: 00007ffd2d0170d8
</TASK>
Allocated by task 7238:
kasan_save_stack mm/kasan/common.c:47 [inline]
kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
poison_kmalloc_redzone mm/kasan/common.c:370 [inline]
__kasan_kmalloc+0x98/0xb0 mm/kasan/common.c:387
kasan_kmalloc include/linux/kasan.h:211 [inline]
__do_kmalloc_node mm/slub.c:4069 [inline]
__kmalloc_noprof+0x200/0x410 mm/slub.c:4082
kmalloc_noprof include/linux/slab.h:664 [inline]
__cgroup_bpf_run_filter_setsockopt+0xd47/0x1050 kernel/bpf/cgroup.c:1869
do_sock_setsockopt+0x6b4/0x720 net/socket.c:2293
__sys_setsockopt+0x1ae/0x250 net/socket.c:2334
__do_sys_setsockopt net/socket.c:2343 [inline]
__se_sys_setsockopt net/socket.c:2340 [inline]
__x64_sys_setsockopt+0xb5/0xd0 net/socket.c:2340
do_syscall_64+0xfb/0x240
entry_SYSCALL_64_after_hwframe+0x72/0x7a
The buggy address belongs to the object at ffff88802cd73da0
which belongs to the cache kmalloc-8 of size 8
The buggy address is located 0 bytes inside of
allocated 1-byte region [ffff88802cd73da0, ffff88802cd73da1)
The buggy address belongs to the physical page:
page: refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff88802cd73020 pfn:0x2cd73
flags: 0xfff80000000000(node=0|zone=1|lastcpupid=0xfff)
page_type: 0xffffefff(slab)
raw: 00fff80000000000 ffff888015041280 dead000000000100 dead000000000122
raw: ffff88802cd73020 000000008080007f 00000001ffffefff 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 0, migratetype Unmovable, gfp_mask 0x12cc0(GFP_KERNEL|__GFP_NOWARN|__GFP_NORETRY), pid 5103, tgid 2119833701 (syz-executor.4), ts 5103, free_ts 70804600828
set_page_owner include/linux/page_owner.h:32 [inline]
post_alloc_hook+0x1f3/0x230 mm/page_alloc.c:1490
prep_new_page mm/page_alloc.c:1498 [inline]
get_page_from_freelist+0x2e7e/0x2f40 mm/page_alloc.c:3454
__alloc_pages_noprof+0x256/0x6c0 mm/page_alloc.c:4712
__alloc_pages_node_noprof include/linux/gfp.h:244 [inline]
alloc_pages_node_noprof include/linux/gfp.h:271 [inline]
alloc_slab_page+0x5f/0x120 mm/slub.c:2249
allocate_slab+0x5a/0x2e0 mm/slub.c:2412
new_slab mm/slub.c:2465 [inline]
___slab_alloc+0xcd1/0x14b0 mm/slub.c:3615
__slab_alloc+0x58/0xa0 mm/slub.c:3705
__slab_alloc_node mm/slub.c:3758 [inline]
slab_alloc_node mm/slub.c:3936 [inline]
__do_kmalloc_node mm/slub.c:4068 [inline]
kmalloc_node_track_caller_noprof+0x286/0x450 mm/slub.c:4089
kstrdup+0x3a/0x80 mm/util.c:62
device_rename+0xb5/0x1b0 drivers/base/core.c:4558
dev_change_name+0x275/0x860 net/core/dev.c:1232
do_setlink+0xa4b/0x41f0 net/core/rtnetlink.c:2864
__rtnl_newlink net/core/rtnetlink.c:3680 [inline]
rtnl_newlink+0x180b/0x20a0 net/core/rtnetlink.c:3727
rtnetlink_rcv_msg+0x89b/0x10d0 net/core/rtnetlink.c:6594
netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2559
netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1361
page last free pid 5146 tgid 5146 stack trace:
reset_page_owner include/linux/page_owner.h:25 [inline]
free_pages_prepare mm/page_alloc.c:1110 [inline]
free_unref_page+0xd3c/0xec0 mm/page_alloc.c:2617
discard_slab mm/slub.c:2511 [inline]
__put_partials+0xeb/0x130 mm/slub.c:2980
put_cpu_partial+0x17c/0x250 mm/slub.c:3055
__slab_free+0x2ea/0x3d0 mm/slub.c:4254
qlink_free mm/kasan/quarantine.c:163 [inline]
qlist_free_all+0x9e/0x140 mm/kasan/quarantine.c:179
kasan_quarantine_reduce+0x14f/0x170 mm/kasan/quarantine.c:286
__kasan_slab_alloc+0x23/0x80 mm/kasan/common.c:322
kasan_slab_alloc include/linux/kasan.h:201 [inline]
slab_post_alloc_hook mm/slub.c:3888 [inline]
slab_alloc_node mm/slub.c:3948 [inline]
__do_kmalloc_node mm/slub.c:4068 [inline]
__kmalloc_node_noprof+0x1d7/0x450 mm/slub.c:4076
kmalloc_node_noprof include/linux/slab.h:681 [inline]
kvmalloc_node_noprof+0x72/0x190 mm/util.c:634
bucket_table_alloc lib/rhashtable.c:186 [inline]
rhashtable_rehash_alloc+0x9e/0x290 lib/rhashtable.c:367
rht_deferred_worker+0x4e1/0x2440 lib/rhashtable.c:427
process_one_work kernel/workqueue.c:3218 [inline]
process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3299
worker_thread+0x86d/0xd70 kernel/workqueue.c:3380
kthread+0x2f0/0x390 kernel/kthread.c:388
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:243
Memory state around the buggy address:
ffff88802cd73c80: 07 fc fc fc 05 fc fc fc 05 fc fc fc fa fc fc fc
ffff88802cd73d00: fa fc fc fc fa fc fc fc fa fc fc fc fa fc fc fc
>ffff88802cd73d80: fa fc fc fc 01 fc fc fc fa fc fc fc fa fc fc fc
^
ffff88802cd73e00: fa fc fc fc fa fc fc fc 05 fc fc fc 07 fc fc fc
ffff88802cd73e80: 07 fc fc fc 07 fc fc fc 07 fc fc fc 07 fc fc fc
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
Link: https://lore.kernel.org/r/20240404122051.2303764-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
syzbot found that tcf_skbmod_dump() was copying four bytes
from kernel stack to user space [1].
The issue here is that 'struct tc_skbmod' has a four bytes hole.
We need to clear the structure before filling fields.
[1]
BUG: KMSAN: kernel-infoleak in instrument_copy_to_user include/linux/instrumented.h:114 [inline]
BUG: KMSAN: kernel-infoleak in copy_to_user_iter lib/iov_iter.c:24 [inline]
BUG: KMSAN: kernel-infoleak in iterate_ubuf include/linux/iov_iter.h:29 [inline]
BUG: KMSAN: kernel-infoleak in iterate_and_advance2 include/linux/iov_iter.h:245 [inline]
BUG: KMSAN: kernel-infoleak in iterate_and_advance include/linux/iov_iter.h:271 [inline]
BUG: KMSAN: kernel-infoleak in _copy_to_iter+0x366/0x2520 lib/iov_iter.c:185
instrument_copy_to_user include/linux/instrumented.h:114 [inline]
copy_to_user_iter lib/iov_iter.c:24 [inline]
iterate_ubuf include/linux/iov_iter.h:29 [inline]
iterate_and_advance2 include/linux/iov_iter.h:245 [inline]
iterate_and_advance include/linux/iov_iter.h:271 [inline]
_copy_to_iter+0x366/0x2520 lib/iov_iter.c:185
copy_to_iter include/linux/uio.h:196 [inline]
simple_copy_to_iter net/core/datagram.c:532 [inline]
__skb_datagram_iter+0x185/0x1000 net/core/datagram.c:420
skb_copy_datagram_iter+0x5c/0x200 net/core/datagram.c:546
skb_copy_datagram_msg include/linux/skbuff.h:4050 [inline]
netlink_recvmsg+0x432/0x1610 net/netlink/af_netlink.c:1962
sock_recvmsg_nosec net/socket.c:1046 [inline]
sock_recvmsg+0x2c4/0x340 net/socket.c:1068
__sys_recvfrom+0x35a/0x5f0 net/socket.c:2242
__do_sys_recvfrom net/socket.c:2260 [inline]
__se_sys_recvfrom net/socket.c:2256 [inline]
__x64_sys_recvfrom+0x126/0x1d0 net/socket.c:2256
do_syscall_64+0xd5/0x1f0
entry_SYSCALL_64_after_hwframe+0x6d/0x75
Uninit was stored to memory at:
pskb_expand_head+0x30f/0x19d0 net/core/skbuff.c:2253
netlink_trim+0x2c2/0x330 net/netlink/af_netlink.c:1317
netlink_unicast+0x9f/0x1260 net/netlink/af_netlink.c:1351
nlmsg_unicast include/net/netlink.h:1144 [inline]
nlmsg_notify+0x21d/0x2f0 net/netlink/af_netlink.c:2610
rtnetlink_send+0x73/0x90 net/core/rtnetlink.c:741
rtnetlink_maybe_send include/linux/rtnetlink.h:17 [inline]
tcf_add_notify net/sched/act_api.c:2048 [inline]
tcf_action_add net/sched/act_api.c:2071 [inline]
tc_ctl_action+0x146e/0x19d0 net/sched/act_api.c:2119
rtnetlink_rcv_msg+0x1737/0x1900 net/core/rtnetlink.c:6595
netlink_rcv_skb+0x375/0x650 net/netlink/af_netlink.c:2559
rtnetlink_rcv+0x34/0x40 net/core/rtnetlink.c:6613
netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
netlink_unicast+0xf4c/0x1260 net/netlink/af_netlink.c:1361
netlink_sendmsg+0x10df/0x11f0 net/netlink/af_netlink.c:1905
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg+0x30f/0x380 net/socket.c:745
____sys_sendmsg+0x877/0xb60 net/socket.c:2584
___sys_sendmsg+0x28d/0x3c0 net/socket.c:2638
__sys_sendmsg net/socket.c:2667 [inline]
__do_sys_sendmsg net/socket.c:2676 [inline]
__se_sys_sendmsg net/socket.c:2674 [inline]
__x64_sys_sendmsg+0x307/0x4a0 net/socket.c:2674
do_syscall_64+0xd5/0x1f0
entry_SYSCALL_64_after_hwframe+0x6d/0x75
Uninit was stored to memory at:
__nla_put lib/nlattr.c:1041 [inline]
nla_put+0x1c6/0x230 lib/nlattr.c:1099
tcf_skbmod_dump+0x23f/0xc20 net/sched/act_skbmod.c:256
tcf_action_dump_old net/sched/act_api.c:1191 [inline]
tcf_action_dump_1+0x85e/0x970 net/sched/act_api.c:1227
tcf_action_dump+0x1fd/0x460 net/sched/act_api.c:1251
tca_get_fill+0x519/0x7a0 net/sched/act_api.c:1628
tcf_add_notify_msg net/sched/act_api.c:2023 [inline]
tcf_add_notify net/sched/act_api.c:2042 [inline]
tcf_action_add net/sched/act_api.c:2071 [inline]
tc_ctl_action+0x1365/0x19d0 net/sched/act_api.c:2119
rtnetlink_rcv_msg+0x1737/0x1900 net/core/rtnetlink.c:6595
netlink_rcv_skb+0x375/0x650 net/netlink/af_netlink.c:2559
rtnetlink_rcv+0x34/0x40 net/core/rtnetlink.c:6613
netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
netlink_unicast+0xf4c/0x1260 net/netlink/af_netlink.c:1361
netlink_sendmsg+0x10df/0x11f0 net/netlink/af_netlink.c:1905
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg+0x30f/0x380 net/socket.c:745
____sys_sendmsg+0x877/0xb60 net/socket.c:2584
___sys_sendmsg+0x28d/0x3c0 net/socket.c:2638
__sys_sendmsg net/socket.c:2667 [inline]
__do_sys_sendmsg net/socket.c:2676 [inline]
__se_sys_sendmsg net/socket.c:2674 [inline]
__x64_sys_sendmsg+0x307/0x4a0 net/socket.c:2674
do_syscall_64+0xd5/0x1f0
entry_SYSCALL_64_after_hwframe+0x6d/0x75
Local variable opt created at:
tcf_skbmod_dump+0x9d/0xc20 net/sched/act_skbmod.c:244
tcf_action_dump_old net/sched/act_api.c:1191 [inline]
tcf_action_dump_1+0x85e/0x970 net/sched/act_api.c:1227
Bytes 188-191 of 248 are uninitialized
Memory access of size 248 starts at ffff888117697680
Data copied to user address 00007ffe56d855f0
Fixes: 86da71b57383 ("net_sched: Introduce skbmod action")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://lore.kernel.org/r/20240403130908.93421-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
After the commit d2689b6a86b9 ("net: usb: ax88179_178a: avoid two
consecutive device resets"), reset is not executed from bind operation and
mac address is not read from the device registers or the devicetree at that
moment. Since the check to configure if the assigned mac address is random
or not for the interface, happens after the bind operation from
usbnet_probe, the interface keeps configured as random address, although the
address is correctly read and set during open operation (the only reset
now).
In order to keep only one reset for the device and to avoid the interface
always configured as random address, after reset, configure correctly the
suitable field from the driver, if the mac address is read successfully from
the device registers or the devicetree. Take into account if a locally
administered address (random) was previously stored.
cc: stable@vger.kernel.org # 6.6+
Fixes: d2689b6a86b9 ("net: usb: ax88179_178a: avoid two consecutive device resets")
Reported-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Signed-off-by: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240403132158.344838-1-jtornosm@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Rename nvme_fc_nvme_ctrl_freed to nvme_fc_free_ctrl to match the name
pattern for the callback.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
|
|
The RCU lock is only needed for the lookup loop and not for
list_ad_tail_rcu call. Thus move it down the call chain into
nvmet_fc_assoc_exists.
While at it also fix the name typo of the function.
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
|
|
Unique discovery NQNs allow to differentiate between discovery
services from (typically physically separate) NVMe-oF subsystems.
This is required for establishing secured connections as otherwise
the credentials won't be unique and the integrity of the connection
cannot be guaranteed.
This patch adds a configfs attribute 'discovery_nqn' in the 'nvmet'
configfs directory to specify the unique discovery NQN.
Signed-off-by: Hannes Reinecke <hare@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
|
|
Apparently there are nvme controllers around that report namespaces
in the namespace list which have zero capacity. Return -ENXIO instead
of -ENODEV from nvme_update_ns_info_block so we don't create a hidden
multipath node for these namespaces but entirely ignore them.
Fixes: 46e7422cda84 ("nvme: move common logic into nvme_update_ns_info")
Reported-by: Nilay Shroff <nilay@linux.ibm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Nilay Shroff <nilay@linux.ibm.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
|
|
Modifying a MCA bank's MCA_CTL bits which control which error types to
be reported is done over
/sys/devices/system/machinecheck/
├── machinecheck0
│ ├── bank0
│ ├── bank1
│ ├── bank10
│ ├── bank11
...
sysfs nodes by writing the new bit mask of events to enable.
When the write is accepted, the kernel deletes all current timers and
reinits all banks.
Doing that in parallel can lead to initializing a timer which is already
armed and in the timer wheel, i.e., in use already:
ODEBUG: init active (active state 0) object: ffff888063a28000 object
type: timer_list hint: mce_timer_fn+0x0/0x240 arch/x86/kernel/cpu/mce/core.c:2642
WARNING: CPU: 0 PID: 8120 at lib/debugobjects.c:514
debug_print_object+0x1a0/0x2a0 lib/debugobjects.c:514
Fix that by grabbing the sysfs mutex as the rest of the MCA sysfs code
does.
Reported by: Yue Sun <samsun1006219@gmail.com>
Reported by: xingwei lee <xrivendell7@gmail.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: <stable@kernel.org>
Link: https://lore.kernel.org/r/CAEkJfYNiENwQY8yV1LYJ9LjJs%2Bx_-PqMv98gKig55=2vbzffRw@mail.gmail.com
|
|
We need to take into account that a line's consumer label may be NULL
and not try to kstrdup() it in that case but rather pass the NULL
pointer up the stack to the interrupt request function.
To that end: let make_irq_label() return NULL as a valid return value
and use ERR_PTR() instead to signal an allocation failure to callers.
Cc: stable@vger.kernel.org
Fixes: b34490879baa ("gpio: cdev: sanitize the label before requesting the interrupt")
Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Closes: https://lore.kernel.org/lkml/20240402093534.212283-1-naresh.kamboju@linaro.org/
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Tested-by: Anders Roxell <anders.roxell@linaro.org>
|
|
Preempt fences can sleep waiting for an exec queue suspend operation to
complete. If the system_unbound_wq is used for waiting and the number of
waiters exceeds max_active this will result in other users of the
system_unbound_wq getting starved. Use a device private work queue for
preempt fences to avoid starvation of the system_unbound_wq.
Even though suspend operations can complete out-of-order, all suspend
operations within a VM need to complete before the preempt rebind worker
can start. With that, use a device private ordered wq for preempt fence
waiting.
v2:
- Add comment about cleanup on failure (Matt R)
- Update commit message (Lucas)
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240401221913.139672-2-matthew.brost@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit 37c15c4aae1fe3f67efd2641db8d8c25c2d524ab)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
Rebinding might allocate page-table bos, causing evictions.
To support blocking locking during these evictions,
perform the rebinding in the drm_exec locking loop.
Also Reserve fence slots where actually needed rather than trying to
predict how many fence slots will be needed over a complete
wound-wait transaction.
v2:
- Remove a leftover call to xe_vm_rebind() (Matt Brost)
- Add a helper function xe_vm_validate_rebind() (Matt Brost)
v3:
- Add comments and squash with previous patch (Matt Brost)
Fixes: 24f947d58fe5 ("drm/xe: Use DRM GPUVM helpers for external- and evicted objects")
Fixes: 29f424eb8702 ("drm/xe/exec: move fence reservation")
Cc: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240327091136.3271-5-thomas.hellstrom@linux.intel.com
(cherry picked from commit 7ee7dd6f301341d5b1204fc19fa620d7f7f7e90d)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
They can actually complete out-of-order, so allocate a unique
fence context for each fence.
Fixes: 5387e865d90e ("drm/xe: Add TLB invalidation fence after rebinds issued from execs")
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240327091136.3271-4-thomas.hellstrom@linux.intel.com
(cherry picked from commit 0453f1757501df2e82b66b3183a24bba5a6f8fa3)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
Instead of handling the vm's rebind fence separately,
which is error prone if they are not strictly ordered,
attach rebind fences as kernel fences to the vm's resv.
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240327091136.3271-3-thomas.hellstrom@linux.intel.com
(cherry picked from commit 5a091aff50b780ae29c7faf70a7a6c21c98a54c4)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
For each rebind we insert a GuC TLB invalidation and add a
corresponding unordered TLB invalidation fence. This might
add a huge number of TLB invalidation fences to wait for so
rather than doing that, defer the TLB invalidation to the
next ring ops for each affected exec queue. Since the TLB
is invalidated on exec_queue switch, we need to invalidate
once for each affected exec_queue.
v2:
- Simplify if-statements around the tlb_flush_seqno.
(Matthew Brost)
- Add some comments and asserts.
Fixes: 5387e865d90e ("drm/xe: Add TLB invalidation fence after rebinds issued from execs")
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240327091136.3271-2-thomas.hellstrom@linux.intel.com
(cherry picked from commit 4fc4899e86f7afbd09f4bcb899f0fc57e0296e62)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
Before ACP firmware loading, DSP interrupts are not expected.
Sometimes after reboot, it's observed that before ACP firmware is loaded
false DSP interrupt is reported.
Registering the interrupt handler before acp initialization causing false
interrupts sometimes on reboot as ACP reset is not applied.
Correct the sequence by invoking acp initialization sequence prior to
registering interrupt handler.
Fixes: 738a2b5e2cc9 ("ASoC: SOF: amd: Add IPC support for ACP IP block")
Signed-off-by: Vijendar Mukunda <Vijendar.Mukunda@amd.com>
Link: https://msgid.link/r/20240404041717.430545-1-Vijendar.Mukunda@amd.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
The definition and declaration of sja1110_pcs_mdio_write_c45() don't have
parameters in the same order.
Knowing that sja1110_pcs_mdio_write_c45() is used as a function pointer
in 'sja1105_info' structure with .pcs_mdio_write_c45, and that we have:
int (*pcs_mdio_write_c45)(struct mii_bus *bus, int phy, int mmd,
int reg, u16 val);
it is likely that the definition is the one to change.
Found with cppcheck, funcArgOrderDifferent.
Fixes: ae271547bba6 ("net: dsa: sja1105: C45 only transactions for PCS")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Michael Walle <mwalle@kernel.org>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/ff2a5af67361988b3581831f7bd1eddebfb4c48f.1712082763.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The error statistics should be updated each time the poll function is
called, even if the full RX work budget has been consumed. This prevents
the counts from becoming stuck when RX bandwidth usage is high.
This also ensures that error counters are not updated after we've
re-enabled interrupts as that could result in a race condition.
Also drop an unnecessary space.
Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper")
Signed-off-by: Paul Barker <paul.barker.ct@bp.renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Link: https://lore.kernel.org/r/20240402145305.82148-2-paul.barker.ct@bp.renesas.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The TX queue should be serviced each time the poll function is called,
even if the full RX work budget has been consumed. This prevents
starvation of the TX queue when RX bandwidth usage is high.
Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper")
Signed-off-by: Paul Barker <paul.barker.ct@bp.renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Link: https://lore.kernel.org/r/20240402145305.82148-1-paul.barker.ct@bp.renesas.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Hook unregistration is deferred to the commit phase, same occurs with
hook updates triggered by the table dormant flag. When both commands are
combined, this results in deleting a basechain while leaving its hook
still registered in the core.
Fixes: 179d9ba5559a ("netfilter: nf_tables: fix table flag updates")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
nft_unregister_flowtable_type() within nf_flow_inet_module_exit() can
concurrent with __nft_flowtable_type_get() within nf_tables_newflowtable().
And thhere is not any protection when iterate over nf_tables_flowtables
list in __nft_flowtable_type_get(). Therefore, there is pertential
data-race of nf_tables_flowtables list entry.
Use list_for_each_entry_rcu() to iterate over nf_tables_flowtables list
in __nft_flowtable_type_get(), and use rcu_read_lock() in the caller
nft_flowtable_type_get() to protect the entire type query process.
Fixes: 3b49e2e94e6e ("netfilter: nf_tables: add flow table netlink frontend")
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
When dormant flag is toggled, hooks are disabled in the commit phase by
iterating over current chains in table (existing and new).
The following configuration allows for an inconsistent state:
add table x
add chain x y { type filter hook input priority 0; }
add table x { flags dormant; }
add chain x w { type filter hook input priority 1; }
which triggers the following warning when trying to unregister chain w
which is already unregistered.
[ 127.322252] WARNING: CPU: 7 PID: 1211 at net/netfilter/core.c:50 1 __nf_unregister_net_hook+0x21a/0x260
[...]
[ 127.322519] Call Trace:
[ 127.322521] <TASK>
[ 127.322524] ? __warn+0x9f/0x1a0
[ 127.322531] ? __nf_unregister_net_hook+0x21a/0x260
[ 127.322537] ? report_bug+0x1b1/0x1e0
[ 127.322545] ? handle_bug+0x3c/0x70
[ 127.322552] ? exc_invalid_op+0x17/0x40
[ 127.322556] ? asm_exc_invalid_op+0x1a/0x20
[ 127.322563] ? kasan_save_free_info+0x3b/0x60
[ 127.322570] ? __nf_unregister_net_hook+0x6a/0x260
[ 127.322577] ? __nf_unregister_net_hook+0x21a/0x260
[ 127.322583] ? __nf_unregister_net_hook+0x6a/0x260
[ 127.322590] ? __nf_tables_unregister_hook+0x8a/0xe0 [nf_tables]
[ 127.322655] nft_table_disable+0x75/0xf0 [nf_tables]
[ 127.322717] nf_tables_commit+0x2571/0x2620 [nf_tables]
Fixes: 179d9ba5559a ("netfilter: nf_tables: fix table flag updates")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Similar to 2c9f0293280e ("netfilter: nf_tables: flush pending destroy
work before netlink notifier") to address a race between exit_net and
the destroy workqueue.
The trace below shows an element to be released via destroy workqueue
while exit_net path (triggered via module removal) has already released
the set that is used in such transaction.
[ 1360.547789] BUG: KASAN: slab-use-after-free in nf_tables_trans_destroy_work+0x3f5/0x590 [nf_tables]
[ 1360.547861] Read of size 8 at addr ffff888140500cc0 by task kworker/4:1/152465
[ 1360.547870] CPU: 4 PID: 152465 Comm: kworker/4:1 Not tainted 6.8.0+ #359
[ 1360.547882] Workqueue: events nf_tables_trans_destroy_work [nf_tables]
[ 1360.547984] Call Trace:
[ 1360.547991] <TASK>
[ 1360.547998] dump_stack_lvl+0x53/0x70
[ 1360.548014] print_report+0xc4/0x610
[ 1360.548026] ? __virt_addr_valid+0xba/0x160
[ 1360.548040] ? __pfx__raw_spin_lock_irqsave+0x10/0x10
[ 1360.548054] ? nf_tables_trans_destroy_work+0x3f5/0x590 [nf_tables]
[ 1360.548176] kasan_report+0xae/0xe0
[ 1360.548189] ? nf_tables_trans_destroy_work+0x3f5/0x590 [nf_tables]
[ 1360.548312] nf_tables_trans_destroy_work+0x3f5/0x590 [nf_tables]
[ 1360.548447] ? __pfx_nf_tables_trans_destroy_work+0x10/0x10 [nf_tables]
[ 1360.548577] ? _raw_spin_unlock_irq+0x18/0x30
[ 1360.548591] process_one_work+0x2f1/0x670
[ 1360.548610] worker_thread+0x4d3/0x760
[ 1360.548627] ? __pfx_worker_thread+0x10/0x10
[ 1360.548640] kthread+0x16b/0x1b0
[ 1360.548653] ? __pfx_kthread+0x10/0x10
[ 1360.548665] ret_from_fork+0x2f/0x50
[ 1360.548679] ? __pfx_kthread+0x10/0x10
[ 1360.548690] ret_from_fork_asm+0x1a/0x30
[ 1360.548707] </TASK>
[ 1360.548719] Allocated by task 192061:
[ 1360.548726] kasan_save_stack+0x20/0x40
[ 1360.548739] kasan_save_track+0x14/0x30
[ 1360.548750] __kasan_kmalloc+0x8f/0xa0
[ 1360.548760] __kmalloc_node+0x1f1/0x450
[ 1360.548771] nf_tables_newset+0x10c7/0x1b50 [nf_tables]
[ 1360.548883] nfnetlink_rcv_batch+0xbc4/0xdc0 [nfnetlink]
[ 1360.548909] nfnetlink_rcv+0x1a8/0x1e0 [nfnetlink]
[ 1360.548927] netlink_unicast+0x367/0x4f0
[ 1360.548935] netlink_sendmsg+0x34b/0x610
[ 1360.548944] ____sys_sendmsg+0x4d4/0x510
[ 1360.548953] ___sys_sendmsg+0xc9/0x120
[ 1360.548961] __sys_sendmsg+0xbe/0x140
[ 1360.548971] do_syscall_64+0x55/0x120
[ 1360.548982] entry_SYSCALL_64_after_hwframe+0x55/0x5d
[ 1360.548994] Freed by task 192222:
[ 1360.548999] kasan_save_stack+0x20/0x40
[ 1360.549009] kasan_save_track+0x14/0x30
[ 1360.549019] kasan_save_free_info+0x3b/0x60
[ 1360.549028] poison_slab_object+0x100/0x180
[ 1360.549036] __kasan_slab_free+0x14/0x30
[ 1360.549042] kfree+0xb6/0x260
[ 1360.549049] __nft_release_table+0x473/0x6a0 [nf_tables]
[ 1360.549131] nf_tables_exit_net+0x170/0x240 [nf_tables]
[ 1360.549221] ops_exit_list+0x50/0xa0
[ 1360.549229] free_exit_list+0x101/0x140
[ 1360.549236] unregister_pernet_operations+0x107/0x160
[ 1360.549245] unregister_pernet_subsys+0x1c/0x30
[ 1360.549254] nf_tables_module_exit+0x43/0x80 [nf_tables]
[ 1360.549345] __do_sys_delete_module+0x253/0x370
[ 1360.549352] do_syscall_64+0x55/0x120
[ 1360.549360] entry_SYSCALL_64_after_hwframe+0x55/0x5d
(gdb) list *__nft_release_table+0x473
0x1e033 is in __nft_release_table (net/netfilter/nf_tables_api.c:11354).
11349 list_for_each_entry_safe(flowtable, nf, &table->flowtables, list) {
11350 list_del(&flowtable->list);
11351 nft_use_dec(&table->use);
11352 nf_tables_flowtable_destroy(flowtable);
11353 }
11354 list_for_each_entry_safe(set, ns, &table->sets, list) {
11355 list_del(&set->list);
11356 nft_use_dec(&table->use);
11357 if (set->flags & (NFT_SET_MAP | NFT_SET_OBJECT))
11358 nft_map_deactivate(&ctx, set);
(gdb)
[ 1360.549372] Last potentially related work creation:
[ 1360.549376] kasan_save_stack+0x20/0x40
[ 1360.549384] __kasan_record_aux_stack+0x9b/0xb0
[ 1360.549392] __queue_work+0x3fb/0x780
[ 1360.549399] queue_work_on+0x4f/0x60
[ 1360.549407] nft_rhash_remove+0x33b/0x340 [nf_tables]
[ 1360.549516] nf_tables_commit+0x1c6a/0x2620 [nf_tables]
[ 1360.549625] nfnetlink_rcv_batch+0x728/0xdc0 [nfnetlink]
[ 1360.549647] nfnetlink_rcv+0x1a8/0x1e0 [nfnetlink]
[ 1360.549671] netlink_unicast+0x367/0x4f0
[ 1360.549680] netlink_sendmsg+0x34b/0x610
[ 1360.549690] ____sys_sendmsg+0x4d4/0x510
[ 1360.549697] ___sys_sendmsg+0xc9/0x120
[ 1360.549706] __sys_sendmsg+0xbe/0x140
[ 1360.549715] do_syscall_64+0x55/0x120
[ 1360.549725] entry_SYSCALL_64_after_hwframe+0x55/0x5d
Fixes: 0935d5588400 ("netfilter: nf_tables: asynchronous release")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
The commit mutex should not be released during the critical section
between nft_gc_seq_begin() and nft_gc_seq_end(), otherwise, async GC
worker could collect expired objects and get the released commit lock
within the same GC sequence.
nf_tables_module_autoload() temporarily releases the mutex to load
module dependencies, then it goes back to replay the transaction again.
Move it at the end of the abort phase after nft_gc_seq_end() is called.
Cc: stable@vger.kernel.org
Fixes: 720344340fb9 ("netfilter: nf_tables: GC transaction race with abort path")
Reported-by: Kuan-Ting Chen <hexrabbit@devco.re>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Unlike early commit path stage which triggers a call to abort, an
explicit release of the batch is required on abort, otherwise mutex is
released and commit_list remains in place.
Add WARN_ON_ONCE to ensure commit_list is empty from the abort path
before releasing the mutex.
After this patch, commit_list is always assumed to be empty before
grabbing the mutex, therefore
03c1f1ef1584 ("netfilter: Cleanup nft_net->module_list from nf_tables_exit_net()")
only needs to release the pending modules for registration.
Cc: stable@vger.kernel.org
Fixes: c0391b6ab810 ("netfilter: nf_tables: missing validation from the abort path")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
This reverts commit 9ab4ad295622a3481818856762471c1f8c830e18.
I went out of coffee and applied it to the wrong tree. Blame on me.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The host SNP worthiness can determined later, after alternatives have
been patched, in snp_rmptable_init() depending on cmdline options like
iommu=pt which is incompatible with SNP, for example.
Which means that one cannot use X86_FEATURE_SEV_SNP and will need to
have a special flag for that control.
Use that newly added CC_ATTR_HOST_SEV_SNP in the appropriate places.
Move kdump_sev_callback() to its rightful place, while at it.
Fixes: 216d106c7ff7 ("x86/sev: Add SEV-SNP host initialization support")
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Tested-by: Srikanth Aithal <sraithal@amd.com>
Link: https://lore.kernel.org/r/20240327154317.29909-6-bp@alien8.de
|
|
Add functionality to set and/or clear different attributes of the
machine as a confidential computing platform. Add the first one too:
whether the machine is running as a host for SEV-SNP guests.
Fixes: 216d106c7ff7 ("x86/sev: Add SEV-SNP host initialization support")
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Tested-by: Srikanth Aithal <sraithal@amd.com>
Link: https://lore.kernel.org/r/20240327154317.29909-5-bp@alien8.de
|
|
The functionality to load SEV-SNP guests by the host will soon rely on
cc_platform* helpers because the cpu_feature* API with the early
patching is insufficient when SNP support needs to be disabled late.
Therefore, pull that functionality in.
Fixes: 216d106c7ff7 ("x86/sev: Add SEV-SNP host initialization support")
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Tested-by: Srikanth Aithal <sraithal@amd.com>
Link: https://lore.kernel.org/r/20240327154317.29909-4-bp@alien8.de
|
|
There are few uses of CoCo that don't rely on working cryptography and
hence a working RNG. Unfortunately, the CoCo threat model means that the
VM host cannot be trusted and may actively work against guests to
extract secrets or manipulate computation. Since a malicious host can
modify or observe nearly all inputs to guests, the only remaining source
of entropy for CoCo guests is RDRAND.
If RDRAND is broken -- due to CPU hardware fault -- the RNG as a whole
is meant to gracefully continue on gathering entropy from other sources,
but since there aren't other sources on CoCo, this is catastrophic.
This is mostly a concern at boot time when initially seeding the RNG, as
after that the consequences of a broken RDRAND are much more
theoretical.
So, try at boot to seed the RNG using 256 bits of RDRAND output. If this
fails, panic(). This will also trigger if the system is booted without
RDRAND, as RDRAND is essential for a safe CoCo boot.
Add this deliberately to be "just a CoCo x86 driver feature" and not
part of the RNG itself. Many device drivers and platforms have some
desire to contribute something to the RNG, and add_device_randomness()
is specifically meant for this purpose.
Any driver can call it with seed data of any quality, or even garbage
quality, and it can only possibly make the quality of the RNG better or
have no effect, but can never make it worse.
Rather than trying to build something into the core of the RNG, consider
the particular CoCo issue just a CoCo issue, and therefore separate it
all out into driver (well, arch/platform) code.
[ bp: Massage commit message. ]
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Elena Reshetova <elena.reshetova@intel.com>
Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240326160735.73531-1-Jason@zx2c4.com
|
|
As of now, tg3_power_down_prepare always ends with success, but
the error handling code from former tg3_set_power_state call is still here.
This code became unreachable in commit c866b7eac073 ("tg3: Do not use
legacy PCI power management").
Remove (now unreachable) error handling code for simplification and change
tg3_power_down_prepare to a void function as its result is no more checked.
Signed-off-by: Nikita Kiryushin <kiryushin@ancud.ru>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240401191418.361747-1-kiryushin@ancud.ru
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The __vmalloc_start_set declaration is in a header that is not included
in numa_32.c in current linux-next:
arch/x86/mm/numa_32.c: In function 'initmem_init':
arch/x86/mm/numa_32.c:57:9: error: '__vmalloc_start_set' undeclared (first use in this function)
57 | __vmalloc_start_set = true;
| ^~~~~~~~~~~~~~~~~~~
arch/x86/mm/numa_32.c:57:9: note: each undeclared identifier is reported only once for each function it appears in
Add an explicit #include.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20240403202344.3463169-1-arnd@kernel.org
|
|
The call to clk_enable() in gemini_sata_start_bridge() can fail.
Add a check to detect such failure.
Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
|
|
Building with W=1 shows a warning for an unused variable when CONFIG_PCI
is diabled:
drivers/ata/sata_mv.c:790:35: error: unused variable 'mv_pci_tbl' [-Werror,-Wunused-const-variable]
static const struct pci_device_id mv_pci_tbl[] = {
Move the table into the same block that containsn the pci_driver
definition.
Fixes: 7bb3c5290ca0 ("sata_mv: Remove PCI dependency")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
|