summaryrefslogtreecommitdiffstats
path: root/tools/perf/util/machine.c (unfollow)
Commit message (Collapse)AuthorFilesLines
35 hourstools/power turbostat: Add RAPL psys as a built-in counterPatryk Wlazlyn2-10/+85
Introduce the counter as a part of global, platform counters structure. We open the counter for only one cpu, but otherwise treat it as an ordinary RAPL counter, allowing for grouped perf read. The counter is disabled by default, because it's interpretation may require additional, platform specific information, making it unsuitable for general use. Signed-off-by: Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
35 hourstools/power turbostat: Fix child's argument forwardingPatryk Wlazlyn1-1/+1
Add '+' to optstring when early scanning for --no-msr and --no-perf. It causes option processing to stop as soon as a nonoption argument is encountered, effectively skipping child's arguments. Fixes: 3e4048466c39 ("tools/power turbostat: Add --no-msr option") Signed-off-by: Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
35 hourstools/power turbostat: Force --no-perf in --dump modePatryk Wlazlyn1-0/+6
Force the --no-perf early to prevent using it as a source. User asks for raw values, but perf returns them relative to the opening of the file descriptor. Signed-off-by: Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
35 hourstools/power turbostat: Add support for /sys/class/drm/card1Zhang Rui1-9/+29
On some machines, the graphics device is enumerated as /sys/class/drm/card1 instead of /sys/class/drm/card0. The current implementation does not handle this scenario, resulting in the loss of graphics C6 residency and frequency information. Add support for /sys/class/drm/card1, ensuring that turbostat can retrieve and display the graphics columns for these platforms. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
35 hourstools/power turbostat: Cache graphics sysfs file descriptors during probeZhang Rui1-50/+32
Snapshots of the graphics sysfs knobs are taken based on file descriptors. To optimize this process, open the files and cache the file descriptors during the graphics probe phase. As a result, the previously cached pathnames become redundant and are removed. This change aims to streamline the code without altering its functionality. No functional change intended. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
35 hourstools/power turbostat: Consolidate graphics sysfs accessZhang Rui1-9/+6
Currently, there is an inconsistency in how graphics sysfs knobs are accessed: graphics residency sysfs knobs are opened and closed for each read, while graphics frequency sysfs knobs are opened once and remain open until turbostat exits. This inconsistency is confusing and adds unnecessary code complexity. Consolidate the access method by opening the sysfs files once and reusing the file pointers for subsequent accesses. This approach simplifies the code and ensures a consistent method for accessing graphics sysfs knobs. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
35 hourstools/power turbostat: Remove unnecessary fflush() callZhang Rui1-4/+3
The graphics sysfs knobs are read-only, making the use of fflush() before reading them redundant. Remove the unnecessary fflush() call. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
35 hourstools/power turbostat: Enhance platform divergence descriptionZhang Rui1-28/+30
In various generations, platforms often share a majority of features, diverging only in a few specific aspects. The current approach of using hardcoded values in 'platform_features' structure fails to effectively represent these divergences. To improve the description of platform divergence: 1. Each newly introduced 'platform_features' structure must have a base, typically derived from the previous generation. 2. Platform feature values should be inherited from the base structure rather than being hardcoded. This approach ensures a more accurate and maintainable representation of platform-specific features across different generations. Converts `adl_features` and `lnl_features` to follow this new scheme. No functional change. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
35 hourstools/power turbostat: Add initial support for GraniteRapids-DZhang Rui1-0/+1
Add initial support for GraniteRapids-D. It shares the same features with SapphireRapids. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
35 hourstools/power turbostat: Remove PC3 support on LunarlakeZhang Rui1-1/+1
Lunarlake supports CC1/CC6/CC7/PC2/PC6/PC10. Remove PC3 support on Lunarlake. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
35 hourstools/power turbostat: Rename arl_features to lnl_featuresZhang Rui1-2/+2
As ARL shares the same features with ADL/RPL/MTL, now 'arl_features' is used by Lunarlake platform only. Rename 'arl_features' to 'lnl_features'. No functional change. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
35 hourstools/power turbostat: Add back PC8 support on ArrowlakeZhang Rui1-3/+3
Similar to ADL/RPL/MTL, ARL supports CC1/CC6/CC7/PC2/PC3/PC6/PC8/PC10. Add back PC8 support on Arrowlake. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
35 hourstools/power turbostat: Remove PC7/PC9 support on MTLZhang Rui1-2/+2
Similar to ADL/RPL, MTL support CC1/CC6/CC7/PC2/PC3/PC6/PC8/CP10. Remove PC7/PC9 support on MTL. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
35 hourstools/power turbostat: Honor --show CPU, even when even when num_cpus=1Patryk Wlazlyn1-2/+2
Honor --show CPU and --show Core when "topo.num_cpus == 1". Previously turbostat assumed that on a 1-CPU system, these columns should never appear. Honoring these flags makes it easier for several programs that parse turbostat output. Signed-off-by: Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
35 hourstools/power turbostat: Fix trailing '\n' parsingZhang Rui1-0/+3
parse_cpu_string() parses the string input either from command line or from /sys/fs/cgroup/cpuset.cpus.effective to get a list of CPUs that turbostat can run with. The cpu string returned by /sys/fs/cgroup/cpuset.cpus.effective contains a trailing '\n', but strtoul() fails to treat this as an error. That says, for the code below val = ("\n", NULL, 10); val returns 0, and errno is also not set. As a result, CPU0 is erroneously considered as allowed CPU and this causes failures when turbostat tries to run on CPU0. get_counters: Could not migrate to CPU 0 ... turbostat: re-initialized with num_cpus 8, allowed_cpus 5 get_counters: Could not migrate to CPU 0 Add a check to return immediately if '\n' or '\0' is detected. Fixes: 8c3dd2c9e542 ("tools/power/turbostat: Abstrct function for parsing cpu string") Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
35 hourstools/power turbostat: Allow using cpu device in perf counters on hybrid ↵Patryk Wlazlyn2-7/+123
platforms Intel hybrid platforms expose different perf devices for P and E cores. Instead of one, "/sys/bus/event_source/devices/cpu" device, there are "/sys/bus/event_source/devices/{cpu_core,cpu_atom}". This, however makes it more complicated for the user, because most of the counters are available on both and had to be handled manually. This patch allows users to use "virtual" cpu device that is seemingly translated to cpu_core and cpu_atom perf devices, depending on the type of a CPU we are opening the counter for. Signed-off-by: Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
35 hourstools/power turbostat: Fix column printing for PMT xtal_time countersPatryk Wlazlyn1-3/+3
If the very first printed column was for a PMT counter of type xtal_time we would misalign the column header, because we were always printing the delimiter. Signed-off-by: Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
35 hourstools/power turbostat: fix GCC9 build regressionTodd Brandt1-9/+6
Fix build regression seen when using old gcc-9 compiler. Signed-off-by: Todd Brandt <todd.e.brandt@intel.com> Reviewed-by: Chen Yu <yu.c.chen@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
39 hoursPCI/pwrctrl: Unregister platform device only if one actually existsBrian Norris1-2/+7
If a PCI device has an associated device_node with power supplies, pci_bus_add_device() creates platform devices for use by pwrctrl. When the PCI device is removed, pci_stop_dev() uses of_find_device_by_node() to locate the related platform device, then unregisters it. But when we remove a PCI device with no associated device node, dev_of_node(dev) is NULL, and of_find_device_by_node(NULL) returns the first device with "dev->of_node == NULL". The result is that we (a) mistakenly unregister a completely unrelated platform device, leading to issues like the first trace below, and (b) dereference the NULL pointer from dev_of_node() when clearing OF_POPULATED, as in the second trace. Unregister a platform device only if there is one associated with this PCI device. This resolves issues seen when doing: # echo 1 > /sys/bus/pci/devices/.../remove Sample issue from unregistering the wrong platform device: WARNING: CPU: 0 PID: 5095 at drivers/regulator/core.c:5885 regulator_unregister+0x140/0x160 Call trace: regulator_unregister+0x140/0x160 devm_rdev_release+0x1c/0x30 release_nodes+0x68/0x100 devres_release_all+0x98/0xf8 device_unbind_cleanup+0x20/0x70 device_release_driver_internal+0x1f4/0x240 device_release_driver+0x20/0x40 bus_remove_device+0xd8/0x170 device_del+0x154/0x380 device_unregister+0x28/0x88 of_device_unregister+0x1c/0x30 pci_stop_bus_device+0x154/0x1b0 pci_stop_and_remove_bus_device_locked+0x28/0x48 remove_store+0xa0/0xb8 dev_attr_store+0x20/0x40 sysfs_kf_write+0x4c/0x68 Later NULL pointer dereference for of_node_clear_flag(NULL, OF_POPULATED): Unable to handle kernel NULL pointer dereference at virtual address 00000000000000c0 Call trace: pci_stop_bus_device+0x190/0x1b0 pci_stop_and_remove_bus_device_locked+0x28/0x48 remove_store+0xa0/0xb8 dev_attr_store+0x20/0x40 sysfs_kf_write+0x4c/0x68 Link: https://lore.kernel.org/r/20241126210443.4052876-1-briannorris@chromium.org Fixes: 681725afb6b9 ("PCI/pwrctl: Remove pwrctl device without iterating over all children of pwrctl parent") Reported-by: Saurabh Sengar <ssengar@linux.microsoft.com> Closes: https://lore.kernel.org/r/1732890621-19656-1-git-send-email-ssengar@linux.microsoft.com Signed-off-by: Brian Norris <briannorris@chromium.org> [bhelgaas: commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
41 hoursRevert "serial: sh-sci: Clean sci_ports[0] after at earlycon exit"Greg Kroah-Hartman1-28/+0
This reverts commit 3791ea69a4858b81e0277f695ca40f5aae40f312. It was reported to cause boot-time issues, so revert it for now. Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Fixes: 3791ea69a485 ("serial: sh-sci: Clean sci_ports[0] after at earlycon exit") Cc: stable <stable@kernel.org> Cc: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
44 hourssh: intc: Fix use-after-free bug in register_intc_controller()Dan Carpenter1-1/+1
In the error handling for this function, d is freed without ever removing it from intc_list which would lead to a use after free. To fix this, let's only add it to the list after everything has succeeded. Fixes: 2dcec7a988a1 ("sh: intc: set_irq_wake() support") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
44 hourssh: cpuinfo: Fix a warning for CONFIG_CPUMASK_OFFSTACKHuacai Chen1-1/+1
When CONFIG_CPUMASK_OFFSTACK and CONFIG_DEBUG_PER_CPU_MAPS are selected, cpu_max_bits_warn() generates a runtime warning similar as below when showing /proc/cpuinfo. Fix this by using nr_cpu_ids (the runtime limit) instead of NR_CPUS to iterate CPUs. [ 3.052463] ------------[ cut here ]------------ [ 3.059679] WARNING: CPU: 3 PID: 1 at include/linux/cpumask.h:108 show_cpuinfo+0x5e8/0x5f0 [ 3.070072] Modules linked in: efivarfs autofs4 [ 3.076257] CPU: 0 PID: 1 Comm: systemd Not tainted 5.19-rc5+ #1052 [ 3.099465] Stack : 9000000100157b08 9000000000f18530 9000000000cf846c 9000000100154000 [ 3.109127] 9000000100157a50 0000000000000000 9000000100157a58 9000000000ef7430 [ 3.118774] 90000001001578e8 0000000000000040 0000000000000020 ffffffffffffffff [ 3.128412] 0000000000aaaaaa 1ab25f00eec96a37 900000010021de80 900000000101c890 [ 3.138056] 0000000000000000 0000000000000000 0000000000000000 0000000000aaaaaa [ 3.147711] ffff8000339dc220 0000000000000001 0000000006ab4000 0000000000000000 [ 3.157364] 900000000101c998 0000000000000004 9000000000ef7430 0000000000000000 [ 3.167012] 0000000000000009 000000000000006c 0000000000000000 0000000000000000 [ 3.176641] 9000000000d3de08 9000000001639390 90000000002086d8 00007ffff0080286 [ 3.186260] 00000000000000b0 0000000000000004 0000000000000000 0000000000071c1c [ 3.195868] ... [ 3.199917] Call Trace: [ 3.203941] [<90000000002086d8>] show_stack+0x38/0x14c [ 3.210666] [<9000000000cf846c>] dump_stack_lvl+0x60/0x88 [ 3.217625] [<900000000023d268>] __warn+0xd0/0x100 [ 3.223958] [<9000000000cf3c90>] warn_slowpath_fmt+0x7c/0xcc [ 3.231150] [<9000000000210220>] show_cpuinfo+0x5e8/0x5f0 [ 3.238080] [<90000000004f578c>] seq_read_iter+0x354/0x4b4 [ 3.245098] [<90000000004c2e90>] new_sync_read+0x17c/0x1c4 [ 3.252114] [<90000000004c5174>] vfs_read+0x138/0x1d0 [ 3.258694] [<90000000004c55f8>] ksys_read+0x70/0x100 [ 3.265265] [<9000000000cfde9c>] do_syscall+0x7c/0x94 [ 3.271820] [<9000000000202fe4>] handle_syscall+0xc4/0x160 [ 3.281824] ---[ end trace 8b484262b4b8c24c ]--- Cc: stable@vger.kernel.org Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
3 daysbrd: decrease the number of allocated pages which discardedZhang Xianwei1-1/+3
The number of allocated pages which discarded will not decrease. Fix it. Fixes: 9ead7efc6f3f ("brd: implement discard support") Signed-off-by: Zhang Xianwei <zhang.xianwei8@zte.com.cn> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20241128170056565nPKSz2vsP8K8X2uk2iaDG@zte.com.cn Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 daysblock, bfq: fix bfqq uaf in bfq_limit_depth()Yu Kuai1-13/+24
Set new allocated bfqq to bic or remove freed bfqq from bic are both protected by bfqd->lock, however bfq_limit_depth() is deferencing bfqq from bic without the lock, this can lead to UAF if the io_context is shared by multiple tasks. For example, test bfq with io_uring can trigger following UAF in v6.6: ================================================================== BUG: KASAN: slab-use-after-free in bfqq_group+0x15/0x50 Call Trace: <TASK> dump_stack_lvl+0x47/0x80 print_address_description.constprop.0+0x66/0x300 print_report+0x3e/0x70 kasan_report+0xb4/0xf0 bfqq_group+0x15/0x50 bfqq_request_over_limit+0x130/0x9a0 bfq_limit_depth+0x1b5/0x480 __blk_mq_alloc_requests+0x2b5/0xa00 blk_mq_get_new_requests+0x11d/0x1d0 blk_mq_submit_bio+0x286/0xb00 submit_bio_noacct_nocheck+0x331/0x400 __block_write_full_folio+0x3d0/0x640 writepage_cb+0x3b/0xc0 write_cache_pages+0x254/0x6c0 write_cache_pages+0x254/0x6c0 do_writepages+0x192/0x310 filemap_fdatawrite_wbc+0x95/0xc0 __filemap_fdatawrite_range+0x99/0xd0 filemap_write_and_wait_range.part.0+0x4d/0xa0 blkdev_read_iter+0xef/0x1e0 io_read+0x1b6/0x8a0 io_issue_sqe+0x87/0x300 io_wq_submit_work+0xeb/0x390 io_worker_handle_work+0x24d/0x550 io_wq_worker+0x27f/0x6c0 ret_from_fork_asm+0x1b/0x30 </TASK> Allocated by task 808602: kasan_save_stack+0x1e/0x40 kasan_set_track+0x21/0x30 __kasan_slab_alloc+0x83/0x90 kmem_cache_alloc_node+0x1b1/0x6d0 bfq_get_queue+0x138/0xfa0 bfq_get_bfqq_handle_split+0xe3/0x2c0 bfq_init_rq+0x196/0xbb0 bfq_insert_request.isra.0+0xb5/0x480 bfq_insert_requests+0x156/0x180 blk_mq_insert_request+0x15d/0x440 blk_mq_submit_bio+0x8a4/0xb00 submit_bio_noacct_nocheck+0x331/0x400 __blkdev_direct_IO_async+0x2dd/0x330 blkdev_write_iter+0x39a/0x450 io_write+0x22a/0x840 io_issue_sqe+0x87/0x300 io_wq_submit_work+0xeb/0x390 io_worker_handle_work+0x24d/0x550 io_wq_worker+0x27f/0x6c0 ret_from_fork+0x2d/0x50 ret_from_fork_asm+0x1b/0x30 Freed by task 808589: kasan_save_stack+0x1e/0x40 kasan_set_track+0x21/0x30 kasan_save_free_info+0x27/0x40 __kasan_slab_free+0x126/0x1b0 kmem_cache_free+0x10c/0x750 bfq_put_queue+0x2dd/0x770 __bfq_insert_request.isra.0+0x155/0x7a0 bfq_insert_request.isra.0+0x122/0x480 bfq_insert_requests+0x156/0x180 blk_mq_dispatch_plug_list+0x528/0x7e0 blk_mq_flush_plug_list.part.0+0xe5/0x590 __blk_flush_plug+0x3b/0x90 blk_finish_plug+0x40/0x60 do_writepages+0x19d/0x310 filemap_fdatawrite_wbc+0x95/0xc0 __filemap_fdatawrite_range+0x99/0xd0 filemap_write_and_wait_range.part.0+0x4d/0xa0 blkdev_read_iter+0xef/0x1e0 io_read+0x1b6/0x8a0 io_issue_sqe+0x87/0x300 io_wq_submit_work+0xeb/0x390 io_worker_handle_work+0x24d/0x550 io_wq_worker+0x27f/0x6c0 ret_from_fork+0x2d/0x50 ret_from_fork_asm+0x1b/0x30 Fix the problem by protecting bic_to_bfqq() with bfqd->lock. CC: Jan Kara <jack@suse.cz> Fixes: 76f1df88bbc2 ("bfq: Limit number of requests consumed by each cgroup") Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20241129091509.2227136-1-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 daysio_uring/tctx: work around xa_store() allocation error issueJens Axboe1-1/+12
syzbot triggered the following WARN_ON: WARNING: CPU: 0 PID: 16 at io_uring/tctx.c:51 __io_uring_free+0xfa/0x140 io_uring/tctx.c:51 which is the WARN_ON_ONCE(!xa_empty(&tctx->xa)); sanity check in __io_uring_free() when a io_uring_task is going through its final put. The syzbot test case includes injecting memory allocation failures, and it very much looks like xa_store() can fail one of its memory allocations and end up with ->head being non-NULL even though no entries exist in the xarray. Until this issue gets sorted out, work around it by attempting to iterate entries in our xarray, and WARN_ON_ONCE() if one is found. Reported-by: syzbot+cc36d44ec9f368e443d3@syzkaller.appspotmail.com Link: https://lore.kernel.org/io-uring/673c1643.050a0220.87769.0066.GAE@google.com/ Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 daysRevert "s390/mm: Allow large pages for KASAN shadow mapping"Vasily Gorbik1-11/+1
This reverts commit ff123eb7741638d55abf82fac090bb3a543c1e74. Allowing large pages for KASAN shadow mappings isn't inherently wrong, but adding POPULATE_KASAN_MAP_SHADOW to large_allowed() exposes an issue in can_large_pud() and can_large_pmd(). Since commit d8073dc6bc04 ("s390/mm: Allow large pages only for aligned physical addresses"), both can_large_pud() and can_large_pmd() call _pa() to check if large page physical addresses are aligned. However, _pa() has a side effect: it allocates memory in POPULATE_KASAN_MAP_SHADOW mode. This results in massive memory leaks. The proper fix would be to address both large_allowed() and _pa()'s side effects, but for now, revert this change to avoid the leaks. Fixes: ff123eb77416 ("s390/mm: Allow large pages for KASAN shadow mapping") Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
4 daysselftests: find_symbol: Actually use load_mod() parameterGeert Uytterhoeven1-2/+2
The parameter passed to load_mod() is stored in $MOD, but never used. Obviously it was intended to be used instead of the hardcoded "test_kallsyms_b" module name. Fixes: 84b4a51fce4ccc66 ("selftests: add new kallsyms selftests") Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
4 daysselftests: kallsyms: fix and clarify current test boundariesLuis Chamberlain2-3/+38
Provide and clarify the existing ranges and what you should expect. Fix the gen_test_kallsyms.sh script to accept different ranges. Fixes: 84b4a51fce4ccc66 ("selftests: add new kallsyms selftests") Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
4 daysselftests: kallsyms: fix double build stupidityLuis Chamberlain1-9/+8
The current arrangement will have the test modules rebuilt on any make without having the script or code actually change. Take Masahiro Yamada's suggested fix and cleanups on the Makefile to fix this. Suggested-by: Masahiro Yamada <masahiroy@kernel.org> Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Fixes: 84b4a51fce4ccc66 ("selftests: add new kallsyms selftests") Closes: https://lore.kernel.org/all/CAK7LNATRDODmfz1tE=inV-DQqPA4G9vKH+38zMbaGdpTuFWZFw@mail.gmail.com/T/#me6c8f98e82acbee6e75a31b34bbb543eb4940b15 Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
4 daysfs/nfs/io: make nfs_start_io_*() killableMax Kellermann4-20/+66
This allows killing processes that wait for a lock when one process is stuck waiting for the NFS server. This aims to complete the coverage of NFS operations being killable, like nfs_direct_wait() does, for example. Signed-off-by: Max Kellermann <max.kellermann@ionos.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
4 daysnfs/blocklayout: Limit repeat device registration on failureBenjamin Coddington1-1/+14
Every pNFS SCSI IO wants to do LAYOUTGET, then within the layout find the device which can drive GETDEVINFO, then finally may need to prep the device with a reservation. This slow work makes a mess of IO latencies if one of the later steps is going to fail for awhile. If we're unable to register a SCSI device, ensure we mark the device as unavailable so that it will timeout and be re-added via GETDEVINFO. This avoids repeated doomed attempts to register a device in the IO path. Add some clarifying comments as well. Fixes: d869da91cccb ("nfs/blocklayout: Fix premature PR key unregistration") Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
4 daysnfs/blocklayout: Don't attempt unregister for invalid block deviceBenjamin Coddington1-4/+2
Since commit d869da91cccb ("nfs/blocklayout: Fix premature PR key unregistration") an unmount of a pNFS SCSI layout-enabled NFS may dereference a NULL block_device in: bl_unregister_scsi+0x16/0xe0 [blocklayoutdriver] bl_free_device+0x70/0x80 [blocklayoutdriver] bl_free_deviceid_node+0x12/0x30 [blocklayoutdriver] nfs4_put_deviceid_node+0x60/0xc0 [nfsv4] nfs4_deviceid_purge_client+0x132/0x190 [nfsv4] unset_pnfs_layoutdriver+0x59/0x60 [nfsv4] nfs4_destroy_server+0x36/0x70 [nfsv4] nfs_free_server+0x23/0xe0 [nfs] deactivate_locked_super+0x30/0xb0 cleanup_mnt+0xba/0x150 task_work_run+0x59/0x90 syscall_exit_to_user_mode+0x217/0x220 do_syscall_64+0x8e/0x160 This happens because even though we were able to create the nfs4_deviceid_node, the lookup for the device was unable to attach the block device to the pnfs_block_dev. If we never found a block device to register, we can avoid this case with the PNFS_BDEV_REGISTERED flag. Move the deref behind the test for the flag. Fixes: d869da91cccb ("nfs/blocklayout: Fix premature PR key unregistration") Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
4 dayssunrpc: fix one UAF issue caused by sunrpc kernel tcp socketLiu Jian2-0/+11
BUG: KASAN: slab-use-after-free in tcp_write_timer_handler+0x156/0x3e0 Read of size 1 at addr ffff888111f322cd by task swapper/0/0 CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.12.0-rc4-dirty #7 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 Call Trace: <IRQ> dump_stack_lvl+0x68/0xa0 print_address_description.constprop.0+0x2c/0x3d0 print_report+0xb4/0x270 kasan_report+0xbd/0xf0 tcp_write_timer_handler+0x156/0x3e0 tcp_write_timer+0x66/0x170 call_timer_fn+0xfb/0x1d0 __run_timers+0x3f8/0x480 run_timer_softirq+0x9b/0x100 handle_softirqs+0x153/0x390 __irq_exit_rcu+0x103/0x120 irq_exit_rcu+0xe/0x20 sysvec_apic_timer_interrupt+0x76/0x90 </IRQ> <TASK> asm_sysvec_apic_timer_interrupt+0x1a/0x20 RIP: 0010:default_idle+0xf/0x20 Code: 4c 01 c7 4c 29 c2 e9 72 ff ff ff 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 66 90 0f 00 2d 33 f8 25 00 fb f4 <fa> c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 RSP: 0018:ffffffffa2007e28 EFLAGS: 00000242 RAX: 00000000000f3b31 RBX: 1ffffffff4400fc7 RCX: ffffffffa09c3196 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff9f00590f RBP: 0000000000000000 R08: 0000000000000001 R09: ffffed102360835d R10: ffff88811b041aeb R11: 0000000000000001 R12: 0000000000000000 R13: ffffffffa202d7c0 R14: 0000000000000000 R15: 00000000000147d0 default_idle_call+0x6b/0xa0 cpuidle_idle_call+0x1af/0x1f0 do_idle+0xbc/0x130 cpu_startup_entry+0x33/0x40 rest_init+0x11f/0x210 start_kernel+0x39a/0x420 x86_64_start_reservations+0x18/0x30 x86_64_start_kernel+0x97/0xa0 common_startup_64+0x13e/0x141 </TASK> Allocated by task 595: kasan_save_stack+0x24/0x50 kasan_save_track+0x14/0x30 __kasan_slab_alloc+0x87/0x90 kmem_cache_alloc_noprof+0x12b/0x3f0 copy_net_ns+0x94/0x380 create_new_namespaces+0x24c/0x500 unshare_nsproxy_namespaces+0x75/0xf0 ksys_unshare+0x24e/0x4f0 __x64_sys_unshare+0x1f/0x30 do_syscall_64+0x70/0x180 entry_SYSCALL_64_after_hwframe+0x76/0x7e Freed by task 100: kasan_save_stack+0x24/0x50 kasan_save_track+0x14/0x30 kasan_save_free_info+0x3b/0x60 __kasan_slab_free+0x54/0x70 kmem_cache_free+0x156/0x5d0 cleanup_net+0x5d3/0x670 process_one_work+0x776/0xa90 worker_thread+0x2e2/0x560 kthread+0x1a8/0x1f0 ret_from_fork+0x34/0x60 ret_from_fork_asm+0x1a/0x30 Reproduction script: mkdir -p /mnt/nfsshare mkdir -p /mnt/nfs/netns_1 mkfs.ext4 /dev/sdb mount /dev/sdb /mnt/nfsshare systemctl restart nfs-server chmod 777 /mnt/nfsshare exportfs -i -o rw,no_root_squash *:/mnt/nfsshare ip netns add netns_1 ip link add name veth_1_peer type veth peer veth_1 ifconfig veth_1_peer 11.11.0.254 up ip link set veth_1 netns netns_1 ip netns exec netns_1 ifconfig veth_1 11.11.0.1 ip netns exec netns_1 /root/iptables -A OUTPUT -d 11.11.0.254 -p tcp \ --tcp-flags FIN FIN -j DROP (note: In my environment, a DESTROY_CLIENTID operation is always sent immediately, breaking the nfs tcp connection.) ip netns exec netns_1 timeout -s 9 300 mount -t nfs -o proto=tcp,vers=4.1 \ 11.11.0.254:/mnt/nfsshare /mnt/nfs/netns_1 ip netns del netns_1 The reason here is that the tcp socket in netns_1 (nfs side) has been shutdown and closed (done in xs_destroy), but the FIN message (with ack) is discarded, and the nfsd side keeps sending retransmission messages. As a result, when the tcp sock in netns_1 processes the received message, it sends the message (FIN message) in the sending queue, and the tcp timer is re-established. When the network namespace is deleted, the net structure accessed by tcp's timer handler function causes problems. To fix this problem, let's hold netns refcnt for the tcp kernel socket as done in other modules. This is an ugly hack which can easily be backported to earlier kernels. A proper fix which cleans up the interfaces will follow, but may not be so easy to backport. Fixes: 26abe14379f8 ("net: Modify sk_alloc to not reference count the netns of kernel sockets.") Signed-off-by: Liu Jian <liujian56@huawei.com> Acked-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
4 daysSUNRPC: timeout and cancel TLS handshake with -ETIMEDOUTBenjamin Coddington1-5/+4
We've noticed a situation where an unstable TCP connection can cause the TLS handshake to timeout waiting for userspace to complete it. When this happens, we don't want to return from xs_tls_handshake_sync() with zero, as this will cause the upper xprt to be set CONNECTED, and subsequent attempts to transmit will be returned with -EPIPE. The sunrpc machine does not recover from this situation and will spin attempting to transmit. The return value of tls_handshake_cancel() can be used to detect a race with completion: * tls_handshake_cancel - cancel a pending handshake * Return values: * %true - Uncompleted handshake request was canceled * %false - Handshake request already completed or not found If true, we do not want the upper xprt to be connected, so return -ETIMEDOUT. If false, its possible the handshake request was lost and that may be the reason for our timeout. Again we do not want the upper xprt to be connected, so return -ETIMEDOUT. Ensure that we alway return an error from xs_tls_handshake_sync() if we call tls_handshake_cancel(). Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Fixes: 75eb6af7acdf ("SUNRPC: Add a TCP-with-TLS RPC transport class") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
4 dayssunrpc: clear XPRT_SOCK_UPD_TIMEOUT when reset transportLiu Jian1-0/+1
Since transport->sock has been set to NULL during reset transport, XPRT_SOCK_UPD_TIMEOUT also needs to be cleared. Otherwise, the xs_tcp_set_socket_timeouts() may be triggered in xs_tcp_send_request() to dereference the transport->sock that has been set to NULL. Fixes: 7196dbb02ea0 ("SUNRPC: Allow changing of the TCP timeout parameters on the fly") Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com> Signed-off-by: Liu Jian <liujian56@huawei.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
4 daysnfs: ignore SB_RDONLY when mounting nfsLi Lingfeng1-1/+1
When exporting only one file system with fsid=0 on the server side, the client alternately uses the ro/rw mount options to perform the mount operation, and a new vfsmount is generated each time. It can be reproduced as follows: [root@localhost ~]# mount /dev/sda /mnt2 [root@localhost ~]# echo "/mnt2 *(rw,no_root_squash,fsid=0)" >/etc/exports [root@localhost ~]# systemctl restart nfs-server [root@localhost ~]# mount -t nfs -o ro,vers=4 127.0.0.1:/ /mnt/sdaa [root@localhost ~]# mount -t nfs -o rw,vers=4 127.0.0.1:/ /mnt/sdaa [root@localhost ~]# mount -t nfs -o ro,vers=4 127.0.0.1:/ /mnt/sdaa [root@localhost ~]# mount -t nfs -o rw,vers=4 127.0.0.1:/ /mnt/sdaa [root@localhost ~]# mount | grep nfs4 127.0.0.1:/ on /mnt/sdaa type nfs4 (ro,relatime,vers=4.2,rsize=1048576,... 127.0.0.1:/ on /mnt/sdaa type nfs4 (rw,relatime,vers=4.2,rsize=1048576,... 127.0.0.1:/ on /mnt/sdaa type nfs4 (ro,relatime,vers=4.2,rsize=1048576,... 127.0.0.1:/ on /mnt/sdaa type nfs4 (rw,relatime,vers=4.2,rsize=1048576,... [root@localhost ~]# We expected that after mounting with the ro option, using the rw option to mount again would return EBUSY, but the actual situation was not the case. As shown above, when mounting for the first time, a superblock with the ro flag will be generated, and at the same time, in do_new_mount_fc --> do_add_mount, it detects that the superblock corresponding to the current target directory is inconsistent with the currently generated one (path->mnt->mnt_sb != newmnt->mnt.mnt_sb), and a new vfsmount will be generated. When mounting with the rw option for the second time, since no matching superblock can be found in the fs_supers list, a new superblock with the rw flag will be generated again. The superblock in use (ro) is different from the newly generated superblock (rw), and a new vfsmount will be generated again. When mounting with the ro option for the third time, the superblock (ro) is found in fs_supers, the superblock in use (rw) is different from the found superblock (ro), and a new vfsmount will be generated again. We can switch between ro/rw through remount, and only one superblock needs to be generated, thus avoiding the problem of repeated generation of vfsmount caused by switching superblocks. Furthermore, This can also resolve the issue described in the link. Fixes: 275a5d24bf56 ("NFS: Error when mounting the same filesystem with different options") Link: https://lore.kernel.org/all/20240604112636.236517-3-lilingfeng@huaweicloud.com/ Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
4 daysipmr: fix build with clang and DEBUG_NET disabled.Paolo Abeni2-2/+2
Sasha reported a build issue in ipmr:: net/ipv4/ipmr.c:320:13: error: function 'ipmr_can_free_table' is not \ needed and will not be emitted \ [-Werror,-Wunneeded-internal-declaration] 320 | static bool ipmr_can_free_table(struct net *net) Apparently clang is too smart with BUILD_BUG_ON_INVALID(), let's fallback to a plain WARN_ON_ONCE(). Reported-by: Sasha Levin <sashal@kernel.org> Closes: https://qa-reports.linaro.org/lkft/sashal-linus-next/build/v6.11-25635-g6813e2326f1e/testrun/26111580/suite/build/test/clang-nightly-lkftconfig/details/ Fixes: 11b6e701bce9 ("ipmr: add debug check for mr table cleanup") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Link: https://patch.msgid.link/ee75faa926b2446b8302ee5fc30e129d2df73b90.1732810228.git.pabeni@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
4 dayscifs: update internal version numberSteve French1-2/+2
To 2.52 Signed-off-by: Steve French <stfrench@microsoft.com>
4 dayscifs: unlock on error in smb3_reconfigure()Dan Carpenter1-1/+3
Unlock before returning if smb3_sync_session_ctx_passwords() fails. Fixes: 7e654ab7da03 ("cifs: during remount, make sure passwords are in sync") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Bharath SM <bharathsm@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
4 dayscifs: during remount, make sure passwords are in syncShyam Prasad N2-9/+75
This fixes scenarios where remount can overwrite the only currently working password, breaking reconnect. We recently introduced a password2 field in both ses and ctx structs. This was done so as to allow the client to rotate passwords for a mount without any downtime. However, when the client transparently handles password rotation, it can swap the values of the two password fields in the ses struct, but not in smb3_fs_context struct that hangs off cifs_sb. This can lead to a situation where a remount unintentionally overwrites a working password in the ses struct. In order to fix this, we first get the passwords in ctx struct in-sync with ses struct, before replacing them with what the passwords that could be passed as a part of remount. Also, in order to avoid race condition between smb2_reconnect and smb3_reconfigure, we make sure to lock session_mutex before changing password and password2 fields of the ses structure. Fixes: 35f834265e0d ("smb3: fix broken reconnect when password changing on the server by allowing password rotation") Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Meetakshi Setiya <msetiya@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
4 dayscifs: support mounting with alternate password to allow password rotationMeetakshi Setiya1-7/+50
Fixes the case for example where the password specified on mount is a recently expired password, but password2 is valid. Without this patch this mount scenario would fail. This patch introduces the following changes to support password rotation on mount: 1. If an existing session is not found and the new session setup results in EACCES, EKEYEXPIRED or EKEYREVOKED, swap password and password2 (if available), and retry the mount. 2. To match the new mount with an existing session, add conditions to check if a) password and password2 of the new mount and the existing session are the same, or b) password of the new mount is the same as the password2 of the existing session, and password2 of the new mount is the same as the password of the existing session. 3. If an existing session is found, but needs reconnect, retry the session setup after swapping password and password2 (if available), in case the previous attempt results in EACCES, EKEYEXPIRED or EKEYREVOKED. Cc: stable@vger.kernel.org Signed-off-by: Meetakshi Setiya <msetiya@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
4 daysdrm/xe: Take PM ref in delayed snapshot capture workerMatthew Brost1-0/+6
The delayed snapshot capture worker can access the GPU or VRAM both of which require a PM reference. Take a reference in this worker. Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Fixes: 4f04d07c0a94 ("drm/xe: Faster devcoredump") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241126174615.2665852-5-matthew.brost@intel.com (cherry picked from commit 1c6878af115a4586a40d6c14d530fa9f93e0bd83) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
4 daysdrm/xe/migrate: use XE_BO_FLAG_PAGETABLEMatthew Auld1-1/+2
On some HW we want to avoid the host caching PTEs, since access from GPU side can be incoherent. However here the special migrate object is mapping PTEs which are written from the host and potentially cached. Use XE_BO_FLAG_PAGETABLE to ensure that non-cached mapping is used, on platforms where this matters. Fixes: 7a060d786cc1 ("drm/xe/mtl: Map PPGTT as CPU:WC") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241126181259.159713-4-matthew.auld@intel.com (cherry picked from commit febc689b27d28973cd02f667548a5dca383d859a) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
4 daysdrm/xe/migrate: fix pat index usageMatthew Auld1-1/+2
XE_CACHE_WB must be converted into the per-platform pat index for that particular caching mode, otherwise we are just encoding whatever happens to be the value of that enum. Fixes: e8babb280b5e ("drm/xe: Convert multiple bind ops into single job") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Cc: <stable@vger.kernel.org> # v6.12+ Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241126181259.159713-3-matthew.auld@intel.com (cherry picked from commit f3dc9246f9c3cd5a7d8fd70cfd805bfc52214e2e) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
4 daysdrm/xe/guc_submit: fix race around suspend_pendingMatthew Auld1-2/+15
Currently in some testcases we can trigger: xe 0000:03:00.0: [drm] Assertion `exec_queue_destroyed(q)` failed! .... WARNING: CPU: 18 PID: 2640 at drivers/gpu/drm/xe/xe_guc_submit.c:1826 xe_guc_sched_done_handler+0xa54/0xef0 [xe] xe 0000:03:00.0: [drm] *ERROR* GT1: DEREGISTER_DONE: Unexpected engine state 0x00a1, guc_id=57 Looking at a snippet of corresponding ftrace for this GuC id we can see: 162.673311: xe_sched_msg_add: dev=0000:03:00.0, gt=1 guc_id=57, opcode=3 162.673317: xe_sched_msg_recv: dev=0000:03:00.0, gt=1 guc_id=57, opcode=3 162.673319: xe_exec_queue_scheduling_disable: dev=0000:03:00.0, 1:0x2, gt=1, width=1, guc_id=57, guc_state=0x29, flags=0x0 162.674089: xe_exec_queue_kill: dev=0000:03:00.0, 1:0x2, gt=1, width=1, guc_id=57, guc_state=0x29, flags=0x0 162.674108: xe_exec_queue_close: dev=0000:03:00.0, 1:0x2, gt=1, width=1, guc_id=57, guc_state=0xa9, flags=0x0 162.674488: xe_exec_queue_scheduling_done: dev=0000:03:00.0, 1:0x2, gt=1, width=1, guc_id=57, guc_state=0xa9, flags=0x0 162.678452: xe_exec_queue_deregister: dev=0000:03:00.0, 1:0x2, gt=1, width=1, guc_id=57, guc_state=0xa1, flags=0x0 It looks like we try to suspend the queue (opcode=3), setting suspend_pending and triggering a disable_scheduling. The user then closes the queue. However the close will also forcefully signal the suspend fence after killing the queue, later when the G2H response for disable_scheduling comes back we have now cleared suspend_pending when signalling the suspend fence, so the disable_scheduling now incorrectly tries to also deregister the queue. This leads to warnings since the queue has yet to even be marked for destruction. We also seem to trigger errors later with trying to double unregister the same queue. To fix this tweak the ordering when handling the response to ensure we don't race with a disable_scheduling that didn't actually intend to perform an unregister. The destruction path should now also correctly wait for any pending_disable before marking as destroyed. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/3371 Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241122161914.321263-6-matthew.auld@intel.com (cherry picked from commit f161809b362f027b6d72bd998e47f8f0bad60a2e) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
4 daysdrm/xe/guc_submit: fix race around pending_disableMatthew Auld1-7/+10
Currently in some testcases we can trigger: [drm] *ERROR* GT0: SCHED_DONE: Unexpected engine state 0x02b1, guc_id=8, runnable_state=0 [drm] *ERROR* GT0: G2H action 0x1002 failed (-EPROTO) len 3 msg 02 10 00 90 08 00 00 00 00 00 00 00 Looking at a snippet of corresponding ftrace for this GuC id we can see: 498.852891: xe_sched_msg_add: dev=0000:03:00.0, gt=0 guc_id=8, opcode=3 498.854083: xe_sched_msg_recv: dev=0000:03:00.0, gt=0 guc_id=8, opcode=3 498.855389: xe_exec_queue_kill: dev=0000:03:00.0, 5:0x1, gt=0, width=1, guc_id=8, guc_state=0x3, flags=0x0 498.855436: xe_exec_queue_lr_cleanup: dev=0000:03:00.0, 5:0x1, gt=0, width=1, guc_id=8, guc_state=0x83, flags=0x0 498.856767: xe_exec_queue_close: dev=0000:03:00.0, 5:0x1, gt=0, width=1, guc_id=8, guc_state=0x83, flags=0x0 498.862889: xe_exec_queue_scheduling_disable: dev=0000:03:00.0, 5:0x1, gt=0, width=1, guc_id=8, guc_state=0xa9, flags=0x0 498.863032: xe_exec_queue_scheduling_disable: dev=0000:03:00.0, 5:0x1, gt=0, width=1, guc_id=8, guc_state=0x2b9, flags=0x0 498.875596: xe_exec_queue_scheduling_done: dev=0000:03:00.0, 5:0x1, gt=0, width=1, guc_id=8, guc_state=0x2b9, flags=0x0 498.875604: xe_exec_queue_deregister: dev=0000:03:00.0, 5:0x1, gt=0, width=1, guc_id=8, guc_state=0x2b1, flags=0x0 499.074483: xe_exec_queue_deregister_done: dev=0000:03:00.0, 5:0x1, gt=0, width=1, guc_id=8, guc_state=0x2b1, flags=0x0 This looks to be the two scheduling_disable racing with each other, one from the suspend (opcode=3) and then again during lr cleanup. While those two operations are serialized, the G2H portion is not, therefore when marking the queue as pending_disabled and then firing off the first request, we proceed do the same again, however the first disable response only fires after this which then clears the pending_disabled. At this point the second comes back and is processed, however the pending_disabled is no longer set, hence triggering the warning. To fix this wait for pending_disabled when doing the lr cleanup and calling disable_scheduling_deregister. Also do the same for all other disable_scheduling callers. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/3515 Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Reviewed-by: Matthew Brost <mattheq.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241122161914.321263-5-matthew.auld@intel.com (cherry picked from commit ddb106d2120a0bf1c5ff87c71d059d193814da41) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
4 daysdrm/xe: Update xe2_graphics name stringMatt Roper1-1/+1
Since both Xe2 and Xe3 platforms currently use the same set of graphics IP feature flags, we associate the "graphics_xe2" structure with both IPs. Update the name string on that IP structure to clarify this and avoid confusion as Xe3 platforms start going into public CI. Fixes: 800d75bf20ae ("drm/xe/xe3: Define Xe3 feature flags") Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241125194838.1190599-2-matthew.d.roper@intel.com (cherry picked from commit 4fe70f664a105391321c85b2af241001e8118d24) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
4 daysALSA: hda: improve bass speaker support for ASUS Zenbook UM5606WAJaroslav Kysela1-0/+16
This hardware has ALC294 codec with speaker NID 0x17 and bass speaker NID 0x15. This patch removes DAC NID 0x06 (without volume control) from the connection list for bass speaker NID 0x15. Both speaker PINs are routed to DAC NID 0x03 with this change. Link: https://github.com/alsa-project/alsa-ucm-conf/issues/467 Signed-off-by: Jaroslav Kysela <perex@perex.cz> Link: https://patch.msgid.link/20241128112145.3409492-1-perex@perex.cz Signed-off-by: Takashi Iwai <tiwai@suse.de>
4 dayss390/spinlock: Use flag output constraint for arch_cmpxchg_niai8()Heiko Carstens1-2/+22
Add a new variant of arch_cmpxchg_niai8() which makes use of the flag output constraint, which allows the compiler to generate slightly better code. Also rename arch_cmpxchg_niai8() to arch_try_cmpxchg_niai8() which reflects the purpose of the function and makes it consistent with other "try" variants. Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
4 dayss390/spinlock: Use R constraint for arch_load_niai4()Heiko Carstens1-1/+1
The load instruction used within arch_load_niai4() has a short displacement and index register. Therefore use the R constraint to reflect this. The used Q constraint does consider an index register. Signed-off-by: Heiko Carstens <hca@linux.ibm.com>