| Commit message (Collapse) | Author | Files | Lines |
|
The section 4.5.2 of the RISC-V AIA specification says that "any write
to a sourcecfg register of an APLIC might (or might not) cause the
corresponding interrupt-pending bit to be set to one if the rectified
input value is high (= 1) under the new source mode."
When the interrupt type is changed in the sourcecfg register, the APLIC
device might not set the corresponding pending bit, so the interrupt might
never become pending.
To handle sourcecfg register changes for level-triggered interrupts in MSI
mode, manually set the pending bit for retriggering interrupt so it gets
retriggered if it was already asserted.
Fixes: ca8df97fe679 ("irqchip/riscv-aplic: Add support for MSI-mode")
Signed-off-by: Yong-Xuan Wang <yongxuan.wang@sifive.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Vincent Chen <vincent.chen@sifive.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20240809071049.2454-1-yongxuan.wang@sifive.com
|
|
The device tree property 'xlnx,kind-of-intr' is sanity checked that the
bitmask contains only set bits which are in the range of the number of
interrupts supported by the controller.
The check is done by shifting the mask right by the number of supported
interrupts and checking the result for zero.
The data type of the mask is u32 and the number of supported interrupts is
up to 32. In case of 32 interrupts the shift is out of bounds, resulting in
a mismatch warning. The out of bounds condition is also reported by UBSAN:
UBSAN: shift-out-of-bounds in irq-xilinx-intc.c:332:22
shift exponent 32 is too large for 32-bit type 'unsigned int'
Fix it by promoting the mask to u64 for the test.
Fixes: d50466c90724 ("microblaze: intc: Refactor DT sanity check")
Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/1723186944-3571957-1-git-send-email-radhey.shyam.pandey@amd.com
|
|
bcachefs_metadata_version_disk_accounting_v2 erroneously had padding
bytes in disk_accounting_key, which is a problem because we have to
guarantee that all unused bytes in disk_accounting_key are zeroed.
Fortunately 6.11 isn't out yet, so it's cheap to fix this by spinning a
new version.
Reported-by: Gabriel de Perthuis <g2p.code@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Add a line for capacity
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Implement bch2_accounting_invalid(); check for junk at the end, and
replicas accounting entries in particular need to be checked or we'll
pop asserts later.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
The recursive aes-arm-bs module load situation reported by Russell King
is getting fixed in the crypto layer, but this in the meantime fixes the
"recursive load hangs forever" by just making the waiting for the first
module load be interruptible.
This should now match the old behavior before commit 9b9879fc0327
("modules: catch concurrent module loads, treat them as idempotent"),
which used the different "wait for module to be ready" code in
module_patient_check_exists().
End result: a recursive module load will still block, but now a signal
will interrupt it and fail the second module load, at which point the
first module will successfully complete loading.
Fixes: 9b9879fc0327 ("modules: catch concurrent module loads, treat them as idempotent")
Cc: Russell King <linux@armlinux.org.uk>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Replace the always-true check tcon->origin_fullpath with
check of server->leaf_fullpath
See https://bugzilla.kernel.org/show_bug.cgi?id=219083
The check of the new @tcon will always be true during mounting,
since @tcon->origin_fullpath will only be set after the tree is
connected to the latest common resource, as well as checking if
the prefix paths from it are fully accessible.
Fixes: 3ae872de4107 ("smb: client: fix shared DFS root mounts with different prefixes")
Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
Signed-off-by: Gleb Korobeynikov <gkorobeynikov@astralinux.ru>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Device tuning files made with early revision tooling may contain
configuration that can unmask IRQ signals that are owned by the host.
Adding a safe default to the regmap patch ensures that the hardware
matches the driver expectations.
Signed-off-by: Simon Trimmer <simont@opensource.cirrus.com>
Link: https://patch.msgid.link/20240807142648.46932-1-simont@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
With PREEMPT_RT enabled a spinlock_t becomes a sleeping lock.
This is usually not a problem with spinlocks used in IRQ context since
IRQ handlers get threaded. However, if IRQF_ONESHOT is set, the primary
handler won't be force-threaded and runs always in hardirq context. This is
a problem because spinlock_t requires a preemptible context on PREEMPT_RT.
In this particular instance, regmap mmio uses spinlock_t to protect the
register access and IRQF_ONESHOT is set on the IRQ. In this case, it is
actually better to do everything in threaded handler and it solves the
problem with PREEMPT_RT.
Reported-by: Arseniy Krasnov <avkrasnov@salutedevices.com>
Closes: https://lore.kernel.org/linux-amlogic/20240729131652.3012327-1-avkrasnov@salutedevices.com
Suggested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Fixes: b11d26660dff ("ASoC: meson: axg-fifo: use threaded irq to check periods")
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/20240807162705.4024136-1-jbrunet@baylibre.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Now that most kernel work on sound has moved over to the linux-sound
mailing list so should the Cirrus Logic audio parts.
Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Link: https://patch.msgid.link/20240807140140.421359-1-ckeepax@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
The reset GPIO of WCD9390/WCD9395 is active low and that's how it is
routed on typical boards, so correct the example DTS to use expected
polarity, instead of IRQ flag (which is a logical mistake on its own).
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://patch.msgid.link/20240806114931.40090-4-krzysztof.kozlowski@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
The reset GPIO of WCD9380/WCD9385 is active low and that's how it is
routed on typical boards, so correct the example DTS to use expected
polarity.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://patch.msgid.link/20240806114931.40090-3-krzysztof.kozlowski@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
The reset GPIO of WCD9340/WCD9341 is active low and that's how it is
routed on typical boards, so correct the example DTS to use expected
polarity.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://patch.msgid.link/20240806114931.40090-2-krzysztof.kozlowski@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
The reset GPIO of WCD9370/WCD9375 is active low and that's how it is
routed on typical boards, so correct the example DTS to use expected
polarity.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://patch.msgid.link/20240806114931.40090-1-krzysztof.kozlowski@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Fix the missing mic on OMEN by HP Gaming Laptop 16-n0xxx by adding the
quirk entry with the board ID 8A44.
Cc: stable@vger.kernel.org
Link: https://bugzilla.suse.com/show_bug.cgi?id=1227182
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: https://patch.msgid.link/20240807170249.16490-1-tiwai@suse.de
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
We find that we need to set snd_jack_types to 0. If not,
there will be a probability of button detection errors
Signed-off-by: Zhang Yi <zhangyi@everest-semi.com>
Link: https://patch.msgid.link/20240807025356.24904-2-zhangyi@everest-semi.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Lenovo Thinkpad E14 Gen 6 (model type 21M3)
needs a quirk entry for internal mic to work.
Signed-off-by: Krzysztof Stępniak <kfs.szk@gmail.com>
Link: https://patch.msgid.link/20240807001219.1147-1-kfs.szk@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Russell King reported that the arm cbc(aes) crypto module hangs when
loaded, and Herbert Xu bisected it to commit 9b9879fc0327 ("modules:
catch concurrent module loads, treat them as idempotent"), and noted:
"So what's happening here is that the first modprobe tries to load a
fallback CBC implementation, in doing so it triggers a load of the
exact same module due to module aliases.
IOW we're loading aes-arm-bs which provides cbc(aes). However, this
needs a fallback of cbc(aes) to operate, which is made out of the
generic cbc module + any implementation of aes, or ecb(aes). The
latter happens to also be provided by aes-arm-cb so that's why it
tries to load the same module again"
So loading the aes-arm-bs module ends up wanting to recursively load
itself, and the recursive load then ends up waiting for the original
module load to complete.
This is a regression, in that it used to be that we just tried to load
the module multiple times, and then as we went on to install it the
second time we would instead just error out because the module name
already existed.
That is actually also exactly what the original "catch concurrent loads"
patch did in commit 9828ed3f695a ("module: error out early on concurrent
load of the same module file"), but it turns out that it ends up being
racy, in that erroring out before the module has been fully initialized
will cause failures in dependent module loading.
See commit ac2263b588df (which was the revert of that "error out early")
commit for details about why erroring out before the module has been
initialized is actually fundamentally racy.
Now, for the actual recursive module load (as opposed to just
concurrently loading the same module twice), the race is not an issue.
At the same time it's hard for the kernel to see that this is recursion,
because the module load is always done from a usermode helper, so the
recursion is not some simple callchain within the kernel.
End result: this is not the real fix, but this at least adds a warning
for the situation (admittedly much too late for all the debugging pain
that Russell and Herbert went through) and if we can come to a
resolution on how to detect the recursion properly, this re-organizes
the code to make that easier.
Link: https://lore.kernel.org/all/ZrFHLqvFqhzykuYw@shell.armlinux.org.uk/
Reported-by: Russell King <linux@armlinux.org.uk>
Debugged-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
.set_acl() requires a dentry, and if one isn't passed it marks the VFS
inode as not having an ACL.
This has been causing inodes with ACLs to have them "disappear" on
bcachefs filesystem, depending on which path those inodes get pulled
into the cache from.
Switching to .get_inode_acl(), like other local filesystems, fixes this.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Kernel BO's don't take a ref to the VM, we need the VM for the
delayed snapshot, so take a ref to the VM in delayed snapshot.
v2:
- Check for lrc_bo before taking a VM ref (CI)
- Check lrc_bo->vm before taking / dropping a VM ref (CI)
- Drop VM in xe_lrc_snapshot_free
v5:
- Fix commit message wording (Johnathan)
Fixes: 47058633d9c5 ("drm/xe: Move lrc snapshot capturing to xe_lrc.c")
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240801154118.2547543-2-matthew.brost@intel.com
(cherry picked from commit c3bc97d2f102ddd5a8341eeb2dbae2a3e98bb46a)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
In xe_hwmon_power_max_write, for PL1 disable supported case, instead of
returning after PL1 disable, PL1 enable path was also being run.
Fixed it by returning after disable.
v2: Correct typo and grammar in commit message. (Jonathan)
Signed-off-by: Karthik Poosa <karthik.poosa@intel.com>
Fixes: fef6dd12b45a ("drm/xe/hwmon: Protect hwmon rw attributes with hwmon_lock")
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240801112424.1841766-1-karthik.poosa@intel.com
(cherry picked from commit 146458645e505f5eac498759bcd865cf7c0dfd9a)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
A chain fence is uninitialized if not installed in a drm sync obj. Thus
if xe_sync_entry_cleanup is called and sync->chain_fence is non-NULL the
proper cleanup is dma_fence_chain_free rather than a dma-fence put.
Reported-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2411
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2261
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240727012216.2118276-1-matthew.brost@intel.com
(cherry picked from commit 7f7a2da3bf8bc0e0f6c239af495b7050056e889c)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
Gustavo noticed an odd "+ 2" in rtp_mark_active() while processing
rtp rules and pointed that it should be "+ 1". In fact, while processing
entries without actions (OOB workarounds), if the WA is activated and
has OR rules, it will also inadvertently activate the very next
workaround.
Test in a LNL B0 platform by moving 18024947630 on top of 16020292621,
makes the latter become active:
$ cat /sys/kernel/debug/dri/0/gt0/workarounds
...
OOB Workarounds
18024947630
16020292621
14018094691
16022287689
13011645652
22019338487_display
In future a kunit test will be added to cover the rtp checks for entries
without actions.
Fixes: fe19328b900c ("drm/xe/rtp: Add support for entries with no action")
Cc: Gustavo Sousa <gustavo.sousa@intel.com>
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240726064337.797576-6-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit fd6797ec50c561f085bc94e3ee26f484a52af79e)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
The CPU enabled mask instead of the CPU possible mask should be used
by set_cpu_enabled(). Otherwise, we run into crash due to write to
the read-only CPU possible mask when vCPU is hot added on ARM64.
(qemu) device_add host-arm-cpu,id=cpu1,socket-id=1
Unable to handle kernel write to read-only memory at virtual address ffff800080fa7190
:
Call trace:
register_cpu+0x1a4/0x2e8
arch_register_cpu+0x84/0xd8
acpi_processor_add+0x480/0x5b0
acpi_bus_attach+0x1c4/0x300
acpi_dev_for_one_check+0x3c/0x50
device_for_each_child+0x68/0xc8
acpi_dev_for_each_child+0x48/0x80
acpi_bus_attach+0x84/0x300
acpi_bus_scan+0x74/0x220
acpi_scan_rescan_bus+0x54/0x88
acpi_device_hotplug+0x208/0x478
acpi_hotplug_work_fn+0x2c/0x50
process_one_work+0x15c/0x3c0
worker_thread+0x2ec/0x400
kthread+0x120/0x130
ret_from_fork+0x10/0x20
Fix it by passing the CPU enabled mask instead of the CPU possible
mask to set_cpu_enabled().
Fixes: 51c4767503d5 ("Merge tag 'bitmap-6.11-rc1' of https://github.com:/norov/linux")
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
|
|
Fixed typos in various files under fs/smb/client/
Signed-off-by: Xiaxi Shen <shenxiaxi26@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Setting encryption as required in security flags was broken.
For example (to require all mounts to be encrypted by setting):
"echo 0x400c5 > /proc/fs/cifs/SecurityFlags"
Would return "Invalid argument" and log "Unsupported security flags"
This patch fixes that (e.g. allowing overriding the default for
SecurityFlags 0x00c5, including 0x40000 to require seal, ie
SMB3.1.1 encryption) so now that works and forces encryption
on subsequent mounts.
Acked-by: Bharath SM <bharathsm@microsoft.com>
Cc: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
As noted in the device errata [1-8], EEE support is not fully operational
in the KSZ8567, KSZ9477, KSZ9567, KSZ9896, and KSZ9897 devices, causing
link drops when connected to another device that supports EEE. The patch
series "net: add EEE support for KSZ9477 switch family" merged in commit
9b0bf4f77162 caused EEE support to be enabled in these devices. A fix for
this regression for the KSZ9477 alone was merged in commit 08c6d8bae48c2.
This patch extends this fix to the other affected devices.
[1] https://ww1.microchip.com/downloads/aemDocuments/documents/UNG/ProductDocuments/Errata/KSZ8567R-Errata-DS80000752.pdf
[2] https://ww1.microchip.com/downloads/aemDocuments/documents/UNG/ProductDocuments/Errata/KSZ8567S-Errata-DS80000753.pdf
[3] https://ww1.microchip.com/downloads/aemDocuments/documents/UNG/ProductDocuments/Errata/KSZ9477S-Errata-DS80000754.pdf
[4] https://ww1.microchip.com/downloads/aemDocuments/documents/UNG/ProductDocuments/Errata/KSZ9567R-Errata-DS80000755.pdf
[5] https://ww1.microchip.com/downloads/aemDocuments/documents/UNG/ProductDocuments/Errata/KSZ9567S-Errata-DS80000756.pdf
[6] https://ww1.microchip.com/downloads/aemDocuments/documents/UNG/ProductDocuments/Errata/KSZ9896C-Errata-DS80000757.pdf
[7] https://ww1.microchip.com/downloads/aemDocuments/documents/UNG/ProductDocuments/Errata/KSZ9897R-Errata-DS80000758.pdf
[8] https://ww1.microchip.com/downloads/aemDocuments/documents/UNG/ProductDocuments/Errata/KSZ9897S-Errata-DS80000759.pdf
Fixes: 69d3b36ca045 ("net: dsa: microchip: enable EEE support") # for KSZ8567/KSZ9567/KSZ9896/KSZ9897
Link: https://lore.kernel.org/netdev/137ce1ee-0b68-4c96-a717-c8164b514eec@martin-whitaker.me.uk/
Signed-off-by: Martin Whitaker <foss@martin-whitaker.me.uk>
Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Lukasz Majewski <lukma@denx.de>
Link: https://patch.msgid.link/20240807205209.21464-1-foss@martin-whitaker.me.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The 'at least one change' requirement is not applicable for context
creation, skip the check in such case.
This allows a command such as 'ethtool -X eth0 context new' to work.
The command works by mistake when using older versions of userspace
ethtool due to an incompatibility issue where rxfh.input_xfrm is passed
as zero (unset) instead of RXH_XFRM_NO_CHANGE as done with recent
userspace. This patch does not try to solve the incompatibility issue.
Link: https://lore.kernel.org/netdev/05ae8316-d3aa-4356-98c6-55ed4253c8a7@nvidia.com/
Fixes: 84a1d9c48200 ("net: ethtool: extend RXNFC API to support RSS spreading of filter matches")
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://patch.msgid.link/20240807173352.3501746-1-gal@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Both ethtool_ops.rxfh_max_context_id and the default value used when
it's not specified are supposed to be exclusive maxima (the former
is documented as such; the latter, U32_MAX, cannot be used as an ID
since it equals ETH_RXFH_CONTEXT_ALLOC), but xa_alloc() expects an
inclusive maximum.
Subtract one from 'limit' to produce an inclusive maximum, and pass
that to xa_alloc().
Increase bnxt's max by one to prevent a (very minor) regression, as
BNXT_MAX_ETH_RSS_CTX is an inclusive max. This is safe since bnxt
is not actually hard-limited; BNXT_MAX_ETH_RSS_CTX is just a
leftover from old driver code that managed context IDs itself.
Rename rxfh_max_context_id to rxfh_max_num_contexts to make its
semantics (hopefully) more obvious.
Fixes: 847a8ab18676 ("net: ethtool: let the core choose RSS context IDs")
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://patch.msgid.link/5a2d11a599aa5b0cc6141072c01accfb7758650c.1723045898.git.ecree.xilinx@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Using FIELD_GET() fails in configurations that don't already include
the header file indirectly:
drivers/net/pse-pd/tps23881.c: In function 'tps23881_i2c_probe':
drivers/net/pse-pd/tps23881.c:755:13: error: implicit declaration of function 'FIELD_GET' [-Wimplicit-function-declaration]
755 | if (FIELD_GET(TPS23881_REG_DEVID_MASK, ret) != TPS23881_DEVICE_ID) {
| ^~~~~~~~~
Fixes: 89108cb5c285 ("net: pse-pd: tps23881: Fix the device ID check")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>
Link: https://patch.msgid.link/20240807075455.2055224-1-arnd@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
PPS was not stopped in `fec_ptp_stop()`, called when
the adapter was removed. Consequentially, you couldn't
safely reload the driver with the PPS signal on.
Fixes: 32cba57ba74b ("net: fec: introduce fec_ptp_stop and use in probe fail path")
Reviewed-by: Fabio Estevam <festevam@gmail.com>
Link: https://lore.kernel.org/netdev/CAOMZO5BzcZR8PwKKwBssQq_wAGzVgf1ffwe_nhpQJjviTdxy-w@mail.gmail.com/T/#m01dcb810bfc451a492140f6797ca77443d0cb79f
Signed-off-by: Csókás, Bence <csokas.bence@prolan.hu>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20240807080956.2556602-1-csokas.bence@prolan.hu
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Some Wake-on-LAN modes such as WAKE_FILTER may only be supported by the MAC,
while others might be only supported by the PHY. Make sure that the .get_wol()
returns the union of both rather than only that of the PHY if the PHY supports
Wake-on-LAN.
Fixes: 7e400ff35cbe ("net: bcmgenet: Add support for PHY-based Wake-on-LAN")
Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20240806175659.3232204-1-florian.fainelli@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When l2tp tunnels use a socket provided by userspace, we can hit
lockdep splats like the below when data is transmitted through another
(unrelated) userspace socket which then gets routed over l2tp.
This issue was previously discussed here:
https://lore.kernel.org/netdev/87sfialu2n.fsf@cloudflare.com/
The solution is to have lockdep treat socket locks of l2tp tunnel
sockets separately than those of standard INET sockets. To do so, use
a different lockdep subclass where lock nesting is possible.
============================================
WARNING: possible recursive locking detected
6.10.0+ #34 Not tainted
--------------------------------------------
iperf3/771 is trying to acquire lock:
ffff8881027601d8 (slock-AF_INET/1){+.-.}-{2:2}, at: l2tp_xmit_skb+0x243/0x9d0
but task is already holding lock:
ffff888102650d98 (slock-AF_INET/1){+.-.}-{2:2}, at: tcp_v4_rcv+0x1848/0x1e10
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(slock-AF_INET/1);
lock(slock-AF_INET/1);
*** DEADLOCK ***
May be due to missing lock nesting notation
10 locks held by iperf3/771:
#0: ffff888102650258 (sk_lock-AF_INET){+.+.}-{0:0}, at: tcp_sendmsg+0x1a/0x40
#1: ffffffff822ac220 (rcu_read_lock){....}-{1:2}, at: __ip_queue_xmit+0x4b/0xbc0
#2: ffffffff822ac220 (rcu_read_lock){....}-{1:2}, at: ip_finish_output2+0x17a/0x1130
#3: ffffffff822ac220 (rcu_read_lock){....}-{1:2}, at: process_backlog+0x28b/0x9f0
#4: ffffffff822ac220 (rcu_read_lock){....}-{1:2}, at: ip_local_deliver_finish+0xf9/0x260
#5: ffff888102650d98 (slock-AF_INET/1){+.-.}-{2:2}, at: tcp_v4_rcv+0x1848/0x1e10
#6: ffffffff822ac220 (rcu_read_lock){....}-{1:2}, at: __ip_queue_xmit+0x4b/0xbc0
#7: ffffffff822ac220 (rcu_read_lock){....}-{1:2}, at: ip_finish_output2+0x17a/0x1130
#8: ffffffff822ac1e0 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0xcc/0x1450
#9: ffff888101f33258 (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock#2){+...}-{2:2}, at: __dev_queue_xmit+0x513/0x1450
stack backtrace:
CPU: 2 UID: 0 PID: 771 Comm: iperf3 Not tainted 6.10.0+ #34
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
Call Trace:
<IRQ>
dump_stack_lvl+0x69/0xa0
dump_stack+0xc/0x20
__lock_acquire+0x135d/0x2600
? srso_alias_return_thunk+0x5/0xfbef5
lock_acquire+0xc4/0x2a0
? l2tp_xmit_skb+0x243/0x9d0
? __skb_checksum+0xa3/0x540
_raw_spin_lock_nested+0x35/0x50
? l2tp_xmit_skb+0x243/0x9d0
l2tp_xmit_skb+0x243/0x9d0
l2tp_eth_dev_xmit+0x3c/0xc0
dev_hard_start_xmit+0x11e/0x420
sch_direct_xmit+0xc3/0x640
__dev_queue_xmit+0x61c/0x1450
? ip_finish_output2+0xf4c/0x1130
ip_finish_output2+0x6b6/0x1130
? srso_alias_return_thunk+0x5/0xfbef5
? __ip_finish_output+0x217/0x380
? srso_alias_return_thunk+0x5/0xfbef5
__ip_finish_output+0x217/0x380
ip_output+0x99/0x120
__ip_queue_xmit+0xae4/0xbc0
? srso_alias_return_thunk+0x5/0xfbef5
? srso_alias_return_thunk+0x5/0xfbef5
? tcp_options_write.constprop.0+0xcb/0x3e0
ip_queue_xmit+0x34/0x40
__tcp_transmit_skb+0x1625/0x1890
__tcp_send_ack+0x1b8/0x340
tcp_send_ack+0x23/0x30
__tcp_ack_snd_check+0xa8/0x530
? srso_alias_return_thunk+0x5/0xfbef5
tcp_rcv_established+0x412/0xd70
tcp_v4_do_rcv+0x299/0x420
tcp_v4_rcv+0x1991/0x1e10
ip_protocol_deliver_rcu+0x50/0x220
ip_local_deliver_finish+0x158/0x260
ip_local_deliver+0xc8/0xe0
ip_rcv+0xe5/0x1d0
? __pfx_ip_rcv+0x10/0x10
__netif_receive_skb_one_core+0xce/0xe0
? process_backlog+0x28b/0x9f0
__netif_receive_skb+0x34/0xd0
? process_backlog+0x28b/0x9f0
process_backlog+0x2cb/0x9f0
__napi_poll.constprop.0+0x61/0x280
net_rx_action+0x332/0x670
? srso_alias_return_thunk+0x5/0xfbef5
? find_held_lock+0x2b/0x80
? srso_alias_return_thunk+0x5/0xfbef5
? srso_alias_return_thunk+0x5/0xfbef5
handle_softirqs+0xda/0x480
? __dev_queue_xmit+0xa2c/0x1450
do_softirq+0xa1/0xd0
</IRQ>
<TASK>
__local_bh_enable_ip+0xc8/0xe0
? __dev_queue_xmit+0xa2c/0x1450
__dev_queue_xmit+0xa48/0x1450
? ip_finish_output2+0xf4c/0x1130
ip_finish_output2+0x6b6/0x1130
? srso_alias_return_thunk+0x5/0xfbef5
? __ip_finish_output+0x217/0x380
? srso_alias_return_thunk+0x5/0xfbef5
__ip_finish_output+0x217/0x380
ip_output+0x99/0x120
__ip_queue_xmit+0xae4/0xbc0
? srso_alias_return_thunk+0x5/0xfbef5
? srso_alias_return_thunk+0x5/0xfbef5
? tcp_options_write.constprop.0+0xcb/0x3e0
ip_queue_xmit+0x34/0x40
__tcp_transmit_skb+0x1625/0x1890
tcp_write_xmit+0x766/0x2fb0
? __entry_text_end+0x102ba9/0x102bad
? srso_alias_return_thunk+0x5/0xfbef5
? __might_fault+0x74/0xc0
? srso_alias_return_thunk+0x5/0xfbef5
__tcp_push_pending_frames+0x56/0x190
tcp_push+0x117/0x310
tcp_sendmsg_locked+0x14c1/0x1740
tcp_sendmsg+0x28/0x40
inet_sendmsg+0x5d/0x90
sock_write_iter+0x242/0x2b0
vfs_write+0x68d/0x800
? __pfx_sock_write_iter+0x10/0x10
ksys_write+0xc8/0xf0
__x64_sys_write+0x3d/0x50
x64_sys_call+0xfaf/0x1f50
do_syscall_64+0x6d/0x140
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7f4d143af992
Code: c3 8b 07 85 c0 75 24 49 89 fb 48 89 f0 48 89 d7 48 89 ce 4c 89 c2 4d 89 ca 4c 8b 44 24 08 4c 8b 4c 24 10 4c 89 5c 24 08 0f 05 <c3> e9 01 cc ff ff 41 54 b8 02 00 00 0
RSP: 002b:00007ffd65032058 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f4d143af992
RDX: 0000000000000025 RSI: 00007f4d143f3bcc RDI: 0000000000000005
RBP: 00007f4d143f2b28 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f4d143f3bcc
R13: 0000000000000005 R14: 0000000000000000 R15: 00007ffd650323f0
</TASK>
Fixes: 0b2c59720e65 ("l2tp: close all race conditions in l2tp_tunnel_register()")
Suggested-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot+6acef9e0a4d1f46c83d4@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=6acef9e0a4d1f46c83d4
CC: gnault@redhat.com
CC: cong.wang@bytedance.com
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: Tom Parkin <tparkin@katalix.com>
Link: https://patch.msgid.link/20240806160626.1248317-1-jchapman@katalix.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
dwmac4 was decoding the duplex mode from the GMAC_PHYIF_CONTROL_STATUS
register incorrectly, using GMAC_PHYIF_CTRLSTATUS_LNKMOD_MASK (value 1)
rather than GMAC_PHYIF_CTRLSTATUS_LNKMOD (bit 16). Fix this.
Fixes: 70523e639bf8c ("drivers: net: stmmac: reworking the PCS code.")
Reviewed-by: Andrew Halaney <ahalaney@redhat.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1sbJvd-001rGD-E3@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
MTRRs have an obsolete fixed variant for fine grained caching control
of the 640K-1MB region that uses separate MSRs. This fixed variant has
a separate capability bit in the MTRR capability MSR.
So far all x86 CPUs which support MTRR have this separate bit set, so it
went unnoticed that mtrr_save_state() does not check the capability bit
before accessing the fixed MTRR MSRs.
Though on a CPU that does not support the fixed MTRR capability this
results in a #GP. The #GP itself is harmless because the RDMSR fault is
handled gracefully, but results in a WARN_ON().
Add the missing capability check to prevent this.
Fixes: 2b1f6278d77c ("[PATCH] x86: Save the MTRRs of the BSP before booting an AP")
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20240808000244.946864-1-ak@linux.intel.com
|
|
At the code refactoring of USB-audio quirk handling, I assumed that
the quirk entries of Stanton ScratchAmp devices were only about the
device name, and moved them completely into the rename table.
But it seems that the device requires the quirk entry so that it's
probed by the driver itself.
This re-adds back the quirk entries of ScratchAmp, but in a
minimalistic manner.
Fixes: 5436f59bc5bc ("ALSA: usb-audio: Move device rename and profile quirks to an internal table")
Link: https://patch.msgid.link/20240808081803.22300-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
Ensure the test has the same name in the code as it has in the docs.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
|
|
The second tagged commit started sometimes (very rarely, but possible)
throwing WARNs from
net/core/page_pool.c:page_pool_disable_direct_recycling().
Turned out idpf frees interrupt vectors with embedded NAPIs *before*
freeing the queues making page_pools' NAPI pointers lead to freed
memory before these pools are destroyed by libeth.
It's not clear whether there are other accesses to the freed vectors
when destroying the queues, but anyway, we usually free queue/interrupt
vectors only when the queues are destroyed and the NAPIs are guaranteed
to not be referenced anywhere.
Invert the allocation and freeing logic making queue/interrupt vectors
be allocated first and freed last. Vectors don't require queues to be
present, so this is safe. Additionally, this change allows to remove
that useless queue->q_vector pointer cleanup, as vectors are still
valid when freeing the queues (+ both are freed within one function,
so it's not clear why nullify the pointers at all).
Fixes: 1c325aac10a8 ("idpf: configure resources for TX queues")
Fixes: 90912f9f4f2d ("idpf: convert header split mode to libeth + napi_build_skb()")
Reported-by: Michal Kubiak <michal.kubiak@intel.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://patch.msgid.link/20240806220923.3359860-4-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The initialization of vport interrupt consists of two functions:
1) idpf_vport_intr_init() where a generic configuration is done
2) idpf_vport_intr_req_irq() where the irq for each q_vector is
requested.
The first function used to create a base name for each interrupt using
"kasprintf()" call. Unfortunately, although that call allocated memory
for a text buffer, that memory was never released.
Fix this by removing creating the interrupt base name in 1).
Instead, always create a full interrupt name in the function 2), because
there is no need to create a base name separately, considering that the
function 2) is never called out of idpf_vport_intr_init() context.
Fixes: d4d558718266 ("idpf: initialize interrupts and enable vport")
Cc: stable@vger.kernel.org # 6.7
Signed-off-by: Michal Kubiak <michal.kubiak@intel.com>
Reviewed-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://patch.msgid.link/20240806220923.3359860-3-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The second tagged commit introduced a UAF, as it removed restoring
q_vector->vport pointers after reinitializating the structures.
This is due to that all queue allocation functions are performed here
with the new temporary vport structure and those functions rewrite
the backpointers to the vport. Then, this new struct is freed and
the pointers start leading to nowhere.
But generally speaking, the current logic is very fragile. It claims
to be more reliable when the system is low on memory, but in fact, it
consumes two times more memory as at the moment of running this
function, there are two vports allocated with their queues and vectors.
Moreover, it claims to prevent the driver from running into "bad state",
but in fact, any error during the rebuild leaves the old vport in the
partially allocated state.
Finally, if the interface is down when the function is called, it always
allocates a new queue set, but when the user decides to enable the
interface later on, vport_open() allocates them once again, IOW there's
a clear memory leak here.
Just don't allocate a new queue set when performing a reset, that solves
crashes and memory leaks. Readd the old queue number and reopen the
interface on rollback - that solves limbo states when the device is left
disabled and/or without HW queues enabled.
Fixes: 02cbfba1add5 ("idpf: add ethtool callbacks")
Fixes: e4891e4687c8 ("idpf: split &idpf_queue into 4 strictly-typed queue structures")
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://patch.msgid.link/20240806220923.3359860-2-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
A recent commit has modified the code in __bnxt_reserve_rings() to
set the default RSS indirection table to default only when the number
of RX rings is changing. While this works for newer firmware that
requires RX ring reservations, it causes the regression on older
firmware not requiring RX ring resrvations (BNXT_NEW_RM() returns
false).
With older firmware, RX ring reservations are not required and so
hw_resc->resv_rx_rings is not always set to the proper value. The
comparison:
if (old_rx_rings != bp->hw_resc.resv_rx_rings)
in __bnxt_reserve_rings() may be false even when the RX rings are
changing. This will cause __bnxt_reserve_rings() to skip setting
the default RSS indirection table to default to match the current
number of RX rings. This may later cause bnxt_fill_hw_rss_tbl() to
use an out-of-range index.
We already have bnxt_check_rss_tbl_no_rmgr() to handle exactly this
scenario. We just need to move it up in bnxt_need_reserve_rings()
to be called unconditionally when using older firmware. Without the
fix, if the TX rings are changing, we'll skip the
bnxt_check_rss_tbl_no_rmgr() call and __bnxt_reserve_rings() may also
skip the bnxt_set_dflt_rss_indir_tbl() call for the reason explained
in the last paragraph. Without setting the default RSS indirection
table to default, it causes the regression:
BUG: KASAN: slab-out-of-bounds in __bnxt_hwrm_vnic_set_rss+0xb79/0xe40
Read of size 2 at addr ffff8881c5809618 by task ethtool/31525
Call Trace:
__bnxt_hwrm_vnic_set_rss+0xb79/0xe40
bnxt_hwrm_vnic_rss_cfg_p5+0xf7/0x460
__bnxt_setup_vnic_p5+0x12e/0x270
__bnxt_open_nic+0x2262/0x2f30
bnxt_open_nic+0x5d/0xf0
ethnl_set_channels+0x5d4/0xb30
ethnl_default_set_doit+0x2f1/0x620
Reported-by: Breno Leitao <leitao@debian.org>
Closes: https://lore.kernel.org/netdev/ZrC6jpghA3PWVWSB@gmail.com/
Fixes: 98ba1d931f61 ("bnxt_en: Fix RSS logic in __bnxt_reserve_rings()")
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Tested-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20240806053742.140304-1-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
bcm_sf2_mdio_register() calls of_phy_find_device() and then
phy_device_remove() in a loop to remove existing PHY devices.
of_phy_find_device() eventually calls bus_find_device(), which calls
get_device() on the returned struct device * to increment the refcount.
The current implementation does not decrement the refcount, which causes
memory leak.
This commit adds the missing phy_device_free() call to decrement the
refcount via put_device() to balance the refcount.
Fixes: 771089c2a485 ("net: dsa: bcm_sf2: Ensure that MDIO diversion is used")
Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp>
Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20240806011327.3817861-1-joe@pf.is.s.u-tokyo.ac.jp
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The number of fallback reasons defined in the smc_clc.h file has reached
36. For historical reasons, some are no longer quoted, and there's 33
actually in use. So, add the max value of fallback reason count to 36.
Fixes: 6ac1e6563f59 ("net/smc: support smc v2.x features validate")
Fixes: 7f0620b9940b ("net/smc: support max connections per lgr negotiation")
Fixes: 69b888e3bb4b ("net/smc: support max links per lgr negotiation in clc handshake")
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com>
Reviewed-by: D. Wythe <alibuda@linux.alibaba.com>
Link: https://patch.msgid.link/20240805043856.565677-1-shaozhengchao@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
We are hit with a not easily reproducible divide-by-0 panic in padata.c at
bootup time.
[ 10.017908] Oops: divide error: 0000 1 PREEMPT SMP NOPTI
[ 10.017908] CPU: 26 PID: 2627 Comm: kworker/u1666:1 Not tainted 6.10.0-15.el10.x86_64 #1
[ 10.017908] Hardware name: Lenovo ThinkSystem SR950 [7X12CTO1WW]/[7X12CTO1WW], BIOS [PSE140J-2.30] 07/20/2021
[ 10.017908] Workqueue: events_unbound padata_mt_helper
[ 10.017908] RIP: 0010:padata_mt_helper+0x39/0xb0
:
[ 10.017963] Call Trace:
[ 10.017968] <TASK>
[ 10.018004] ? padata_mt_helper+0x39/0xb0
[ 10.018084] process_one_work+0x174/0x330
[ 10.018093] worker_thread+0x266/0x3a0
[ 10.018111] kthread+0xcf/0x100
[ 10.018124] ret_from_fork+0x31/0x50
[ 10.018138] ret_from_fork_asm+0x1a/0x30
[ 10.018147] </TASK>
Looking at the padata_mt_helper() function, the only way a divide-by-0
panic can happen is when ps->chunk_size is 0. The way that chunk_size is
initialized in padata_do_multithreaded(), chunk_size can be 0 when the
min_chunk in the passed-in padata_mt_job structure is 0.
Fix this divide-by-0 panic by making sure that chunk_size will be at least
1 no matter what the input parameters are.
Link: https://lkml.kernel.org/r/20240806174647.1050398-1-longman@redhat.com
Fixes: 004ed42638f4 ("padata: add basic support for multithreaded jobs")
Signed-off-by: Waiman Long <longman@redhat.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Waiman Long <longman@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Link my old gmail address to my active email.
Link: https://lkml.kernel.org/r/20240804054704.859503-1-david@ixit.cz
Signed-off-by: David Heidelberg <david@ixit.cz>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Commit 73f576c04b94 ("mm: memcontrol: fix cgroup creation failure after
many small jobs") decoupled the memcg IDs from the CSS ID space to fix the
cgroup creation failures. It introduced IDR to maintain the memcg ID
space. The IDR depends on external synchronization mechanisms for
modifications. For the mem_cgroup_idr, the idr_alloc() and idr_replace()
happen within css callback and thus are protected through cgroup_mutex
from concurrent modifications. However idr_remove() for mem_cgroup_idr
was not protected against concurrency and can be run concurrently for
different memcgs when they hit their refcnt to zero. Fix that.
We have been seeing list_lru based kernel crashes at a low frequency in
our fleet for a long time. These crashes were in different part of
list_lru code including list_lru_add(), list_lru_del() and reparenting
code. Upon further inspection, it looked like for a given object (dentry
and inode), the super_block's list_lru didn't have list_lru_one for the
memcg of that object. The initial suspicions were either the object is
not allocated through kmem_cache_alloc_lru() or somehow
memcg_list_lru_alloc() failed to allocate list_lru_one() for a memcg but
returned success. No evidence were found for these cases.
Looking more deeply, we started seeing situations where valid memcg's id
is not present in mem_cgroup_idr and in some cases multiple valid memcgs
have same id and mem_cgroup_idr is pointing to one of them. So, the most
reasonable explanation is that these situations can happen due to race
between multiple idr_remove() calls or race between
idr_alloc()/idr_replace() and idr_remove(). These races are causing
multiple memcgs to acquire the same ID and then offlining of one of them
would cleanup list_lrus on the system for all of them. Later access from
other memcgs to the list_lru cause crashes due to missing list_lru_one.
Link: https://lkml.kernel.org/r/20240802235822.1830976-1-shakeel.butt@linux.dev
Fixes: 73f576c04b94 ("mm: memcontrol: fix cgroup creation failure after many small jobs")
Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
Acked-by: Muchun Song <muchun.song@linux.dev>
Reviewed-by: Roman Gushchin <roman.gushchin@linux.dev>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
In the shmem_suitable_orders() function, xa_find() is used to check for
conflicts in the pagecache to select suitable huge orders. However, when
checking each huge order in every loop, the aligned index is calculated
from the previous iteration, which may cause suitable huge orders to be
missed.
We should use the original index each time in the loop to calculate a new
aligned index for checking conflicts to avoid this issue.
Link: https://lkml.kernel.org/r/07433b0f16a152bffb8cee34934a5c040e8e2ad6.1722404078.git.baolin.wang@linux.alibaba.com
Fixes: e7a2ab7b3bb5 ("mm: shmem: add mTHP support for anonymous shmem")
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Barry Song <21cnbao@gmail.com>
Cc: Gavin Shan <gshan@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Lance Yang <ioworker0@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Similar to commit d659b715e94ac ("mm/huge_memory: avoid PMD-size page
cache if needed"), ARM64 can support 512MB PMD-sized THP when the base
page size is 64KB, which is larger than the maximum supported page cache
size MAX_PAGECACHE_ORDER.
This is not expected. To fix this issue, use THP_ORDERS_ALL_FILE_DEFAULT
for shmem to filter allowable huge orders.
[baolin.wang@linux.alibaba.com: remove comment, per Barry]
Link: https://lkml.kernel.org/r/c55d7ef7-78aa-4ed6-b897-c3e03a3f3ab7@linux.alibaba.com
[wangkefeng.wang@huawei.com: remove local `orders']
Link: https://lkml.kernel.org/r/87769ae8-b6c6-4454-925d-1864364af9c8@huawei.com
Link: https://lkml.kernel.org/r/117121665254442c3c7f585248296495e5e2b45c.1722404078.git.baolin.wang@linux.alibaba.com
Fixes: e7a2ab7b3bb5 ("mm: shmem: add mTHP support for anonymous shmem")
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Barry Song <baohua@kernel.org>
Cc: Barry Song <21cnbao@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Gavin Shan <gshan@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Lance Yang <ioworker0@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The mem_cgroup_from_slab_obj() is supposed to be called under rcu lock or
cgroup_mutex or others which could prevent returned memcg from being
freed. Fix it by adding missing rcu read lock.
Found by code inspection.
[songmuchun@bytedance.com: only grab rcu lock when necessary, per Vlastimil]
Link: https://lkml.kernel.org/r/20240801024603.1865-1-songmuchun@bytedance.com
Link: https://lkml.kernel.org/r/20240718083607.42068-1-songmuchun@bytedance.com
Fixes: 0a97c01cd20b ("list_lru: allow explicit memcg and NUMA node selection")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
When collecting coverage from softirqs, KCOV uses in_serving_softirq() to
check whether the code is running in the softirq context. Unfortunately,
in_serving_softirq() is > 0 even when the code is running in the hardirq
or NMI context for hardirqs and NMIs that happened during a softirq.
As a result, if a softirq handler contains a remote coverage collection
section and a hardirq with another remote coverage collection section
happens during handling the softirq, KCOV incorrectly detects a nested
softirq coverate collection section and prints a WARNING, as reported by
syzbot.
This issue was exposed by commit a7f3813e589f ("usb: gadget: dummy_hcd:
Switch to hrtimer transfer scheduler"), which switched dummy_hcd to using
hrtimer and made the timer's callback be executed in the hardirq context.
Change the related checks in KCOV to account for this behavior of
in_serving_softirq() and make KCOV ignore remote coverage collection
sections in the hardirq and NMI contexts.
This prevents the WARNING printed by syzbot but does not fix the inability
of KCOV to collect coverage from the __usb_hcd_giveback_urb when dummy_hcd
is in use (caused by a7f3813e589f); a separate patch is required for that.
Link: https://lkml.kernel.org/r/20240729022158.92059-1-andrey.konovalov@linux.dev
Fixes: 5ff3b30ab57d ("kcov: collect coverage from interrupts")
Signed-off-by: Andrey Konovalov <andreyknvl@gmail.com>
Reported-by: syzbot+2388cdaeb6b10f0c13ac@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=2388cdaeb6b10f0c13ac
Acked-by: Marco Elver <elver@google.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Aleksandr Nogikh <nogikh@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Marcello Sylvester Bauer <sylv@sylv.io>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|