summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* nvme: fix parameter check in nvme_fault_inject_init()Minjie Du2023-07-121-1/+1
| | | | | | | | Make IS_ERR() judge the debugfs_create_dir() function return. Signed-off-by: Minjie Du <duminjie@vivo.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
* nvme: warn only once for legacy uuid attributeKeith Busch2023-07-121-1/+1
| | | | | | | | | | | Report the legacy fallback behavior for uuid attributes just once instead of logging repeated warnings for the same condition every time the attribute is read. The old behavior is too spamy on the kernel logs. Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reported-by: Breno Leitao <leitao@debian.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
* nvme: fix the NVME_ID_NS_NVM_STS_MASK definitionAnkit Kumar2023-07-101-1/+1
| | | | | | | | | As per NVMe command set specification 1.0c Storage tag size is 7 bits. Fixes: 4020aad85c67 ("nvme: add support for enhanced metadata") Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
* nvmet: use PAGE_SECTORS_SHIFTDamien Le Moal2023-07-102-3/+3
| | | | | | | | | | | Replace occurences of the pattern "PAGE_SHIFT - 9" in the passthru and loop targets with PAGE_SECTORS_SHIFT. Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
* nvme: add BOGUS_NID quirk for Samsung SM953Pankaj Raghav2023-07-101-0/+2
| | | | | | | | | | Add the quirk as SM953 is reporting bogus namespace ID. Link: https://bugzilla.kernel.org/show_bug.cgi?id=217593 Reported-by: Clemens Springsguth <cspringsguth@gmail.com> Tested-by: Clemens Springsguth <cspringsguth@gmail.com> Signed-off-by: Pankaj Raghav <p.raghav@samsung.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
* blk-crypto: use dynamic lock class for blk_crypto_profile::lockEric Biggers2023-07-062-2/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a device-mapper device is passing through the inline encryption support of an underlying device, calls to blk_crypto_evict_key() take the blk_crypto_profile::lock of the device-mapper device, then take the blk_crypto_profile::lock of the underlying device (nested). This isn't a real deadlock, but it causes a lockdep report because there is only one lock class for all instances of this lock. Lockdep subclasses don't really work here because the hierarchy of block devices is dynamic and could have more than 2 levels. Instead, register a dynamic lock class for each blk_crypto_profile, and associate that with the lock. This avoids false-positive lockdep reports like the following: ============================================ WARNING: possible recursive locking detected 6.4.0-rc5 #2 Not tainted -------------------------------------------- fscryptctl/1421 is trying to acquire lock: ffffff80829ca418 (&profile->lock){++++}-{3:3}, at: __blk_crypto_evict_key+0x44/0x1c0 but task is already holding lock: ffffff8086b68ca8 (&profile->lock){++++}-{3:3}, at: __blk_crypto_evict_key+0xc8/0x1c0 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&profile->lock); lock(&profile->lock); *** DEADLOCK *** May be due to missing lock nesting notation Fixes: 1b2628397058 ("block: Keyslot Manager for Inline Encryption") Reported-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20230610061139.212085-1-ebiggers@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
* block/partition: fix signedness issue for Amiga partitionsMichael Schmitz2023-07-061-1/+1
| | | | | | | | | | | | | | | | | Making 'blk' sector_t (i.e. 64 bit if LBD support is active) fails the 'blk>0' test in the partition block loop if a value of (signed int) -1 is used to mark the end of the partition block list. Explicitly cast 'blk' to signed int to allow use of -1 to terminate the partition block linked list. Fixes: b6f3f28f604b ("block: add overflow checks for Amiga partition support") Reported-by: Christian Zigotzky <chzigotzky@xenosoft.de> Link: https://lore.kernel.org/r/024ce4fa-cc6d-50a2-9aae-3701d0ebf668@xenosoft.de Signed-off-by: Michael Schmitz <schmitzmic@gmail.com> Reviewed-by: Martin Steigerwald <martin@lichtvoll.de> Tested-by: Christian Zigotzky <chzigotzky@xenosoft.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* gup: make the stack expansion warning a bit more targetedLinus Torvalds2023-07-051-10/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I added a warning about about GUP no longer expanding the stack in commit a425ac5365f6 ("gup: add warning if some caller would seem to want stack expansion"), but didn't really expect anybody to hit it. And it's true that nobody seems to have hit a _real_ case yet, but we certainly have a number of reports of false positives. Which not only causes extra noise in itself, but might also end up hiding any real cases if they do exist. So let's tighten up the warning condition, and replace the simplistic vma = find_vma(mm, start); if (vma && (start < vma->vm_start)) { WARN_ON_ONCE(vma->vm_flags & VM_GROWSDOWN); with a vma = gup_vma_lookup(mm, start); helper function which works otherwise like just "vma_lookup()", but with some heuristics for when to warn about gup no longer causing stack expansion. In particular, don't just warn for "below the stack", but warn if it's _just_ below the stack (with "just below" arbitrarily defined as 64kB, because why not?). And rate-limit it to at most once per hour, which means that any false positives shouldn't completely hide subsequent reports, but we won't be flooding the logs about it either. The previous code triggered when some GUP user (chromium crashpad) accessing past the end of the previous vma, for example. That has never expanded the stack, it just causes GUP to return early, and as such we shouldn't be warning about it. This is still going trigger the randomized testers, but to mitigate the noise from that, use "dump_stack()" instead of "WARN_ON_ONCE()" to get the kernel call chain. We'll get the relevant information, but syzbot shouldn't get too upset about it. Also, don't even bother with the GROWSUP case, which would be using different heuristics entirely, but only happens on parisc. Reported-by: kernel test robot <oliver.sang@intel.com> Reported-by: John Hubbard <jhubbard@nvidia.com> Reported-by: syzbot+6cf44e127903fdf9d929@syzkaller.appspotmail.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Revert ".gitignore: ignore *.cover and *.mbx"Linus Torvalds2023-07-051-2/+0
| | | | | | | | | | | | | This reverts commit 534066a983df0935847061c844eb178f8a53a9e7. It's actively detrimental in that it hides files that shouldn't be hidden. If I have some b4 mbx file in my git directory, it either was already applied with "git am" and is now stale, or maybe it's waiting for that to happen. In neither case is "ignore it" the right option. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge tag 'core_guards_for_6.5_rc1' of ↵Linus Torvalds2023-07-0420-17/+282
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue Pull scope-based resource management infrastructure from Peter Zijlstra: "These are the first few patches in the Scope-based Resource Management series that introduce the infrastructure but not any conversions as of yet. Adding the infrastructure now allows multiple people to start using them. Of note is that Sparse will need some work since it doesn't yet understand this attribute and might have decl-after-stmt issues" * tag 'core_guards_for_6.5_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue: kbuild: Drop -Wdeclaration-after-statement locking: Introduce __cleanup() based infrastructure apparmor: Free up __cleanup() name dmaengine: ioat: Free up __cleanup() name
| * kbuild: Drop -Wdeclaration-after-statementPeter Zijlstra2023-06-262-7/+1
| | | | | | | | | | | | | | | | | | | | | | | | With the advent on scope-based resource management it comes really tedious to abide by the contraints of -Wdeclaration-after-statement. It will still be recommeneded to place declarations at the start of a scope where possible, but it will no longer be enforced. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/CAHk-%3Dwi-RyoUhbChiVaJZoZXheAwnJ7OO%3DGxe85BkPAd93TwDA%40mail.gmail.com
| * locking: Introduce __cleanup() based infrastructurePeter Zijlstra2023-06-2616-1/+272
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Use __attribute__((__cleanup__(func))) to build: - simple auto-release pointers using __free() - 'classes' with constructor and destructor semantics for scope-based resource management. - lock guards based on the above classes. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20230612093537.614161713%40infradead.org
| * apparmor: Free up __cleanup() namePeter Zijlstra2023-06-261-3/+3
| | | | | | | | | | | | | | | | | | In order to use __cleanup for __attribute__((__cleanup__(func))) the name must not be used for anything else. Avoid the conflict. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: John Johansen <john.johansen@canonical.com> Link: https://lkml.kernel.org/r/20230612093537.536441207%40infradead.org
| * dmaengine: ioat: Free up __cleanup() namePeter Zijlstra2023-06-261-6/+6
| | | | | | | | | | | | | | | | | | In order to use __cleanup for __attribute__((__cleanup__(func))) the name must not be used for anything else. Avoid the conflict. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Dave Jiang <dave.jiang@intel.com> Link: https://lkml.kernel.org/r/20230612093537.467120754%40infradead.org
* | afs: Fix accidental truncation when storing dataDavid Howells2023-07-041-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When an AFS FS.StoreData RPC call is made, amongst other things it is given the resultant file size to be. On the server, this is processed by truncating the file to new size and then writing the data. Now, kafs has a lock (vnode->io_lock) that serves to serialise operations against a specific vnode (ie. inode), but the parameters for the op are set before the lock is taken. This allows two writebacks (say sync and kswapd) to race - and if writes are ongoing the writeback for a later write could occur before the writeback for an earlier one if the latter gets interrupted. Note that afs_writepages() cannot take i_mutex and only takes a shared lock on vnode->validate_lock. Also note that the server does the truncation and the write inside a lock, so there's no problem at that end. Fix this by moving the calculation for the proposed new i_size inside the vnode->io_lock. Also reset the iterator (which we might have read from) and update the mtime setting there. Fixes: bd80d8a80e12 ("afs: Use ITER_XARRAY for writing") Reported-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeffrey Altman <jaltman@auristor.com> Reviewed-by: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/3526895.1687960024@warthog.procyon.org.uk/ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Merge tag 'ovl-update-6.5-2' of ↵Linus Torvalds2023-07-044-564/+581
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs Pull more overlayfs updates from Amir Goldstein: "This is a small 'move code around' followup by Christian to his work on porting overlayfs to the new mount api for 6.5. It makes things a bit cleaner and simpler for the next development cycle when I hand overlayfs back over to Miklos" * tag 'ovl-update-6.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs: ovl: move all parameter handling into params.{c,h}
| * | ovl: move all parameter handling into params.{c,h}Christian Brauner2023-07-034-564/+581
| | | | | | | | | | | | | | | | | | | | | | | | | | | While initially I thought that we couldn't move all new mount api handling into params.{c,h} it turns out it is possible. So this just moves a good chunk of code out of super.c and into params.{c,h}. Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
* | | Merge tag 'gfs2-v6.4-rc5-fixes' of ↵Linus Torvalds2023-07-0419-237/+277
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull gfs2 updates from Andreas Gruenbacher: - Move the freeze/thaw logic from glock callback context to process / worker thread context to prevent deadlocks - Fix a quota reference couting bug in do_qc() - Carry on deallocating inodes even when gfs2_rindex_update() fails - Retry filesystem-internal reads when they are interruped by a signal - Eliminate kmap_atomic() in favor of kmap_local_page() / memcpy_{from,to}_page() - Get rid of noop_direct_IO - And a few more minor fixes and cleanups * tag 'gfs2-v6.4-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: (23 commits) gfs2: Add quota_change type gfs2: Use memcpy_{from,to}_page where appropriate gfs2: Convert remaining kmap_atomic calls to kmap_local_page gfs2: Replace deprecated kmap_atomic with kmap_local_page gfs: Get rid of unnucessary locking in inode_go_dump gfs2: gfs2_freeze_lock_shared cleanup gfs2: Replace sd_freeze_state with SDF_FROZEN flag gfs2: Rework freeze / thaw logic gfs2: Rename SDF_{FS_FROZEN => FREEZE_INITIATOR} gfs2: Reconfiguring frozen filesystem already rejected gfs2: Rename gfs2_freeze_lock{ => _shared } gfs2: Rename the {freeze,thaw}_super callbacks gfs2: Rename remaining "transaction" glock references gfs2: retry interrupted internal reads gfs2: Fix possible data races in gfs2_show_options() gfs2: Fix duplicate should_fault_in_pages() call gfs2: set FMODE_CAN_ODIRECT instead of a dummy direct_IO method gfs2: Don't remember delete unless it's successful gfs2: Update rl_unlinked before releasing rgrp lock gfs2: Fix gfs2_qa_get imbalance in gfs2_quota_hold ...
| * | | gfs2: Add quota_change typeBob Peterson2023-07-032-11/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Function do_qc has two main uses: (1) to re-sync the local quota changes (qd) to the master quotas, and (2) normal quota changes. In the case of normal quota changes, the change can be positive or negative, as the quota usage goes up and down. Before this patch function do_qc was distinguishing one from another by whether the resulting value is or isn't zero: In the case of a re-sync (called do_sync) the quota value is moved from the temporary value to a master value, so the amount is added to one and subtracted from the other. The problem is that since the values can be positive or negative we can occasionally run into situations where we are not doing a re-sync but the quota change just happens to cancel out the previous value. In the case of a re-sync extra references and locks are taken, and so do_qc needs to release them. In the case of a normal quota change, no extra references and locks are taken, so it must not try to release them. The problem is: if the quota change is not a re-sync but the value just happens to cancel out the original quota change, the resulting zero value fools do_qc into thinking this is a re-sync and therefore it must release the extra references. This results in problems, mainly having to do with slot reference numbers going smaller than zero. This patch introduces new constants, QC_SYNC and QC_CHANGE so do_qc can really tell the difference. For QC_SYNC calls it must release the extra references acquired by gfs2_quota_unlock's call to qd_check_sync. For QC_CHANGE calls it does not have extra references to put. Note that this allows quota changes back to a value of zero, and so I removed an assert warning related to that. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
| * | | gfs2: Use memcpy_{from,to}_page where appropriateAndreas Gruenbacher2023-07-033-15/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace kmap_local_page() + memcpy() + kunmap_local() sequences with memcpy_{from,to}_page() where we are not doing anything else with the mapped page. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
| * | | gfs2: Convert remaining kmap_atomic calls to kmap_local_pageAndreas Gruenbacher2023-07-032-8/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace the remaining instances of kmap_atomic() ... kunmap_atomic() with kmap_local_page() ... kunmap_local(). In gfs2_write_buf_to_page(), we can call flush_dcache_page() after unmapping the page. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
| * | | gfs2: Replace deprecated kmap_atomic with kmap_local_pageDeepak R Varma2023-07-031-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | kmap_atomic() is deprecated in favor of kmap_local_{folio,page}(). Therefore, replace kmap_atomic() with kmap_local_page() in gfs2_internal_read() and stuffed_readpage(). kmap_atomic() disables page-faults and preemption (the latter only for !PREEMPT_RT kernels), However, the code within the mapping/un-mapping in gfs2_internal_read() and stuffed_readpage() does not depend on the above-mentioned side effects. Therefore, a mere replacement of the old API with the new one is all that is required (i.e., there is no need to explicitly add any calls to pagefault_disable() and/or preempt_disable()). Signed-off-by: Deepak R Varma <drv@mailo.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
| * | | gfs: Get rid of unnucessary locking in inode_go_dumpAndreas Gruenbacher2023-07-032-12/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 27a2660f1ef9 ("gfs2: Dump nrpages for inodes and their glocks") added some locking around reading inode->i_data.nrpages. That locking doesn't do anything really, so get rid of it. With that, the glock argument to ->go_dump() can be made const again as well. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
| * | | gfs2: gfs2_freeze_lock_shared cleanupAndreas Gruenbacher2023-07-034-12/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | All the remaining users of gfs2_freeze_lock_shared() set freeze_gh to &sdp->sd_freeze_gh and flags to 0, so remove those two parameters. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
| * | | gfs2: Replace sd_freeze_state with SDF_FROZEN flagAndreas Gruenbacher2023-07-037-31/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace sd_freeze_state with a new SDF_FROZEN flag. There no longer is a need for indicating that a freeze is in progress (SDF_STARTING_FREEZE); we are now protecting the critical sections with the sd_freeze_mutex. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
| * | | gfs2: Rework freeze / thaw logicAndreas Gruenbacher2023-07-037-110/+178
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | So far, at mount time, gfs2 would take the freeze glock in shared mode and then immediately drop it again, turning it into a cached glock that can be reclaimed at any time. To freeze the filesystem cluster-wide, the node initiating the freeze would take the freeze glock in exclusive mode, which would cause the freeze glock's freeze_go_sync() callback to run on each node. There, gfs2 would freeze the filesystem and schedule gfs2_freeze_func() to run. gfs2_freeze_func() would re-acquire the freeze glock in shared mode, thaw the filesystem, and drop the freeze glock again. The initiating node would keep the freeze glock held in exclusive mode. To thaw the filesystem, the initiating node would drop the freeze glock again, which would allow gfs2_freeze_func() to resume on all nodes, leaving the filesystem in the thawed state. It turns out that in freeze_go_sync(), we cannot reliably and safely freeze the filesystem. This is primarily because the final unmount of a filesystem takes a write lock on the s_umount rw semaphore before calling into gfs2_put_super(), and freeze_go_sync() needs to call freeze_super() which also takes a write lock on the same semaphore, causing a deadlock. We could work around this by trying to take an active reference on the super block first, which would prevent unmount from running at the same time. But that can fail, and freeze_go_sync() isn't actually allowed to fail. To get around this, this patch changes the freeze glock locking scheme as follows: At mount time, each node takes the freeze glock in shared mode. To freeze a filesystem, the initiating node first freezes the filesystem locally and then drops and re-acquires the freeze glock in exclusive mode. All other nodes notice that there is contention on the freeze glock in their go_callback callbacks, and they schedule gfs2_freeze_func() to run. There, they freeze the filesystem locally and drop and re-acquire the freeze glock before re-thawing the filesystem. This is happening outside of the glock state engine, so there, we are allowed to fail. From a cluster point of view, taking and immediately dropping a glock is indistinguishable from taking the glock and only dropping it upon contention, so this new scheme is compatible with the old one. Thanks to Li Dong <lidong@vivo.com> for reporting a locking bug in gfs2_freeze_func() in a previous version of this commit. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
| * | | gfs2: Rename SDF_{FS_FROZEN => FREEZE_INITIATOR}Andreas Gruenbacher2023-06-154-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rename the SDF_FS_FROZEN flag to SDF_FREEZE_INITIATOR to indicate more clearly that the node that has this flag set is the initiator of the freeze. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com
| * | | gfs2: Reconfiguring frozen filesystem already rejectedAndreas Gruenbacher2023-06-151-7/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Reconfiguring a frozen filesystem is already rejected in reconfigure_super(), so there is no need to check for that condition again at the filesystem level. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
| * | | gfs2: Rename gfs2_freeze_lock{ => _shared }Andreas Gruenbacher2023-06-155-11/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rename gfs2_freeze_lock to gfs2_freeze_lock_shared to make it a bit more obvious that this function establishes the "thawed" state of the freeze glock. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
| * | | gfs2: Rename the {freeze,thaw}_super callbacksAndreas Gruenbacher2023-06-152-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rename gfs2_freeze to gfs2_freeze_super and gfs2_unfreeze to gfs2_thaw_super to match the names of the corresponding super operations. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
| * | | gfs2: Rename remaining "transaction" glock referencesAndreas Gruenbacher2023-06-155-9/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The transaction glock was repurposed to serve as the new freeze glock years ago. Don't refer to it as the transaction glock anymore. Also, to be more precise, call it the "freeze glock" instead of the "freeze lock". Ditto for the journal glock. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
| * | | gfs2: retry interrupted internal readsAndreas Gruenbacher2023-06-131-4/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The iomap-based read operations done by gfs2 for its system files, such as rindex, may sometimes be interrupted and return -EINTR. This confuses some users of gfs2_internal_read(). Fix that by retrying interrupted reads. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
| * | | gfs2: Fix possible data races in gfs2_show_options()Tuo Li2023-06-131-11/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some fields such as gt_logd_secs of the struct gfs2_tune are accessed without holding the lock gt_spin in gfs2_show_options(): val = sdp->sd_tune.gt_logd_secs; if (val != 30) seq_printf(s, ",commit=%d", val); And thus can cause data races when gfs2_show_options() and other functions such as gfs2_reconfigure() are concurrently executed: spin_lock(&gt->gt_spin); gt->gt_logd_secs = newargs->ar_commit; To fix these possible data races, the lock sdp->sd_tune.gt_spin is acquired before accessing the fields of gfs2_tune and released after these accesses. Further changes by Andreas: - Don't hold the spin lock over the seq_printf operations. Reported-by: BassCheck <bass@buaa.edu.cn> Signed-off-by: Tuo Li <islituo@gmail.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
| * | | gfs2: Fix duplicate should_fault_in_pages() callBob Peterson2023-06-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In gfs2_file_buffered_write(), we currently jump from the second call of function should_fault_in_pages() to above the first call, so should_fault_in_pages() is getting called twice in a row, causing it to accidentally fall back to single-page writes rather than trying the more efficient multi-page writes first. Fix that by moving the retry label to the correct place, behind the first call to should_fault_in_pages(). Fixes: e1fa9ea85ce8 ("gfs2: Stop using glock holder auto-demotion for now") Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
| * | | gfs2: set FMODE_CAN_ODIRECT instead of a dummy direct_IO methodChristoph Hellwig2023-06-122-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since commit a2ad63daa88b ("VFS: add FMODE_CAN_ODIRECT file flag"), file systems can just set the FMODE_CAN_ODIRECT flag at open time instead of wiring up a dummy direct_IO method to indicate support for direct I/O. Remove .direct_IO from gfs2_aops and set FMODE_CAN_ODIRECT in gfs2_open_common for regular files that do not use data journalling. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
| * | | gfs2: Don't remember delete unless it's successfulBob Peterson2023-06-061-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch changes function evict_unlinked_inode so it does not call gfs2_inode_remember_delete until it gets a good return code from gfs2_dinode_dealloc. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
| * | | gfs2: Update rl_unlinked before releasing rgrp lockBob Peterson2023-06-061-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Function gfs2_free_di was changing the rgrp lvb count of unlinked dinodes after the lock was released. This patch moves it inside the lock. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
| * | | gfs2: Fix gfs2_qa_get imbalance in gfs2_quota_holdBob Peterson2023-06-061-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes a case in which function gfs2_quota_hold encounters an assert error and exits. The lack of gfs2_qa_put causes further problems when the inode is evicted and the get/put count is non-zero. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
| * | | gfs2: ignore rindex_update failure in dinode_deallocBob Peterson2023-06-061-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before this patch, function gfs2_dinode_dealloc would abort if it got a bad return code from gfs2_rindex_update(). The problem is that it left the dinode in the unlinked (not free) state, which meant subsequent fsck would clean it up and flag an error. That meant some of our QE tests would fail. The sole purpose of gfs2_rindex_update(), in this code path, is to read in any newer rgrps added by gfs2_grow. But since this is a delete operation it won't actually use any of those new rgrps. It can really only twiddle the bits from "Unlinked" to "Free" in an existing rgrp. Therefore the error should not prevent the transition from unlinked to free. This patch makes gfs2_dinode_dealloc ignore the bad return code and proceed with freeing the dinode so the QE tests will not be tripped up. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
| * | | gfs2: fix minor comment typosBob Peterson2023-06-061-2/+2
| | | | | | | | | | | | | | | | | | | | Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
| * | | gfs2: simplify gdlm_put_lock with out_free labelBob Peterson2023-06-061-13/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduces a new out_free label and consolidates the three places function gdlm_put_lock freed the glock. No change in functionality. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
* | | | Merge tag 'pm-6.5-rc1-2' of ↵Linus Torvalds2023-07-0417-164/+180
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull more power management updates from Rafael Wysocki: "These add support for new hardware (ap807 and AM62A7), fix several issues in cpufreq drivers and in the operating performance points (OPP) framework, fix up intel_idle after recent changes and add documentation. Specifics: - Add missing __init annotation to one function in the intel_idle drvier (Rafael Wysocki) - Make intel_pstate use a correct scaling factor when mapping HWP performance levels to frequency values on hybrid-capable systems with disabled E-cores (Srinivas Pandruvada) - Fix Kconfig dependencies of the cpufreq-dt-platform driver (Viresh Kumar) - Add support to build cpufreq-dt-platdev as a module (Zhipeng Wang) - Don't allocate Sparc's cpufreq_driver dynamically (Viresh Kumar) - Add support for TI's AM62A7 platform (Vibhore Vardhan) - Add support for Armada's ap807 platform (Russell King (Oracle)) - Add support for StarFive JH7110 SoC (Mason Huo) - Fix voltage selection for Mediatek Socs (Daniel Golle) - Fix error handling in Tegra's cpufreq driver (Christophe JAILLET) - Document Qualcomm's IPQ8074 in DT bindings (Robert Marko) - Don't warn for disabling a non-existing frequency for imx6q cpufreq driver (Christoph Niedermaier) - Use dev_err_probe() in Qualcomm's cpufreq driver (Andrew Halaney) - Simplify performance state related logic in the OPP core (Viresh Kumar) - Fix use-after-free and improve locking around lazy_opp_tables (Viresh Kumar, Stephan Gerhold) - Minor cleanups - using dev_err_probe() and rate-limiting debug messages (Andrew Halaney, Adrián Larumbe)" * tag 'pm-6.5-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (23 commits) cpufreq: intel_pstate: Fix scaling for hybrid-capable systems with disabled E-cores cpufreq: Make CONFIG_CPUFREQ_DT_PLATDEV depend on OF intel_idle: Add __init annotation to matchup_vm_state_with_baremetal() OPP: Properly propagate error along when failing to get icc_path OPP: Use dev_err_probe() when failing to get icc_path cpufreq: qcom-cpufreq-hw: Use dev_err_probe() when failing to get icc paths cpufreq: mediatek: correct voltages for MT7622 and MT7623 cpufreq: armada-8k: add ap807 support OPP: Simplify the over-designed pstate <-> level dance OPP: pstate is only valid for genpd OPP tables OPP: don't drop performance constraint on OPP table removal OPP: Protect `lazy_opp_tables` list with `opp_table_lock` OPP: Staticize `lazy_opp_tables` in of.c cpufreq: dt-platdev: Support building as module opp: Fix use-after-free in lazy_opp_tables after probe deferral dt-bindings: cpufreq: qcom-cpufreq-nvmem: document IPQ8074 cpufreq: dt-platdev: Blacklist ti,am62a7 SoC cpufreq: ti-cpufreq: Add support for AM62A7 OPP: rate-limit debug messages when no change in OPP is required cpufreq: imx6q: don't warn for disabling a non-existing frequency ...
| | \ \ \
| | \ \ \
| *-. \ \ \ Merge branches 'pm-cpufreq' and 'pm-cpuidle'Rafael J. Wysocki2023-07-0413-121/+132
| |\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge CPU power management updates for 6.5-rc1: - Add missing __init annotation to one function in the intel_idle drvier (Rafael Wysocki). - Make intel_pstate use a correct scaling factor when mapping HWP performance levels to frequency values on hybrid-capable systems with disabled E-cores (Srinivas Pandruvada). - Fix Kconfig dependencies of the cpufreq-dt-platform driver (Viresh Kumar). - Add support to build cpufreq-dt-platdev as a module (Zhipeng Wang). - Don't allocate Sparc's cpufreq_driver dynamically (Viresh Kumar). - Add support for TI's AM62A7 platform (Vibhore Vardhan). - Add support for Armada's ap807 platform (Russell King (Oracle)). - Add support for StarFive JH7110 SoC (Mason Huo). - Fix voltage selection for Mediatek Socs (Daniel Golle). - Fix error handling in Tegra's cpufreq driver (Christophe JAILLET). - Document Qualcomm's IPQ8074 in DT bindings (Robert Marko). - Don't warn for disabling a non-existing frequency for imx6q cpufreq driver (Christoph Niedermaier). - Use dev_err_probe() in Qualcomm's cpufreq driver (Andrew Halaney). * pm-cpufreq: cpufreq: intel_pstate: Fix scaling for hybrid-capable systems with disabled E-cores cpufreq: Make CONFIG_CPUFREQ_DT_PLATDEV depend on OF cpufreq: qcom-cpufreq-hw: Use dev_err_probe() when failing to get icc paths cpufreq: mediatek: correct voltages for MT7622 and MT7623 cpufreq: armada-8k: add ap807 support cpufreq: dt-platdev: Support building as module dt-bindings: cpufreq: qcom-cpufreq-nvmem: document IPQ8074 cpufreq: dt-platdev: Blacklist ti,am62a7 SoC cpufreq: ti-cpufreq: Add support for AM62A7 cpufreq: imx6q: don't warn for disabling a non-existing frequency cpufreq: sparc: Don't allocate cpufreq_driver dynamically cpufreq: tegra194: Fix an error handling path in tegra194_cpufreq_probe() cpufreq: dt-platdev: Add JH7110 SOC to the allowlist * pm-cpuidle: intel_idle: Add __init annotation to matchup_vm_state_with_baremetal()
| | | * | | | intel_idle: Add __init annotation to matchup_vm_state_with_baremetal()Rafael J. Wysocki2023-06-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The caller of (recently added) matchup_vm_state_with_baremetal() is an __init function and it uses some __initdata data structures, so add the __init annotation to it for consistency. This addresses the following build warnings: WARNING: modpost: vmlinux: section mismatch in reference: matchup_vm_state_with_baremetal+0x51 (section: .text) -> intel_idle_max_cstate_reached (section: .init.text) WARNING: modpost: vmlinux: section mismatch in reference: matchup_vm_state_with_baremetal+0x62 (section: .text) -> cpuidle_state_table (section: .init.data) WARNING: modpost: vmlinux: section mismatch in reference: matchup_vm_state_with_baremetal+0x79 (section: .text) -> icpu (section: .init.data) Fixes: 0fac214bb75e ("intel_idle: Add a "Long HLT" C1 state for the VM guest mode") Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Randy Dunlap <rdunlap@infradead.org> # build-tested Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
| | * | | | | cpufreq: intel_pstate: Fix scaling for hybrid-capable systems with disabled ↵Srinivas Pandruvada2023-06-301-10/+48
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | E-cores Some system BIOS configuration may provide option to disable E-cores. As part of this change, CPUID feature for hybrid (Leaf 7 sub leaf 0, EDX[15] = 0) may not be set. But HWP performance limits will still be using a scaling factor like any other hybrid enabled system. The current check for applying scaling factor will fail when hybrid CPUID feature is not set and the only way to make sure that scaling should be applied by checking CPPC nominal frequency and nominal performance. First, or systems predating Alder Lake, the CPPC nominal frequency and nominal performance are 0, which can be used to distinguish those systems from hybrid systems with disabled E-cores. Second, if the CPPC nominal frequency and nominal performance are defined, which indicates the need to use a special scaling factor, and the nominal performance value multiplied by 100 is not equal to the nominal frequency one, use hybrid scaling factor. This can be done for all HWP systems without additional CPU model check. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> [ rjw: Subject and changelog edits, removal of unneeded parens, comment edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
| | * | | | | cpufreq: Make CONFIG_CPUFREQ_DT_PLATDEV depend on OFViresh Kumar2023-06-301-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The cpufreq-dt-platform.c driver requires CONFIG_OF to be selected. Mark it as a dependency. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202306250025.savpMM8L-lkp@intel.com/ Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
| | * | | | | Merge tag 'cpufreq-arm-updates-6.5' of ↵Rafael J. Wysocki2023-06-2711-110/+82
| | |\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm Pull ARM cpufreq updates for 6.5 from Viresh Kumar: "- Add support to build cpufreq-dt-platdev as module (Zhipeng Wang). - Don't allocate Sparc's cpufreq_driver dynamically (Viresh Kumar). - Add support for TI's AM62A7 platform (Vibhore Vardhan). - Add support for Armada's ap807 platform (Russell King (Oracle)). - Add support for StarFive JH7110 SoC (Mason Huo). - Fix voltage selection for Mediatek Socs (Daniel Golle). - Fix error handling in Tegra's cpufreq driver (Christophe JAILLET). - Document Qualcomm's IPQ8074 in DT bindings (Robert Marko). - Don't warn for disabling a non-existing frequency for imx6q cpufreq driver (Christoph Niedermaier). - Use dev_err_probe() in Qualcomm's cpufreq driver (Andrew Halaney)."
| | | * | | | | cpufreq: qcom-cpufreq-hw: Use dev_err_probe() when failing to get icc pathsAndrew Halaney2023-06-261-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This way, if there's an issue (in this case a -EPROBE_DEFER), you can get useful output: [root@dhcp19-243-150 ~]# cat /sys/kernel/debug/devices_deferred 18591000.cpufreq qcom-cpufreq-hw: Failed to find icc paths Signed-off-by: Andrew Halaney <ahalaney@redhat.com> Reviewed-by: Bjorn Andersson <quic_bjorande@quicinc.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
| | | * | | | | cpufreq: mediatek: correct voltages for MT7622 and MT7623Daniel Golle2023-06-191-3/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The MT6380 regulator typically used together with MT7622 does not support the current maximum processor and SRAM voltage in the cpufreq driver (1360000uV). For MT7622 limit processor and SRAM supply voltages to 1350000uV to avoid having the tracking algorithm request unsupported voltages from the regulator. On MT7623 there is no separate SRAM supply and the maximum voltage used is 1300000uV. Create dedicated platform data for MT7623 to cover that case as well. Fixes: 0883426fd07e3 ("cpufreq: mediatek: Raise proc and sram max voltage for MT7622/7623") Suggested-by: Jia-wei Chang <Jia-wei.Chang@mediatek.com> Signed-off-by: Daniel Golle <daniel@makrotopia.org> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
| | | * | | | | cpufreq: armada-8k: add ap807 supportRussell King (Oracle)2023-06-191-7/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support for the Armada AP807 die to armada-8k. This uses a different compatible for the CPU clock which needs to be added to the cpufreq driver. This commit takes a different approach to the WindRiver patch "cpufreq: armada: enable ap807-cpu-clk" in that rather than calling of_find_compatible_node() for each compatible, we use a table of IDs instead. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>