summaryrefslogtreecommitdiffstats
path: root/fs/9p (follow)
Commit message (Collapse)AuthorAgeFilesLines
* fs/9p: Remove unused extern declarationYueHaibing2023-07-201-2/+0
| | | | | | | | | commit bd238fb431f3 ("9p: Reorganization of 9p file system code") left behind this. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com> Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>
* 9p: remove dead stores (variable set again without being read)Dominique Martinet2023-07-202-7/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | The 9p code for some reason used to initialize variables outside of the declaration, e.g. instead of just initializing the variable like this: int retval = 0 We would be doing this: int retval; retval = 0; This is perfectly fine and the compiler will just optimize dead stores anyway, but scan-build seems to think this is a problem and there are many of these warnings making the output of scan-build full of such warnings: fs/9p/vfs_inode.c:916:2: warning: Value stored to 'retval' is never read [deadcode.DeadStores] retval = 0; ^ ~ I have no strong opinion here, but if we want to regularly run scan-build we should fix these just to silence the messages. I've confirmed these all are indeed ok to remove. Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org> Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>
* 9p: fix ignored return value in v9fs_dir_releaseDominique Martinet2023-07-201-2/+3
| | | | | | | | | | | | | | | | | retval from filemap_fdatawrite was immediately overwritten by the following p9_fid_put: preserve any error in fdatawrite if there was any first. This fixes the following scan-build warning: fs/9p/vfs_dir.c:220:4: warning: Value stored to 'retval' is never read [deadcode.DeadStores] retval = filemap_fdatawrite(inode->i_mapping); ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Fixes: 89c58cb395ec ("fs/9p: fix error reporting in v9fs_dir_release") Cc: stable@vger.kernel.org Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org> Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>
* fs/9p: remove unnecessary invalidate_inode_pages2Eric Van Hensbergen2023-07-201-1/+0
| | | | | | | | | | | There was an invalidate_inode_pages2 added to readonly mmap path that is unnecessary since that path is only entered when writeback cache is disabled on mount. Cc: stable@vger.kernel.org Fixes: 1543b4c5071c ("fs/9p: remove writeback fid and fix per-file modes") Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com> Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>
* fs/9p: fix type mismatch in file cache mode helperEric Van Hensbergen2023-07-201-2/+2
| | | | | | | | | | There were two flags (s_flags and s_cache) which had incorrect signed type in the parameters of the file cache mode helper function. Cc: stable@vger.kernel.org Fixes: 1543b4c5071c ("fs/9p: remove writeback fid and fix per-file modes") Reviewed-by: Dominique Martinet <asmadeus@codewreck.org> Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>
* fs/9p: fix typo in comparison logic for cache modeEric Van Hensbergen2023-07-201-1/+1
| | | | | | | | | | | There appears to be a typo in the comparison statement for the logic which sets a file's cache mode based on mount flags. Cc: stable@vger.kernel.org Fixes: 1543b4c5071c ("fs/9p: remove writeback fid and fix per-file modes") Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com> Reviewed-by: Dominique Martinet <asmadeus@codewreck.org> Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>
* fs/9p: remove unnecessary and overrestrictive checkEric Van Hensbergen2023-07-201-3/+1
| | | | | | | | | | | | | This eliminates a check for shared that was overrestrictive and prevented read-only mmaps when writeback caches weren't enabled. Cc: stable@vger.kernel.org Fixes: 1543b4c5071c ("fs/9p: remove writeback fid and fix per-file modes") Reported-by: Robert Schwebel <r.schwebel@pengutronix.de> Closes: https://lore.kernel.org/v9fs/ZK25XZ%2BGpR3KHIB%2F@pengutronix.de Reviewed-by: Dominique Martinet <asmadeus@codewreck.org> Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com> Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>
* fs/9p: Fix a datatype used with V9FS_DIRECT_IOChristophe JAILLET2023-07-101-1/+1
| | | | | | | | | | | | | | | The commit in Fixes has introduced some "enum p9_session_flags" values larger than a char. Such values are stored in "v9fs_session_info->flags" which is a char only. Turn it into an int so that the "enum p9_session_flags" values can fit in it. Fixes: 6deffc8924b5 ("fs/9p: Add new mount modes") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Dominique Martinet <asmadeus@codewreck.org> Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com> Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>
* 9p: Add splice_read wrapperDavid Howells2023-05-241-2/+24
| | | | | | | | | | | | | | | | | | | | | Add a splice_read wrapper for 9p. We should use copy_splice_read() if 9PL_DIRECT is set and filemap_splice_read() otherwise. Note that this doesn't seem to be particularly related to O_DIRECT. Signed-off-by: David Howells <dhowells@redhat.com> cc: Christoph Hellwig <hch@lst.de> cc: Al Viro <viro@zeniv.linux.org.uk> cc: Jens Axboe <axboe@kernel.dk> cc: Dominique Martinet <asmadeus@codewreck.org> cc: Eric Van Hensbergen <ericvh@kernel.org> cc: Latchesar Ionkov <lucho@ionkov.net> cc: Christian Schoenebeck <linux_oss@crudebyte.com> cc: v9fs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org cc: linux-block@vger.kernel.org cc: linux-mm@kvack.org Link: https://lore.kernel.org/r/20230522135018.2742245-15-dhowells@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
* Merge tag 'net-6.4-rc1' of ↵Linus Torvalds2023-05-068-8/+1
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from netfilter. Current release - regressions: - sched: act_pedit: free pedit keys on bail from offset check Current release - new code bugs: - pds_core: - Kconfig fixes (DEBUGFS and AUXILIARY_BUS) - fix mutex double unlock in error path Previous releases - regressions: - sched: cls_api: remove block_cb from driver_list before freeing - nf_tables: fix ct untracked match breakage - eth: mtk_eth_soc: drop generic vlan rx offload - sched: flower: fix error handler on replace Previous releases - always broken: - tcp: fix skb_copy_ubufs() vs BIG TCP - ipv6: fix skb hash for some RST packets - af_packet: don't send zero-byte data in packet_sendmsg_spkt() - rxrpc: timeout handling fixes after moving client call connection to the I/O thread - ixgbe: fix panic during XDP_TX with > 64 CPUs - igc: RMW the SRRCTL register to prevent losing timestamp config - dsa: mt7530: fix corrupt frames using TRGMII on 40 MHz XTAL MT7621 - r8152: - fix flow control issue of RTL8156A - fix the poor throughput for 2.5G devices - move setting r8153b_rx_agg_chg_indicate() to fix coalescing - enable autosuspend - ncsi: clear Tx enable mode when handling a Config required AEN - octeontx2-pf: macsec: fixes for CN10KB ASIC rev Misc: - 9p: remove INET dependency" * tag 'net-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (69 commits) net: bcmgenet: Remove phy_stop() from bcmgenet_netif_stop() pds_core: fix mutex double unlock in error path net/sched: flower: fix error handler on replace Revert "net/sched: flower: Fix wrong handle assignment during filter change" net/sched: flower: fix filter idr initialization net: fec: correct the counting of XDP sent frames bonding: add xdp_features support net: enetc: check the index of the SFI rather than the handle sfc: Add back mailing list virtio_net: suppress cpu stall when free_unused_bufs ice: block LAN in case of VF to VF offload net: dsa: mt7530: fix network connectivity with multiple CPU ports net: dsa: mt7530: fix corrupt frames using trgmii on 40 MHz XTAL MT7621 9p: Remove INET dependency netfilter: nf_tables: fix ct untracked match breakage af_packet: Don't send zero-byte data in packet_sendmsg_spkt(). igc: read before write to SRRCTL register pds_core: add AUXILIARY_BUS and NET_DEVLINK to Kconfig pds_core: remove CONFIG_DEBUG_FS from makefile ionic: catch failure from devlink_alloc ...
| * 9p: Remove INET dependencyJason Andryuk2023-05-048-8/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 9pfs can run over assorted transports, so it doesn't have an INET dependency. Drop it and remove the includes of linux/inet.h. NET_9P_FD/trans_fd.o builds without INET or UNIX and is usable over plain file descriptors. However, tcp and unix functionality is still built and would generate runtime failures if used. Add imply INET and UNIX to NET_9P_FD, so functionality is enabled by default but can still be explicitly disabled. This allows configuring 9pfs over Xen with INET and UNIX disabled. Signed-off-by: Jason Andryuk <jandryuk@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge tag '9p-6.4-for-linus' of ↵Linus Torvalds2023-05-0412-399/+319
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs Pull 9p updates from Eric Van Hensbergen: "This includes a number of patches that didn't quite make the cut last merge window while we addressed some outstanding issues and review comments. It includes some new caching modes for those that only want readahead caches and reworks how we do writeback caching so we are not keeping extra references around which both causes performance problems and uses lots of additional resources on the server. It also includes a new flag to force disabling of xattrs which can also cause major performance issues, particularly if the underlying filesystem on the server doesn't support them. Finally it adds a couple of additional mount options to better support directio and enabling caches when the server doesn't support qid.version. There was one late-breaking bug report that has also been included as its own patch where I forgot to propagate an embarassing bit-logic fix to the various variations of open" * tag '9p-6.4-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: fs/9p: Fix bit operation logic error fs/9p: Rework cache modes and add new options to Documentation fs/9p: remove writeback fid and fix per-file modes fs/9p: Add new mount modes 9p: Add additional debug flags and open modes fs/9p: allow disable of xattr support on mount fs/9p: Remove unnecessary superblock flags fs/9p: Consolidate file operations and add readahead and writeback
| * fs/9p: Fix bit operation logic errorEric Van Hensbergen2023-04-282-2/+2
| | | | | | | | | | | | | | | | | | | | This re-introduces a fix that somehow got dropped during rebase of the current series in for-next. When writeback is enabled, opens are forced to support both read and write operations but with the logic error other flags may be dropped unintentionaly. Reported-by: Christophe Jaillet <christophe.jaillet@wanadoo.fr> Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>
| * fs/9p: Rework cache modes and add new options to DocumentationEric Van Hensbergen2023-04-099-94/+116
| | | | | | | | | | | | | | | | | | | | | | | | | | Switch cache modes to a bit-mask and use legacy cache names as shortcuts. Update documentation to include information on both shortcuts and bitmasks. This patch also fixes missing guards related to fscache. Update the documentation for new mount flags and cache modes. Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>
| * fs/9p: remove writeback fid and fix per-file modesEric Van Hensbergen2023-03-278-186/+135
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch removes the creating of an additional writeback_fid for opened files. The patch addresses problems when files were opened write-only or getattr on files with dirty caches. This patch also incorporates information about cache behavior in the fid for every file. This allows us to reflect cache behavior from mount flags, open mode, and information from the server to inform readahead and writeback behavior. This includes adding support for a 9p semantic that qid.version==0 is used to mark a file as non-cachable which is important for synthetic files. This may have a side-effect of not supporting caching on certain legacy file servers that do not properly set qid.version. There is also now a mount flag which can disable the qid.version behavior. Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>
| * fs/9p: Add new mount modesEric Van Hensbergen2023-03-272-3/+18
| | | | | | | | | | | | | | | | Add some additional mount modes for cache management including specifying directio as a mount option and an option for ignore qid.version for determining whether or not a file is cacheable. Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>
| * fs/9p: allow disable of xattr support on mountEric Van Hensbergen2023-03-273-3/+12
| | | | | | | | | | | | | | | | | | | | | | | | xattr creates a lot of additional messages for 9p in the current implementation. This allows users to conditionalize xattr support on 9p mount if they are on a connection with bad latency. Using this flag is also useful when debugging other aspects of 9p as it reduces the noise in the trace files. Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org> Reviewed-by: Dominique Martinet <asmadeus@codewreck.org>
| * fs/9p: Remove unnecessary superblock flagsEric Van Hensbergen2023-03-271-3/+1
| | | | | | | | | | | | | | | | | | | | | | These flags just add unnecessary extra operations. When 9p is run without cache, it inherently implements these options so we don't need them in the superblock (which ends up sending extraneous fsyncs, etc.). User can still request these options on mount, but we don't need to set them as default. Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>
| * fs/9p: Consolidate file operations and add readahead and writebackEric Van Hensbergen2023-03-277-162/+89
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We had 3 different sets of file operations across 2 different protocol variants differentiated by cache which really only changed 3 functions. But the real problem is that certain file modes, mount options, and other factors weren't being considered when we decided whether or not to use caches. This consolidates all the operations and switches to conditionals within a common set to decide whether or not to do different aspects of caching. Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org> Reviewed-by: Dominique Martinet <asmadeus@codewreck.org>
* | Merge tag 'v6.4/vfs.acl' of ↵Linus Torvalds2023-04-241-4/+0
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull acl updates from Christian Brauner: "After finishing the introduction of the new posix acl api last cycle the generic POSIX ACL xattr handlers are still around in the filesystems xattr handlers for two reasons: (1) Because a few filesystems rely on the ->list() method of the generic POSIX ACL xattr handlers in their ->listxattr() inode operation. (2) POSIX ACLs are only available if IOP_XATTR is raised. The IOP_XATTR flag is raised in inode_init_always() based on whether the sb->s_xattr pointer is non-NULL. IOW, the registered xattr handlers of the filesystem are used to raise IOP_XATTR. Removing the generic POSIX ACL xattr handlers from all filesystems would risk regressing filesystems that only implement POSIX ACL support and no other xattrs (nfs3 comes to mind). This contains the work to decouple POSIX ACLs from the IOP_XATTR flag as they don't depend on xattr handlers anymore. So it's now possible to remove the generic POSIX ACL xattr handlers from the sb->s_xattr list of all filesystems. This is a crucial step as the generic POSIX ACL xattr handlers aren't used for POSIX ACLs anymore and POSIX ACLs don't depend on the xattr infrastructure anymore. Adressing problem (1) will require more long-term work. It would be best to get rid of the ->list() method of xattr handlers completely at some point. For erofs, ext{2,4}, f2fs, jffs2, ocfs2, and reiserfs the nop POSIX ACL xattr handler is kept around so they can continue to use array-based xattr handler indexing. This update does simplify the ->listxattr() implementation of all these filesystems however" * tag 'v6.4/vfs.acl' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: acl: don't depend on IOP_XATTR ovl: check for ->listxattr() support reiserfs: rework priv inode handling fs: rename generic posix acl handlers reiserfs: rework ->listxattr() implementation fs: simplify ->listxattr() implementation fs: drop unused posix acl handlers xattr: remove unused argument xattr: add listxattr helper xattr: simplify listxattr helpers
| * | fs: drop unused posix acl handlersChristian Brauner2023-03-061-4/+0
| |/ | | | | | | | | | | | | | | | | | | | | Remove struct posix_acl_{access,default}_handler for all filesystems that don't depend on the xattr handler in their inode->i_op->listxattr() method in any way. There's nothing more to do than to simply remove the handler. It's been effectively unused ever since we introduced the new posix acl api. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
* / 9P FS: Fix wild-memory-access write in v9fs_get_aclIvan Orlov2023-03-271-3/+5
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | KASAN reported the following issue: [ 36.825817][ T5923] BUG: KASAN: wild-memory-access in v9fs_get_acl+0x1a4/0x390 [ 36.827479][ T5923] Write of size 4 at addr 9fffeb37f97f1c00 by task syz-executor798/5923 [ 36.829303][ T5923] [ 36.829846][ T5923] CPU: 0 PID: 5923 Comm: syz-executor798 Not tainted 6.2.0-syzkaller-18302-g596b6b709632 #0 [ 36.832110][ T5923] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/21/2023 [ 36.834464][ T5923] Call trace: [ 36.835196][ T5923] dump_backtrace+0x1c8/0x1f4 [ 36.836229][ T5923] show_stack+0x2c/0x3c [ 36.837100][ T5923] dump_stack_lvl+0xd0/0x124 [ 36.838103][ T5923] print_report+0xe4/0x4c0 [ 36.839068][ T5923] kasan_report+0xd4/0x130 [ 36.840052][ T5923] kasan_check_range+0x264/0x2a4 [ 36.841199][ T5923] __kasan_check_write+0x2c/0x3c [ 36.842216][ T5923] v9fs_get_acl+0x1a4/0x390 [ 36.843232][ T5923] v9fs_mount+0x77c/0xa5c [ 36.844163][ T5923] legacy_get_tree+0xd4/0x16c [ 36.845173][ T5923] vfs_get_tree+0x90/0x274 [ 36.846137][ T5923] do_new_mount+0x25c/0x8c8 [ 36.847066][ T5923] path_mount+0x590/0xe58 [ 36.848147][ T5923] __arm64_sys_mount+0x45c/0x594 [ 36.849273][ T5923] invoke_syscall+0x98/0x2c0 [ 36.850421][ T5923] el0_svc_common+0x138/0x258 [ 36.851397][ T5923] do_el0_svc+0x64/0x198 [ 36.852398][ T5923] el0_svc+0x58/0x168 [ 36.853224][ T5923] el0t_64_sync_handler+0x84/0xf0 [ 36.854293][ T5923] el0t_64_sync+0x190/0x194 Calling '__v9fs_get_acl' method in 'v9fs_get_acl' creates the following chain of function calls: __v9fs_get_acl v9fs_fid_get_acl v9fs_fid_xattr_get p9_client_xattrwalk Function p9_client_xattrwalk accepts a pointer to u64-typed variable attr_size and puts some u64 value into it. However, after the executing the p9_client_xattrwalk, in some circumstances we assign the value of u64-typed variable 'attr_size' to the variable 'retval', which we will return. However, the type of 'retval' is ssize_t, and if the value of attr_size is larger than SSIZE_MAX, we will face the signed type overflow. If the overflow occurs, the result of v9fs_fid_xattr_get may be negative, but not classified as an error. When we try to allocate an acl with 'broken' size we receive an error, but don't process it. When we try to free this acl, we face the 'wild-memory-access' error (because it wasn't allocated). This patch will add new condition to the 'v9fs_fid_xattr_get' function, so it will return an EOVERFLOW error if the 'attr_size' is larger than SSIZE_MAX. In this version of the patch I simplified the condition. In previous (v2) version of the patch I removed explicit type conversion and added separate condition to check the possible overflow and return an error (in v1 version I've just modified the existing condition). Tested via syzkaller. Suggested-by: Christian Schoenebeck <linux_oss@crudebyte.com> Reported-by: syzbot+cb1d16facb3cc90de5fb@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?id=fbbef66d9e4d096242f3617de5d14d12705b4659 Signed-off-by: Ivan Orlov <ivan.orlov0322@gmail.com> Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com> Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>
* Merge tag '9p-6.3-for-linus-part1' of ↵Linus Torvalds2023-03-016-14/+14
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs Pull 9p updates from Eric Van Hensbergen: - some fixes and cleanup setting up for a larger set of performance patches I've been working on - a contributed fixes relating to 9p/rdma - some contributed fixes relating to 9p/xen * tag '9p-6.3-for-linus-part1' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: fs/9p: fix error reporting in v9fs_dir_release net/9p: fix bug in client create for .L 9p/rdma: unmap receive dma buffer in rdma_request()/post_recv() 9p/xen: fix connection sequence 9p/xen: fix version parsing fs/9p: Expand setup of writeback cache to all levels net/9p: Adjust maximum MSIZE to account for p9 header
| * fs/9p: fix error reporting in v9fs_dir_releaseEric Van Hensbergen2023-02-241-3/+4
| | | | | | | | | | | | | | | | | | | | | | Checking the p9_fid_put value allows us to pass back errors involved if we end up clunking the fid as part of dir_release. This can help with more graceful response to errors in writeback among other things. Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org> Reviewed-by: Dominique Martinet <asmadeus@codewreck.org>
| * fs/9p: Expand setup of writeback cache to all levelsEric Van Hensbergen2023-02-235-11/+10
| | | | | | | | | | | | | | | | | | If cache is enabled, make sure we are putting the right things in place (mainly impacts mmap). This also sets us up for more cache levels. Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org> Reviewed-by: Dominique Martinet <asmadeus@codewreck.org>
* | Merge tag 'fs.idmapped.v6.3' of ↵Linus Torvalds2023-02-207-45/+45
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping Pull vfs idmapping updates from Christian Brauner: - Last cycle we introduced the dedicated struct mnt_idmap type for mount idmapping and the required infrastucture in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). As promised in last cycle's pull request message this converts everything to rely on struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevant on the mount level. Especially for non-vfs developers without detailed knowledge in this area this was a potential source for bugs. This finishes the conversion. Instead of passing the plain namespace around this updates all places that currently take a pointer to a mnt_userns with a pointer to struct mnt_idmap. Now that the conversion is done all helpers down to the really low-level helpers only accept a struct mnt_idmap argument instead of two namespace arguments. Conflating mount and other idmappings will now cause the compiler to complain loudly thus eliminating the possibility of any bugs. This makes it impossible for filesystem developers to mix up mount and filesystem idmappings as they are two distinct types and require distinct helpers that cannot be used interchangeably. Everything associated with struct mnt_idmap is moved into a single separate file. With that change no code can poke around in struct mnt_idmap. It can only be interacted with through dedicated helpers. That means all filesystems are and all of the vfs is completely oblivious to the actual implementation of idmappings. We are now also able to extend struct mnt_idmap as we see fit. For example, we can decouple it completely from namespaces for users that don't require or don't want to use them at all. We can also extend the concept of idmappings so we can cover filesystem specific requirements. In combination with the vfs{g,u}id_t work we finished in v6.2 this makes this feature substantially more robust and thus difficult to implement wrong by a given filesystem and also protects the vfs. - Enable idmapped mounts for tmpfs and fulfill a longstanding request. A long-standing request from users had been to make it possible to create idmapped mounts for tmpfs. For example, to share the host's tmpfs mount between multiple sandboxes. This is a prerequisite for some advanced Kubernetes cases. Systemd also has a range of use-cases to increase service isolation. And there are more users of this. However, with all of the other work going on this was way down on the priority list but luckily someone other than ourselves picked this up. As usual the patch is tiny as all the infrastructure work had been done multiple kernel releases ago. In addition to all the tests that we already have I requested that Rodrigo add a dedicated tmpfs testsuite for idmapped mounts to xfstests. It is to be included into xfstests during the v6.3 development cycle. This should add a slew of additional tests. * tag 'fs.idmapped.v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping: (26 commits) shmem: support idmapped mounts for tmpfs fs: move mnt_idmap fs: port vfs{g,u}id helpers to mnt_idmap fs: port fs{g,u}id helpers to mnt_idmap fs: port i_{g,u}id_into_vfs{g,u}id() to mnt_idmap fs: port i_{g,u}id_{needs_}update() to mnt_idmap quota: port to mnt_idmap fs: port privilege checking helpers to mnt_idmap fs: port inode_owner_or_capable() to mnt_idmap fs: port inode_init_owner() to mnt_idmap fs: port acl to mnt_idmap fs: port xattr to mnt_idmap fs: port ->permission() to pass mnt_idmap fs: port ->fileattr_set() to pass mnt_idmap fs: port ->set_acl() to pass mnt_idmap fs: port ->get_acl() to pass mnt_idmap fs: port ->tmpfile() to pass mnt_idmap fs: port ->rename() to pass mnt_idmap fs: port ->mknod() to pass mnt_idmap fs: port ->mkdir() to pass mnt_idmap ...
| * | fs: port inode_owner_or_capable() to mnt_idmapChristian Brauner2023-01-191-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
| * | fs: port inode_init_owner() to mnt_idmapChristian Brauner2023-01-191-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
| * | fs: port acl to mnt_idmapChristian Brauner2023-01-191-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
| * | fs: port xattr to mnt_idmapChristian Brauner2023-01-191-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
| * | fs: port ->set_acl() to pass mnt_idmapChristian Brauner2023-01-192-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
| * | fs: port ->get_acl() to pass mnt_idmapChristian Brauner2023-01-192-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
| * | fs: port ->rename() to pass mnt_idmapChristian Brauner2023-01-192-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
| * | fs: port ->mknod() to pass mnt_idmapChristian Brauner2023-01-192-7/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
| * | fs: port ->mkdir() to pass mnt_idmapChristian Brauner2023-01-192-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
| * | fs: port ->symlink() to pass mnt_idmapChristian Brauner2023-01-192-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
| * | fs: port ->create() to pass mnt_idmapChristian Brauner2023-01-192-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
| * | fs: port ->getattr() to pass mnt_idmapChristian Brauner2023-01-192-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
| * | fs: port ->setattr() to pass mnt_idmapChristian Brauner2023-01-194-10/+10
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
* / filelock: move file locking definitions to separate header fileJeff Layton2023-01-111-0/+1
|/ | | | | | | | | | | | | | | | | | | | | | | The file locking definitions have lived in fs.h since the dawn of time, but they are only used by a small subset of the source files that include it. Move the file locking definitions to a new header file, and add the appropriate #include directives to the source files that need them. By doing this we trim down fs.h a bit and limit the amount of rebuilding that has to be done when we make changes to the file locking APIs. Reviewed-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Howells <dhowells@redhat.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Acked-by: Chuck Lever <chuck.lever@oracle.com> Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com> Acked-by: Steve French <stfrench@microsoft.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Jeff Layton <jlayton@kernel.org>
* Merge tag '9p-for-6.2-rc1' of https://github.com/martinetd/linuxLinus Torvalds2022-12-239-9/+0
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull 9p updates from Dominique Martinet: - improve p9_check_errors to check buffer size instead of msize when possible (e.g. not zero-copy) - some more syzbot and KCSAN fixes - minor headers include cleanup * tag '9p-for-6.2-rc1' of https://github.com/martinetd/linux: 9p/client: fix data race on req->status net/9p: fix response size check in p9_check_errors() net/9p: distinguish zero-copy requests 9p/xen: do not memcpy header into req->rc 9p: set req refcount to zero to avoid uninitialized usage 9p/net: Remove unneeded idr.h #include 9p/fs: Remove unneeded idr.h #include
| * 9p/fs: Remove unneeded idr.h #includeChristophe JAILLET2022-12-029-9/+0
| | | | | | | | | | | | | | | | | | | | | | | | The 9p fs does not use IDR or IDA functionalities. So there is no point in including <linux/idr.h>. Remove it. Link: https://lkml.kernel.org/r/3d1e0ed9714eaee7e18d9f5b0b4bfa49b00b286d.1669553950.git.christophe.jaillet@wanadoo.fr Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com> [Dominique: reword subject] Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
* | Merge tag 'fs.acl.rework.v6.2' of ↵Linus Torvalds2022-12-135-146/+170
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping Pull VFS acl updates from Christian Brauner: "This contains the work that builds a dedicated vfs posix acl api. The origins of this work trace back to v5.19 but it took quite a while to understand the various filesystem specific implementations in sufficient detail and also come up with an acceptable solution. As we discussed and seen multiple times the current state of how posix acls are handled isn't nice and comes with a lot of problems: The current way of handling posix acls via the generic xattr api is error prone, hard to maintain, and type unsafe for the vfs until we call into the filesystem's dedicated get and set inode operations. It is already the case that posix acls are special-cased to death all the way through the vfs. There are an uncounted number of hacks that operate on the uapi posix acl struct instead of the dedicated vfs struct posix_acl. And the vfs must be involved in order to interpret and fixup posix acls before storing them to the backing store, caching them, reporting them to userspace, or for permission checking. Currently a range of hacks and duct tape exist to make this work. As with most things this is really no ones fault it's just something that happened over time. But the code is hard to understand and difficult to maintain and one is constantly at risk of introducing bugs and regressions when having to touch it. Instead of continuing to hack posix acls through the xattr handlers this series builds a dedicated posix acl api solely around the get and set inode operations. Going forward, the vfs_get_acl(), vfs_remove_acl(), and vfs_set_acl() helpers must be used in order to interact with posix acls. They operate directly on the vfs internal struct posix_acl instead of abusing the uapi posix acl struct as we currently do. In the end this removes all of the hackiness, makes the codepaths easier to maintain, and gets us type safety. This series passes the LTP and xfstests suites without any regressions. For xfstests the following combinations were tested: - xfs - ext4 - btrfs - overlayfs - overlayfs on top of idmapped mounts - orangefs - (limited) cifs There's more simplifications for posix acls that we can make in the future if the basic api has made it. A few implementation details: - The series makes sure to retain exactly the same security and integrity module permission checks. Especially for the integrity modules this api is a win because right now they convert the uapi posix acl struct passed to them via a void pointer into the vfs struct posix_acl format to perform permission checking on the mode. There's a new dedicated security hook for setting posix acls which passes the vfs struct posix_acl not a void pointer. Basing checking on the posix acl stored in the uapi format is really unreliable. The vfs currently hacks around directly in the uapi struct storing values that frankly the security and integrity modules can't correctly interpret as evidenced by bugs we reported and fixed in this area. It's not necessarily even their fault it's just that the format we provide to them is sub optimal. - Some filesystems like 9p and cifs need access to the dentry in order to get and set posix acls which is why they either only partially or not even at all implement get and set inode operations. For example, cifs allows setxattr() and getxattr() operations but doesn't allow permission checking based on posix acls because it can't implement a get acl inode operation. Thus, this patch series updates the set acl inode operation to take a dentry instead of an inode argument. However, for the get acl inode operation we can't do this as the old get acl method is called in e.g., generic_permission() and inode_permission(). These helpers in turn are called in various filesystem's permission inode operation. So passing a dentry argument to the old get acl inode operation would amount to passing a dentry to the permission inode operation which we shouldn't and probably can't do. So instead of extending the existing inode operation Christoph suggested to add a new one. He also requested to ensure that the get and set acl inode operation taking a dentry are consistently named. So for this version the old get acl operation is renamed to ->get_inode_acl() and a new ->get_acl() inode operation taking a dentry is added. With this we can give both 9p and cifs get and set acl inode operations and in turn remove their complex custom posix xattr handlers. In the future I hope to get rid of the inode method duplication but it isn't like we have never had this situation. Readdir is just one example. And frankly, the overall gain in type safety and the more pleasant api wise are simply too big of a benefit to not accept this duplication for a while. - We've done a full audit of every codepaths using variant of the current generic xattr api to get and set posix acls and surprisingly it isn't that many places. There's of course always a chance that we might have missed some and if so I'm sure we'll find them soon enough. The crucial codepaths to be converted are obviously stacking filesystems such as ecryptfs and overlayfs. For a list of all callers currently using generic xattr api helpers see [2] including comments whether they support posix acls or not. - The old vfs generic posix acl infrastructure doesn't obey the create and replace semantics promised on the setxattr(2) manpage. This patch series doesn't address this. It really is something we should revisit later though. The patches are roughly organized as follows: (1) Change existing set acl inode operation to take a dentry argument (Intended to be a non-functional change) (2) Rename existing get acl method (Intended to be a non-functional change) (3) Implement get and set acl inode operations for filesystems that couldn't implement one before because of the missing dentry. That's mostly 9p and cifs (Intended to be a non-functional change) (4) Build posix acl api, i.e., add vfs_get_acl(), vfs_remove_acl(), and vfs_set_acl() including security and integrity hooks (Intended to be a non-functional change) (5) Implement get and set acl inode operations for stacking filesystems (Intended to be a non-functional change) (6) Switch posix acl handling in stacking filesystems to new posix acl api now that all filesystems it can stack upon support it. (7) Switch vfs to new posix acl api (semantical change) (8) Remove all now unused helpers (9) Additional regression fixes reported after we merged this into linux-next Thanks to Seth for a lot of good discussion around this and encouragement and input from Christoph" * tag 'fs.acl.rework.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping: (36 commits) posix_acl: Fix the type of sentinel in get_acl orangefs: fix mode handling ovl: call posix_acl_release() after error checking evm: remove dead code in evm_inode_set_acl() cifs: check whether acl is valid early acl: make vfs_posix_acl_to_xattr() static acl: remove a slew of now unused helpers 9p: use stub posix acl handlers cifs: use stub posix acl handlers ovl: use stub posix acl handlers ecryptfs: use stub posix acl handlers evm: remove evm_xattr_acl_change() xattr: use posix acl api ovl: use posix acl api ovl: implement set acl method ovl: implement get acl method ecryptfs: implement set acl method ecryptfs: implement get acl method ksmbd: use vfs_remove_acl() acl: add vfs_remove_acl() ...
| * | 9p: use stub posix acl handlersChristian Brauner2022-10-203-126/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | Now that 9p supports the get and set acl inode operations and the vfs has been switched to the new posi api, 9p can simply rely on the stub posix acl handlers. The custom xattr handlers and associated unused helpers can be removed. Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
| * | 9p: implement set acl methodChristian Brauner2022-10-203-0/+99
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current way of setting and getting posix acls through the generic xattr interface is error prone and type unsafe. The vfs needs to interpret and fixup posix acls before storing or reporting it to userspace. Various hacks exist to make this work. The code is hard to understand and difficult to maintain in it's current form. Instead of making this work by hacking posix acls through xattr handlers we are building a dedicated posix acl api around the get and set inode operations. This removes a lot of hackiness and makes the codepaths easier to maintain. A lot of background can be found in [1]. In order to build a type safe posix api around get and set acl we need all filesystem to implement get and set acl. So far 9p implemented a ->get_inode_acl() operation that didn't require access to the dentry in order to allow (limited) permission checking via posix acls in the vfs. Now that we have get and set acl inode operations that take a dentry argument we can give 9p get and set acl inode operations. This is mostly a light refactoring of the codepaths currently used in 9p posix acl xattr handler. After we have fully implemented the posix acl api and switched the vfs over to it, the 9p specific posix acl xattr handler and associated code will be removed. Note, until the vfs has been switched to the new posix acl api this patch is a non-functional change. Link: https://lore.kernel.org/all/20220801145520.1532837-1-brauner@kernel.org [1] Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
| * | 9p: implement get acl methodChristian Brauner2022-10-203-22/+69
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current way of setting and getting posix acls through the generic xattr interface is error prone and type unsafe. The vfs needs to interpret and fixup posix acls before storing or reporting it to userspace. Various hacks exist to make this work. The code is hard to understand and difficult to maintain in it's current form. Instead of making this work by hacking posix acls through xattr handlers we are building a dedicated posix acl api around the get and set inode operations. This removes a lot of hackiness and makes the codepaths easier to maintain. A lot of background can be found in [1]. In order to build a type safe posix api around get and set acl we need all filesystem to implement get and set acl. So far 9p implemented a ->get_inode_acl() operation that didn't require access to the dentry in order to allow (limited) permission checking via posix acls in the vfs. Now that we have get and set acl inode operations that take a dentry argument we can give 9p get and set acl inode operations. This is mostly a refactoring of the codepaths currently used in 9p posix acl xattr handler. After we have fully implemented the posix acl api and switched the vfs over to it, the 9p specific posix acl xattr handler and associated code will be removed. Note, until the vfs has been switched to the new posix acl api this patch is a non-functional change. Link: https://lore.kernel.org/all/20220801145520.1532837-1-brauner@kernel.org [1] Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
| * | fs: rename current get acl methodChristian Brauner2022-10-201-2/+2
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current way of setting and getting posix acls through the generic xattr interface is error prone and type unsafe. The vfs needs to interpret and fixup posix acls before storing or reporting it to userspace. Various hacks exist to make this work. The code is hard to understand and difficult to maintain in it's current form. Instead of making this work by hacking posix acls through xattr handlers we are building a dedicated posix acl api around the get and set inode operations. This removes a lot of hackiness and makes the codepaths easier to maintain. A lot of background can be found in [1]. The current inode operation for getting posix acls takes an inode argument but various filesystems (e.g., 9p, cifs, overlayfs) need access to the dentry. In contrast to the ->set_acl() inode operation we cannot simply extend ->get_acl() to take a dentry argument. The ->get_acl() inode operation is called from: acl_permission_check() -> check_acl() -> get_acl() which is part of generic_permission() which in turn is part of inode_permission(). Both generic_permission() and inode_permission() are called in the ->permission() handler of various filesystems (e.g., overlayfs). So simply passing a dentry argument to ->get_acl() would amount to also having to pass a dentry argument to ->permission(). We should avoid this unnecessary change. So instead of extending the existing inode operation rename it from ->get_acl() to ->get_inode_acl() and add a ->get_acl() method later that passes a dentry argument and which filesystems that need access to the dentry can implement instead of ->get_inode_acl(). Filesystems like cifs which allow setting and getting posix acls but not using them for permission checking during lookup can simply not implement ->get_inode_acl(). This is intended to be a non-functional change. Link: https://lore.kernel.org/all/20220801145520.1532837-1-brauner@kernel.org [1] Suggested-by/Inspired-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
* / use less confusing names for iov_iter direction initializersAl Viro2022-11-253-5/+5
|/ | | | | | | | | | | | | READ/WRITE proved to be actively confusing - the meanings are "data destination, as used with read(2)" and "data source, as used with write(2)", but people keep interpreting those as "we read data from it" and "we write data to it", i.e. exactly the wrong way. Call them ITER_DEST and ITER_SOURCE - at least that is harder to misinterpret... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* 9p: Fix some kernel-doc commentsYang Li2022-07-021-2/+2
| | | | | | | | | | | | | | | Remove warnings found by running scripts/kernel-doc, which is caused by using 'make W=1'. fs/9p/fid.c:35: warning: Function parameter or member 'pfid' not described in 'v9fs_fid_add' fs/9p/fid.c:35: warning: Excess function parameter 'fid' description in 'v9fs_fid_add' fs/9p/fid.c:80: warning: Function parameter or member 'pfid' not described in 'v9fs_open_fid_add' fs/9p/fid.c:80: warning: Excess function parameter 'fid' description in 'v9fs_open_fid_add' Link: https://lkml.kernel.org/r/20220615012039.43479-1-yang.lee@linux.alibaba.com Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> [Dominique: further adjust comment] Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
* 9p fid refcount: cleanup p9_fid_put callsDominique Martinet2022-07-026-90/+64
| | | | | | | | | | | | | | Simplify p9_fid_put cleanup path in many 9p functions since the function is noop on null or error fids. Also make the *_add_fid() helpers "steal" the fid by nulling its pointer, so put after them will be noop. This should lead to no change of behaviour Link: https://lkml.kernel.org/r/20220612085330.1451496-7-asmadeus@codewreck.org Reviewed-by: Tyler Hicks <tyhicks@linux.microsoft.com> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>