summaryrefslogtreecommitdiffstats
path: root/fs/nfsd/vfs.c (follow)
Commit message (Collapse)AuthorAgeFilesLines
* nfsd: fix double-locks of directory mutexJ. Bruce Fields2009-01-071-3/+31
| | | | | | | | | | | | | | | A number of nfsd operations depend on the i_mutex to cover more code than just the fsync, so the approach of 4c728ef583b3d8 "add a vfs_fsync helper" doesn't work for nfsd. Revert the parts of those patches that touch nfsd. Note: we can't, however, remove the logic from vfs_fsync that was needed only for the special case of nfsd, because a vfs_fsync(NULL,...) call can still result indirectly from a stackable filesystem that was called by nfsd. (Thanks to Christoph Hellwig for pointing this out.) Reported-by: Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* add a vfs_fsync helperChristoph Hellwig2009-01-051-32/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Fsync currently has a fdatawrite/fdatawait pair around the method call, and a mutex_lock/unlock of the inode mutex. All callers of fsync have to duplicate this, but we have a few and most of them don't quite get it right. This patch adds a new vfs_fsync that takes care of this. It's a little more complicated as usual as ->fsync might get a NULL file pointer and just a dentry from nfsd, but otherwise gets afile and we want to take the mapping and file operations from it when it is there. Notes on the fsync callers: - ecryptfs wasn't calling filemap_fdatawrite / filemap_fdatawait on the lower file - coda wasn't calling filemap_fdatawrite / filemap_fdatawait on the host file, and returning 0 when ->fsync was missing - shm wasn't calling either filemap_fdatawrite / filemap_fdatawait nor taking i_mutex. Now given that shared memory doesn't have disk backing not doing anything in fsync seems fine and I left it out of the vfs_fsync conversion for now, but in that case we might just not pass it through to the lower file at all but just call the no-op simple_sync_file directly. [and now actually export vfs_fsync] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* inode->i_op is never NULLAl Viro2009-01-051-4/+4
| | | | | | | | | | We used to have rather schizophrenic set of checks for NULL ->i_op even though it had been eliminated years ago. You'd need to go out of your way to set it to NULL explicitly _and_ a bunch of code would die on such inodes anyway. After killing two remaining places that still did that bogosity, all that crap can go away. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* Merge branch 'master' into nextJames Morris2008-11-141-4/+1
|\ | | | | | | | | | | | | | | | | | | | | Conflicts: security/keys/internal.h security/keys/process_keys.c security/keys/request_key.c Fixed conflicts above by using the non 'tsk' versions. Signed-off-by: James Morris <jmorris@namei.org>
| * Fix nfsd truncation of readdir resultsDoug Nazar2008-11-091-4/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 8d7c4203 "nfsd: fix failure to set eof in readdir in some situations" introduced a bug: on a directory in an exported ext3 filesystem with dir_index unset, a READDIR will only return about 250 entries, even if the directory was larger. Bisected it back to this commit; reverting it fixes the problem. It turns out that in this case ext3 reads a block at a time, then returns from readdir, which means we can end up with buf.full==0 but with more entries in the directory still to be read. Before 8d7c4203 (but after c002a6c797 "Optimise NFS readdir hack slightly"), this would cause us to return the READDIR result immediately, but with the eof bit unset. That could cause a performance regression (because the client would need more roundtrips to the server to read the whole directory), but no loss in correctness, since the cleared eof bit caused the client to send another readdir. After 8d7c4203, the setting of the eof bit made this a correctness problem. So, move nfserr_eof into the loop and remove the buf.full check so that we loop until buf.used==0. The following seems to do the right thing and reduces the network traffic since we don't return a READDIR result until the buffer is full. Tested on an empty directory & large directory; eof is properly sent and there are no more short buffers. Signed-off-by: Doug Nazar <nazard@dragoninc.ca> Cc: David Woodhouse <David.Woodhouse@intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* | CRED: Pass credentials through dentry_open()David Howells2008-11-141-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | Pass credentials through dentry_open() so that the COW creds patch can have SELinux's flush_unauthorized_files() pass the appropriate creds back to itself when it opens its null chardev. The security_dentry_open() call also now takes a creds pointer, as does the dentry_open hook in struct security_operations. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: James Morris <jmorris@namei.org> Signed-off-by: James Morris <jmorris@namei.org>
* | CRED: Wrap task credential accesses in the NFS daemonDavid Howells2008-11-141-3/+3
|/ | | | | | | | | | | | | | | | | | | Wrap access to task credentials so that they can be separated more easily from the task_struct during the introduction of COW creds. Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id(). Change some task->e?[ug]id to task_e?[ug]id(). In some places it makes more sense to use RCU directly rather than a convenient wrapper; these will be addressed by later patches. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Cc: J. Bruce Fields <bfields@fieldses.org> Cc: Neil Brown <neilb@suse.de> Cc: linux-nfs@vger.kernel.org Signed-off-by: James Morris <jmorris@namei.org>
* nfsd: fix failure to set eof in readdir in some situationsJ. Bruce Fields2008-10-301-0/+1
| | | | | | | | | | | | | | | | | Before 14f7dd632011bb89c035722edd6ea0d90ca6b078 "[PATCH] Copy XFS readdir hack into nfsd code", readdir_cd->err was reset to eof before each call to vfs_readdir; afterwards, it is set only once. Similarly, c002a6c7977320f95b5edede5ce4e0eeecf291ff "[PATCH] Optimise NFS readdir hack slightly", can cause us to exit without nfserr_eof set. Fix this. This ensures the "eof" bit is set when needed in readdir replies. (The particular case I saw was an nfsv4 readdir of an empty directory, which returned with no entries (the protocol requires "." and ".." to be filtered out), but with eof unset.) Cc: David Woodhouse <David.Woodhouse@intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* Merge branch 'for-linus' of ↵Linus Torvalds2008-10-231-16/+110
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (46 commits) [PATCH] fs: add a sanity check in d_free [PATCH] i_version: remount support [patch] vfs: make security_inode_setattr() calling consistent [patch 1/3] FS_MBCACHE: don't needlessly make it built-in [PATCH] move executable checking into ->permission() [PATCH] fs/dcache.c: update comment of d_validate() [RFC PATCH] touch_mnt_namespace when the mount flags change [PATCH] reiserfs: add missing llseek method [PATCH] fix ->llseek for more directories [PATCH vfs-2.6 6/6] vfs: add LOOKUP_RENAME_TARGET intent [PATCH vfs-2.6 5/6] vfs: remove LOOKUP_PARENT from non LOOKUP_PARENT lookup [PATCH vfs-2.6 4/6] vfs: remove unnecessary fsnotify_d_instantiate() [PATCH vfs-2.6 3/6] vfs: add __d_instantiate() helper [PATCH vfs-2.6 2/6] vfs: add d_ancestor() [PATCH vfs-2.6 1/6] vfs: replace parent == dentry->d_parent by IS_ROOT() [PATCH] get rid of on-stack dentry in udf [PATCH 2/2] anondev: switch to IDA [PATCH 1/2] anondev: init IDR statically [JFFS2] Use d_splice_alias() not d_add() in jffs2_lookup() [PATCH] Optimise NFS readdir hack slightly. ...
| * [PATCH] Optimise NFS readdir hack slightly.David Woodhouse2008-10-231-2/+3
| | | | | | | | | | | | | | | | | | | | | | Avoid calling the underlying ->readdir() again when we reached the end already; keep going round the loop only if we stopped due to our own buffer being full. [AV: tidy the things up a bit, while we are there] Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * [PATCH] prepare vfs_readdir() callers to returning filldir resultAl Viro2008-10-231-2/+9
| | | | | | | | | | | | | | It's not the final state, but it allows moving ->readdir() instances to passing filldir return value to caller of vfs_readdir(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * [PATCH] Copy XFS readdir hack into nfsd code.David Woodhouse2008-10-231-15/+93
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some file systems with their own internal locking have problems with the way that nfsd calls the ->lookup() method from within a filldir function called from their ->readdir() method. The recursion back into the file system code can cause deadlock. XFS has a fairly hackish solution to this which involves doing the readdir() into a locally-allocated buffer, then going back through it calling the filldir function afterwards. It's not ideal, but it works. It's particularly suboptimal because XFS does this for local file systems too, where it's completely unnecessary. Copy this hack into the NFS code where it can be used only for NFS export. In response to feedback, use it unconditionally rather than only for the affected file systems. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * [PATCH] Factor out nfsd_do_readdir() into its own functionDavid Woodhouse2008-10-231-16/+24
| | | | | | | | | | Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | nfsd: Fix memory leak in nfsd_getxattrKrishna Kumar2008-10-221-1/+5
|/ | | | | | | | Fix a memory leak in nfsd_getxattr. nfsd_getxattr should free up memory that it allocated if vfs_getxattr fails. Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* knfsd: allocate readahead cache in individual chunksJeff Layton2008-09-291-24/+35
| | | | | | | | | | | | | | | | | | | I had a report from someone building a large NFS server that they were unable to start more than 585 nfsd threads. It was reported against an older kernel using the slab allocator, and I tracked it down to the large allocation in nfsd_racache_init failing. It appears that the slub allocator handles large allocations better, but large contiguous allocations can often be problematic. There doesn't seem to be any reason that the racache has to be allocated as a single large chunk. This patch breaks this up so that the racache is built up from separate allocations. (Thanks also to Takashi Iwai for a bugfix.) Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Cc: Takashi Iwai <tiwai@suse.de>
* nfsd: permit unauthenticated stat of export rootJ. Bruce Fields2008-09-291-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | RFC 2623 section 2.3.2 permits the server to bypass gss authentication checks for certain operations that a client may perform when mounting. In the case of a client that doesn't have some form of credentials available to it on boot, this allows it to perform the mount unattended. (Presumably real file access won't be needed until a user with credentials logs in.) Being slightly more lenient allows lots of old clients to access krb5-only exports, with the only loss being a small amount of information leaked about the root directory of the export. This affects only v2 and v3; v4 still requires authentication for all access. Thanks to Peter Staubach testing against a Solaris client, which suggesting addition of v3 getattr, to the list, and to Trond for noting that doing so exposes no additional information. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Cc: Peter Staubach <staubach@redhat.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
* [PATCH] kill nameidata passing to permission(), rename to inode_permission()Al Viro2008-07-271-2/+2
| | | | | | | Incidentally, the name that gives hundreds of false positives on grep is not a good idea... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* [patch 5/5] vfs: remove mode parameter from vfs_symlink()Miklos Szeredi2008-07-271-8/+2
| | | | | | | | | Remove the unused mode parameter from vfs_symlink and callers. Thanks to Tetsuo Handa for noticing. CC: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
* nfsd: clean up mnt_want_write callsMiklos Szeredi2008-07-011-14/+11
| | | | | | | | | Multiple mnt_want_write() calls in the switch statement looks really ugly. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Acked-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* nfsd: rename MAY_ flagsMiklos Szeredi2008-06-231-56/+59
| | | | | | | | | | | Rename nfsd_permission() specific MAY_* flags to NFSD_MAY_* to make it clear, that these are not used outside nfsd, and to avoid name and number space conflicts with the VFS. [comment from hch: rename MAY_READ, MAY_WRITE and MAY_EXEC as well] Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* knfsd: clear both setuid and setgid whenever a chown is doneJeff Layton2008-04-231-13/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, knfsd only clears the setuid bit if the owner of a file is changed on a SETATTR call, and only clears the setgid bit if the group is changed. POSIX says this in the spec for chown(): "If the specified file is a regular file, one or more of the S_IXUSR, S_IXGRP, or S_IXOTH bits of the file mode are set, and the process does not have appropriate privileges, the set-user-ID (S_ISUID) and set-group-ID (S_ISGID) bits of the file mode shall be cleared upon successful return from chown()." If I'm reading this correctly, then knfsd is doing this wrong. It should be clearing both the setuid and setgid bit on any SETATTR that changes the uid or gid. This wasn't really as noticable before, but now that the ATTR_KILL_S*ID bits are a no-op for the NFS client, it's more evident. This patch corrects the nfsd_setattr logic so that this occurs. It also does a bit of cleanup to the function. There is also one small behavioral change. If a SETATTR call comes in that changes the uid/gid and the mode, then we now only clear the setgid bit if the group execute bit isn't set. The setgid bit without a group execute bit signifies mandatory locking and we likely don't want to clear the bit in that case. Since there is no call in POSIX that should generate a SETATTR call like this, then this should rarely happen, but it's worth noting. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* knfsd: get rid of imode variable in nfsd_setattrJeff Layton2008-04-231-3/+1
| | | | | | | ...it's not really needed. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* nfsd: fix sparse warning in vfs.cHarvey Harrison2008-04-231-1/+1
| | | | | | | fs/nfsd/vfs.c:991:27: warning: Using plain integer as NULL pointer Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* make nfsd_create_setattr() staticAdrian Bunk2008-04-231-1/+1
| | | | | | | This patch makes the needlessly global nfsd_create_setattr() static. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* [PATCH] r/o bind mounts: check mnt instead of superblock directlyDave Hansen2008-04-191-2/+3
| | | | | | | | | | | | | | | | | | | If we depend on the inodes for writeability, we will not catch the r/o mounts when implemented. This patches uses __mnt_want_write(). It does not guarantee that the mount will stay writeable after the check. But, this is OK for one of the checks because it is just for a printk(). The other two are probably unnecessary and duplicate existing checks in the VFS. This won't make them better checks than before, but it will make them detect r/o mounts. Acked-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* [PATCH] r/o bind mounts: elevate write count for xattr_permission() callersDave Hansen2008-04-191-0/+4
| | | | | | | | | | | | | | This basically audits the callers of xattr_permission(), which calls permission() and can perform writes to the filesystem. [AV: add missing parts - removexattr() and nfsd posix acls, plug for a leak spotted by Miklos] Acked-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* [PATCH] r/o bind mounts: get write access for vfs_rename() callersDave Hansen2008-04-191-4/+13
| | | | | | | | | | | This also uses the little helper in the NFS code to make an if() a little bit less ugly. We introduced the helper at the beginning of the series. Acked-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* [PATCH] r/o bind mounts: write counts for link/symlinkDave Hansen2008-04-191-1/+13
| | | | | | | | | [AV: add missing nfsd pieces] Acked-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* [PATCH] r/o bind mounts: get callers of vfs_mknod/create/mkdir()Dave Hansen2008-04-191-2/+22
| | | | | | | | | | | | | | | | | | | | | | This takes care of all of the direct callers of vfs_mknod(). Since a few of these cases also handle normal file creation as well, this also covers some calls to vfs_create(). So that we don't have to make three mnt_want/drop_write() calls inside of the switch statement, we move some of its logic outside of the switch and into a helper function suggested by Christoph. This also encapsulates a fix for mknod(S_IFREG) that Miklos found. [AV: merged mkdir handling, added missing nfsd pieces] Acked-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* [PATCH] r/o bind mounts: elevate write count for rmdir and unlink.Dave Hansen2008-04-191-1/+7
| | | | | | | | | | | | | Elevate the write count during the vfs_rmdir() and vfs_unlink(). [AV: merged rmdir and unlink parts, added missing pieces in nfsd] Acked-by: Serge Hallyn <serue@us.ibm.com> Acked-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* Use struct path in struct svc_exportJan Blunck2008-02-151-6/+7
| | | | | | | | | | | | | | | | | I'm embedding struct path into struct svc_export. [akpm@linux-foundation.org: coding-style fixes] [ezk@cs.sunysb.edu: NFSD: fix wrong mnt_writer count in rename] Signed-off-by: Jan Blunck <jblunck@suse.de> Acked-by: J. Bruce Fields <bfields@citi.umich.edu> Acked-by: Christoph Hellwig <hch@infradead.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Neil Brown <neilb@suse.de> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* nfsd: allow root to set uid and gid on createJ. Bruce Fields2008-02-011-19/+28
| | | | | | | | | | | | | | | | | | | | The server silently ignores attempts to set the uid and gid on create. Based on the comment, this appears to have been done to prevent some overly-clever IRIX client from causing itself problems. Perhaps we should remove that hack completely. For now, at least, it makes sense to allow root (when no_root_squash is set) to set uid and gid. While we're there, since nfsd_create and nfsd_create_v3 share the same logic, pull that out into a separate function. And spell out the individual modifications of ia_valid instead of doing them both at once inside a conditional. Thanks to Roger Willcocks <roger@filmlight.ltd.uk> for the bug report and original patch on which this is based. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* NFSD: Adjust filename length argument of nfsd_lookupChuck Lever2008-02-011-2/+2
| | | | | | | | | | Clean up: adjust the sign of the length argument of nfsd_lookup and nfsd_lookup_dentry, for consistency with recent changes. NFSD version 4 callers already pass an unsigned file name length. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Acked-By: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* Use helpers to obtain task pid in printksPavel Emelyanov2007-10-191-3/+3
| | | | | | | | | | | | | | | | The task_struct->pid member is going to be deprecated, so start using the helpers (task_pid_nr/task_pid_vnr/task_pid_nr_ns) in the kernel. The first thing to start with is the pid, printed to dmesg - in this case we may safely use task_pid_nr(). Besides, printks produce more (much more) than a half of all the explicit pid usage. [akpm@linux-foundation.org: git-drm went and changed lots of stuff] Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: Dave Airlie <airlied@linux.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* knfsd: only set ATTR_KILL_S*ID if ATTR_MODE isn't being explicitly setJeff Layton2007-10-181-6/+15
| | | | | | | | | | | | | | It's theoretically possible for a single SETATTR call to come in that sets the mode and the uid/gid. In that case, don't set the ATTR_KILL_S*ID bits since that would trip the BUG() in notify_change. Just fix up the mode to have the same effect. Signed-off-by: Jeff Layton <jlayton@redhat.com> Cc: Neil Brown <neilb@suse.de> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Implement file posix capabilitiesSerge E. Hallyn2007-10-171-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implement file posix capabilities. This allows programs to be given a subset of root's powers regardless of who runs them, without having to use setuid and giving the binary all of root's powers. This version works with Kaigai Kohei's userspace tools, found at http://www.kaigai.gr.jp/index.php. For more information on how to use this patch, Chris Friedhoff has posted a nice page at http://www.friedhoff.org/fscaps.html. Changelog: Nov 27: Incorporate fixes from Andrew Morton (security-introduce-file-caps-tweaks and security-introduce-file-caps-warning-fix) Fix Kconfig dependency. Fix change signaling behavior when file caps are not compiled in. Nov 13: Integrate comments from Alexey: Remove CONFIG_ ifdef from capability.h, and use %zd for printing a size_t. Nov 13: Fix endianness warnings by sparse as suggested by Alexey Dobriyan. Nov 09: Address warnings of unused variables at cap_bprm_set_security when file capabilities are disabled, and simultaneously clean up the code a little, by pulling the new code into a helper function. Nov 08: For pointers to required userspace tools and how to use them, see http://www.friedhoff.org/fscaps.html. Nov 07: Fix the calculation of the highest bit checked in check_cap_sanity(). Nov 07: Allow file caps to be enabled without CONFIG_SECURITY, since capabilities are the default. Hook cap_task_setscheduler when !CONFIG_SECURITY. Move capable(TASK_KILL) to end of cap_task_kill to reduce audit messages. Nov 05: Add secondary calls in selinux/hooks.c to task_setioprio and task_setscheduler so that selinux and capabilities with file cap support can be stacked. Sep 05: As Seth Arnold points out, uid checks are out of place for capability code. Sep 01: Define task_setscheduler, task_setioprio, cap_task_kill, and task_setnice to make sure a user cannot affect a process in which they called a program with some fscaps. One remaining question is the note under task_setscheduler: are we ok with CAP_SYS_NICE being sufficient to confine a process to a cpuset? It is a semantic change, as without fsccaps, attach_task doesn't allow CAP_SYS_NICE to override the uid equivalence check. But since it uses security_task_setscheduler, which elsewhere is used where CAP_SYS_NICE can be used to override the uid equivalence check, fixing it might be tough. task_setscheduler note: this also controls cpuset:attach_task. Are we ok with CAP_SYS_NICE being used to confine to a cpuset? task_setioprio task_setnice sys_setpriority uses this (through set_one_prio) for another process. Need same checks as setrlimit Aug 21: Updated secureexec implementation to reflect the fact that euid and uid might be the same and nonzero, but the process might still have elevated caps. Aug 15: Handle endianness of xattrs. Enforce capability version match between kernel and disk. Enforce that no bits beyond the known max capability are set, else return -EPERM. With this extra processing, it may be worth reconsidering doing all the work at bprm_set_security rather than d_instantiate. Aug 10: Always call getxattr at bprm_set_security, rather than caching it at d_instantiate. [morgan@kernel.org: file-caps clean up for linux/capability.h] [bunk@kernel.org: unexport cap_inode_killpriv] Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: James Morris <jmorris@namei.org> Cc: Chris Wright <chrisw@sous-sol.org> Cc: Andrew Morgan <morgan@kernel.org> Signed-off-by: Andrew Morgan <morgan@kernel.org> Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* r/o bind mounts: create cleanup helper svc_msnfs()Dave Hansen2007-10-171-4/+11
| | | | | | | | | | | | I'm going to be modifying nfsd_rename() shortly to support read-only bind mounts. This #ifdef is around the area I'm patching, and it starts to get really ugly if I just try to add my new code by itself. Using this little helper makes things a lot cleaner to use. Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Acked-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'locks' of git://linux-nfs.org/~bfields/linuxLinus Torvalds2007-10-161-7/+6
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'locks' of git://linux-nfs.org/~bfields/linux: nfsd: remove IS_ISMNDLCK macro Rework /proc/locks via seq_files and seq_list helpers fs/locks.c: use list_for_each_entry() instead of list_for_each() NFS: clean up explicit check for mandatory locks AFS: clean up explicit check for mandatory locks 9PFS: clean up explicit check for mandatory locks GFS2: clean up explicit check for mandatory locks Cleanup macros for distinguishing mandatory locks Documentation: move locks.txt in filesystems/ locks: add warning about mandatory locking races Documentation: move mandatory locking documentation to filesystems/ locks: Fix potential OOPS in generic_setlease() Use list_first_entry in locks_wake_up_blocks locks: fix flock_lock_file() comment Memory shortage can result in inconsistent flocks state locks: kill redundant local variable locks: reverse order of posix_locks_conflict() arguments
| * nfsd: remove IS_ISMNDLCK macroJ. Bruce Fields2007-10-101-7/+6
| | | | | | | | | | | | | | This macro is only used in one place; in this place it seems simpler to put open-code it and move the comment to where it's used. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
| * Cleanup macros for distinguishing mandatory locksPavel Emelyanov2007-10-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The combination of S_ISGID bit set and S_IXGRP bit unset is used to mark the inode as "mandatory lockable" and there's a macro for this check called MANDATORY_LOCK(inode). However, fs/locks.c and some filesystems still perform the explicit i_mode checking. Besides, Andrew pointed out, that this macro is buggy itself, as it dereferences the inode arg twice. Convert this macro into static inline function and switch its users to it, making the code shorter and more readable. The __mandatory_lock() helper is to be used in places where the IS_MANDLOCK() for superblock is already known to be true. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: David Howells <dhowells@redhat.com> Cc: Eric Van Hensbergen <ericvh@gmail.com> Cc: Ron Minnich <rminnich@sandia.gov> Cc: Latchesar Ionkov <lucho@ionkov.net> Cc: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* | nfsd: fix horrible indentation in nfsd_setattrChristoph Hellwig2007-10-101-17/+26
|/ | | | | | Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Acked-by: Neil Brown <neilb@suse.de>
* knfsd: Fixed problem with NFS exporting directories which are mounted on.Neil Brown2007-09-111-1/+2
| | | | | | | | | | | | | Recent changes in NFSd cause a directory which is mounted-on to not appear properly when the filesystem containing it is exported. *exp_get* now returns -ENOENT rather than NULL and when commit 5d3dbbeaf56d0365ac6b5c0a0da0bd31cc4781e1 removed the NULL checks, it didn't add a check for -ENOENT. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* knfsd: set the response bitmask for NFS4_CREATE_EXCLUSIVEJeff Layton2007-08-011-1/+4
| | | | | | | | | | | | | | | | | | RFC 3530 says: If the server uses an attribute to store the exclusive create verifier, it will signify which attribute by setting the appropriate bit in the attribute mask that is returned in the results. Linux uses the atime and mtime to store the verifier, but sends a zeroed out bitmask back to the client. This patch makes sure that we set the correct bits in the bitmask in this situation. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Cc: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* knfsd: clean up EX_RDONLYJ. Bruce Fields2007-07-191-10/+3
| | | | | | | | | | Share a little common code, reverse the arguments for consistency, drop the unnecessary "inline", and lowercase the name. Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu> Acked-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* knfsd: move EX_RDONLY out of headerJ. Bruce Fields2007-07-191-0/+12
| | | | | | | | | EX_RDONLY is only called in one place; just put it there. Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu> Acked-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* nfsd: remove unnecessary NULL checks from nfsd_cross_mntJ. Bruce Fields2007-07-191-2/+2
| | | | | | | | | | We can now assume that rqst_exp_get_by_name() does not return NULL; so clean up some unnecessary checks. Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu> Acked-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* nfsd: fix possible read-ahead cache and export table corruptionJ. Bruce Fields2007-07-191-1/+1
| | | | | | | | | | | | | | | | | | | | | | The value of nperbucket calculated here is too small--we should be rounding up instead of down--with the result that the index j in the following loop can overflow the raparm_hash array. At least in my case, the next thing in memory turns out to be export_table, so the symptoms I see are crashes caused by the appearance of four zeroed-out export entries in the first bucket of the hash table of exports (which were actually entries in the readahead cache, a pointer to which had been written to the export table in this initialization code). It looks like the bug was probably introduced with commit fce1456a19f5c08b688c29f00ef90fdfa074c79b ("knfsd: make the readahead params cache SMP-friendly"). Cc: <stable@kernel.org> Cc: Greg Banks <gnb@melbourne.sgi.com> Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu> Acked-by: NeilBrown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* knfsd: nfsd4: make readonly access depend on pseudoflavorJ. Bruce Fields2007-07-171-6/+7
| | | | | | | | | | | | Allow readonly access to vary depending on the pseudoflavor, using the flag passed with each pseudoflavor in the export downcall. The rest of the flags are ignored for now, though some day we might also allow id squashing to vary based on the flavor. Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* knfsd: nfsd4: return nfserr_wrongsecAndy Adamson2007-07-171-0/+4
| | | | | | | | | | | | Make the first actual use of the secinfo information by using it to return nfserr_wrongsec when an export is found that doesn't allow the flavor used on this request. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Andy Adamson <andros@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* knfsd: nfsd: factor nfsd_lookup into 2 piecesJ. Bruce Fields2007-07-171-19/+36
| | | | | | | | | | | | | Factor nfsd_lookup into nfsd_lookup_dentry, which finds the right dentry and export, and a second part which composes the filehandle (and which will later check the security flavor on the new export). No change in behavior. Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>