summaryrefslogtreecommitdiffstats
path: root/fs/namei.c (follow)
Commit message (Collapse)AuthorAgeFilesLines
* [PATCH] pass dentry to audit_inode()/audit_inode_child()Al Viro2007-10-211-5/+5
| | | | | | makes caller simpler *and* allows to scan ancestors Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* VFS: allow filesystems to implement atomic open+truncateMiklos Szeredi2007-10-181-2/+4
| | | | | | | | | | | | | | | | | Add a new attribute flag ATTR_OPEN, with the meaning: "truncation was initiated by open() due to the O_TRUNC flag". This way filesystems wanting to implement truncation within their ->open() method can ignore such truncate requests. This is a quick & dirty hack, but it comes for free. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andreas Dilger <adilger@clusterfs.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* r/o bind mounts: give permission() a local 'mnt' variableDave Hansen2007-10-171-1/+5
| | | | | | | | | | | First of all, this makes the structure jumping look a little bit cleaner. So, this stands alone as a tiny cleanup. But, we also need 'mnt' by itself a few more times later in this series, so this isn't _just_ a cleanup. Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* r/o bind mounts: rearrange may_open() to be r/o friendlyDave Hansen2007-10-171-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | may_open() calls vfs_permission() before it does checks for IS_RDONLY(inode). It checks _again_ inside of vfs_permission(). The check inside of vfs_permission() is going away eventually. With the mnt_want/drop_write() functions, all of the r/o checks (except for this one) are consistently done before calling permission(). Because of this, I'd like to use permission() to hold a debugging check to make sure that the mnt_want/drop_write() calls are actually being made. So, to do this: 1. remove the IS_RDONLY() check from permission() 2. enforce that you must mnt_want_write() before even calling permission() 3. actually add the debugging check to permission() We need to rearrange may_open() to do r/o checks before calling permission(). Here's the patch. Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fix execute checking in permission()Miklos Szeredi2007-10-171-12/+24
| | | | | | | | | | | | | | | | | | | | | permission() checks that MAY_EXEC is only allowed on regular files if at least one execute bit is set in the file mode. generic_permission() already ensures this, so the extra check in permission() is superfluous. If the filesystem defines it's own ->permission() the check may still be needed. In this case move it after ->permission(). This is needed because filesystems such as FUSE may need to refresh the inode attributes before checking permissions. This check should be moved inside ->permission(), but that's another story. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Clean up duplicate includes in fs/Jesper Juhl2007-10-171-1/+0
| | | | | | | | | This patch cleans up duplicate includes in fs/ Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* partially fix up the lookup_one_noperm messChristoph Hellwig2007-10-171-22/+36
| | | | | | | | | | | | | | | | | | | | | | | Try to fix the mess created by sysfs braindamage. - refactor code internal to fs/namei.c a little to avoid too much duplication: o __lookup_hash_kern is renamed back to __lookup_hash o the old __lookup_hash goes away, permission checks moves to the two callers o useless inline qualifiers on above functions go away - lookup_one_len_kern loses it's last argument and is renamed to lookup_one_noperm to make it's useage a little more clear - added kerneldoc comments to describe lookup_one_len aswell as lookup_one_noperm and make it very clear that no one should use the latter ever. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fs: introduce write_begin, write_end, and perform_write aopsNick Piggin2007-10-161-35/+11
| | | | | | | | | | | | | | | These are intended to replace prepare_write and commit_write with more flexible alternatives that are also able to avoid the buffered write deadlock problems efficiently (which prepare_write is unable to do). [mark.fasheh@oracle.com: API design contributions, code review and fixes] [akpm@linux-foundation.org: various fixes] [dmonakhov@sw.ru: new aop block_write_begin fix] Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Dmitriy Monakhov <dmonakhov@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fs: remove path_walk exportJosef 'Jeff' Sipek2007-07-191-2/+1
| | | | | | | | | | | Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu> Cc: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Christoph Hellwig <hch@lst.de> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Neil Brown <neilb@suse.de> Cc: Michael Halcrow <mhalcrow@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fs: mark link_path_walk staticJosef 'Jeff' Sipek2007-07-191-1/+3
| | | | | | | | | | | Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu> Cc: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Christoph Hellwig <hch@lst.de> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Neil Brown <neilb@suse.de> Cc: Michael Halcrow <mhalcrow@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fs: introduce vfs_path_lookupJosef 'Jeff' Sipek2007-07-191-0/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Stackable file systems, among others, frequently need to lookup paths or path components starting from an arbitrary point in the namespace (identified by a dentry and a vfsmount). Currently, such file systems use lookup_one_len, which is frowned upon [1] as it does not pass the lookup intent along; not passing a lookup intent, for example, can trigger BUG_ON's when stacking on top of NFSv4. The first patch introduces a new lookup function to allow lookup starting from an arbitrary point in the namespace. This approach has been suggested by Christoph Hellwig [2]. The second patch changes sunrpc to use vfs_path_lookup. The third patch changes nfsctl.c to use vfs_path_lookup. The fourth patch marks link_path_walk static. The fifth, and last patch, unexports path_walk because it is no longer unnecessary to call it directly, and using the new vfs_path_lookup is cleaner. For example, the following snippet of code, looks up "some/path/component" in a directory pointed to by parent_{dentry,vfsmnt}: err = vfs_path_lookup(parent_dentry, parent_vfsmnt, "some/path/component", 0, &nd); if (!err) { /* exits */ ... /* once done, release the references */ path_release(&nd); } else if (err == -ENOENT) { /* doesn't exist */ } else { /* other error */ } VFS functions such as lookup_create can be used on the nameidata structure to pass the create intent to the file system. Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu> Cc: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Christoph Hellwig <hch@lst.de> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Neil Brown <neilb@suse.de> Cc: Michael Halcrow <mhalcrow@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Introduce is_owner_or_cap() to wrap CAP_FOWNER use with fsuid checkSatyam Sharma2007-07-171-1/+1
| | | | | | | | | | | | | | | | | | | | Introduce is_owner_or_cap() macro in fs.h, and convert over relevant users to it. This is done because we want to avoid bugs in the future where we check for only effective fsuid of the current task against a file's owning uid, without simultaneously checking for CAP_FOWNER as well, thus violating its semantics. [ XFS uses special macros and structures, and in general looked ... untouchable, so we leave it alone -- but it has been looked over. ] The (current->fsuid != inode->i_uid) check in generic_permission() and exec_permission_lite() is left alone, because those operations are covered by CAP_DAC_OVERRIDE and CAP_DAC_READ_SEARCH. Similarly operations falling under the purview of CAP_CHOWN and CAP_LEASE are also left alone. Signed-off-by: Satyam Sharma <ssatyam@cse.iitk.ac.in> Cc: Al Viro <viro@ftp.linux.org.uk> Acked-by: Serge E. Hallyn <serge@hallyn.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* [PATCH] complete message queue auditingAmy Griffis2007-05-111-1/+1
| | | | | | | | | | Handle the edge cases for POSIX message queue auditing. Collect inode info when opening an existing mq, and for send/receive operations. Remove audit_inode_update() as it has really evolved into the equivalent of audit_inode(). Signed-off-by: Amy Griffis <amy.griffis@hp.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* fs: use path_walk in do_path_lookupJosef 'Jeff' Sipek2007-05-091-2/+2
| | | | | | | | | | Since path_walk sets the total_link_count to 0 and calls link_path_walk, we can just call path_walk directly. Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu> Acked-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fs: fix indentation in do_path_lookupJosef 'Jeff' Sipek2007-05-091-3/+1
| | | | | | | Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu> Acked-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* header cleaning: don't include smp_lock.h when not usedRandy Dunlap2007-05-081-1/+0
| | | | | | | | | | | | Remove includes of <linux/smp_lock.h> where it is not used/needed. Suggested by Al Viro. Builds cleanly on x86_64, i386, alpha, ia64, powerpc, sparc, sparc64, and arm (all 59 defconfigs). Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* namei.c: remove utterly outdated commentChristoph Hellwig2007-05-081-11/+0
| | | | | | | | | We don't have a routine called namei() anymore since at least 2.3.x, and the comment is just totally out of sync with the current lookup logic. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm: make read_cache_page synchronousNick Piggin2007-05-071-11/+1
| | | | | | | | | | | | | | | | Ensure pages are uptodate after returning from read_cache_page, which allows us to cut out most of the filesystem-internal PageUptodate calls. I didn't have a great look down the call chains, but this appears to fixes 7 possible use-before uptodate in hfs, 2 in hfsplus, 1 in jfs, a few in ecryptfs, 1 in jffs2, and a possible cleared data overwritten with readpage in block2mtd. All depending on whether the filler is async and/or can return with a !uptodate page. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* security: prevent permission checking of file removal via sysfs_remove_group()James Morris2007-04-271-20/+52
| | | | | | | | | | | | | | | | | | | Prevent permission checking from being performed when the kernel wants to unconditionally remove a sysfs group, by introducing an kernel-only variant of lookup_one_len(), lookup_one_len_kern(). Additionally, as sysfs_remove_group() does not check the return value of the lookup before using it, a BUG_ON has been added to pinpoint the cause of any problems potentially caused by this (and as a form of annotation). Signed-off-by: James Morris <jmorris@namei.org> Cc: Nagendra Singh Tomar <nagendra_tomar@adaptec.com> Cc: Tejun Heo <htejun@gmail.com> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: Eric Paris <eparis@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* [PATCH] __page_symlink retry loop error code fixDmitriy Monakhov2007-02-161-1/+2
| | | | | | | | | | If prepare_write or commit_write return AOP_TRUNCATED_PAGE we jump to "retry" label and than if find_or_create_page() failed function return incorrect error code. Signed-off-by: Dmitriy Monakhov <dmonakhov@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* [PATCH] mark struct inode_operations const 2Arjan van de Ven2007-02-121-1/+1
| | | | | | | | | | | Many struct inode_operations in the kernel can be "const". Marking them const moves these to the .rodata section, which avoids false sharing with potential dirty data. In addition it'll catch accidental writes at compile time to these shared resources. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* [PATCH] VFS: change struct file to use struct pathJosef "Jeff" Sipek2006-12-081-5/+5
| | | | | | | | | | | | | This patch changes struct file to use struct path instead of having independent pointers to struct dentry and struct vfsmount, and converts all users of f_{dentry,vfsmnt} in fs/ to use f_path.{dentry,mnt}. Additionally, it adds two #define's to make the transition easier for users of the f_dentry and f_vfsmnt. Signed-off-by: Josef "Jeff" Sipek <jsipek@cs.sunysb.edu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] struct path: move struct path from fs/namei.c into include/linuxJosef "Jeff" Sipek2006-12-081-5/+0
| | | | | | | | | | | Moved struct path from fs/namei.c to include/linux/namei.h. This allows many places in the VFS, as well as any stackable filesystem to easily keep track of dentry-vfsmount pairs. Signed-off-by: Josef "Jeff" Sipek <jsipek@cs.sunysb.edu> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] VFS: extra check inside dentry_unhash()Vasily Averin2006-12-071-2/+1
| | | | | | | | d_count check after dget() is always true. Signed-off-by: Vasily Averin <vvs@sw.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] honour MNT_NOEXEC for access()Stas Sergeev2006-12-071-2/+4
| | | | | | | | | | | | | | Make access(X_OK) take the "noexec" mount option into account. Signed-off-by: Stas Sergeev <stsp@aknet.ru> Cc: Jakub Jelinek <jakub@redhat.com> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Hugh Dickins <hugh@veritas.com> Cc: Ulrich Drepper <drepper@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] r/o bind mount prepwork: move open_namei()'s vfs_create()Dave Hansen2006-10-011-10/+20
| | | | | | | | | | | | | | | | | The code around vfs_create() in open_namei() is getting a bit too complex. Right now, there is at least the reference count on the dentry, and the i_mutex to worry about. Soon, we'll also have mnt_writecount. So, break the vfs_create() call out of open_namei(), and into a helper function. This duplicates the call to may_open(), but that isn't such a bad thing since the arguments (acc_mode and flag) were being heavily massaged anyway. Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Acked-by: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] r/o bind mounts: prepare for write access checks: collapse if()Dave Hansen2006-10-011-43/+50
| | | | | | | | | | | | | We're shortly going to be adding a bunch more permission checks in these functions. That requires adding either a bunch of new if() conditions, or some gotos. This patch collapses existing if()s and uses gotos instead to prepare for the upcoming changes. Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Acked-by: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] fs/namei.c: replace multiple current->fs by shortcut variableAndreas Mohr2006-09-291-39/+47
| | | | | | | | | | | | Replace current->fs by fs helper variable to reduce some indirection overhead and (at least at the moment, before the current_thread_info() %gs PDA improvement is available) get rid of more costly current references. Reduces fs/namei.o from 37786 to 37082 Bytes (704 Bytes saved). [akpm@osdl.org: cleanup] Signed-off-by: Andreas Mohr <andi@lisas.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] autofs4 needs to force fail return revalidateIan Kent2006-09-271-15/+35
| | | | | | | | | | | | | | | | | | | | | | For a long time now I have had a problem with not being able to return a lookup failure on an existsing directory. In autofs this corresponds to a mount failure on a autofs managed mount entry that is browsable (and so the mount point directory exists). While this problem has been present for a long time I've avoided resolving it because it was not very visible. But now that autofs v5 has "mount and expire on demand" of nested multiple mounts, such as is found when mounting an export list from a server, solving the problem cannot be avoided any longer. I've tried very hard to find a way to do this entirely within the autofs4 module but have not been able to find a satisfactory way to achieve it. So, I need to propose a change to the VFS. Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] Allow file systems to manually d_move() inside of ->rename()Mark Fasheh2006-09-241-3/+3
| | | | | | | | | | | | | | | | Some file systems want to manually d_move() the dentries involved in a rename. We can do this by making use of the FS_ODD_RENAME flag if we just have nfs_rename() unconditionally do the d_move(). While there, we rename the flag to be more descriptive. OCFS2 uses this to protect that part of the rename operation with a cluster lock. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@osdl.org>
* VFS: Fix access("file", X_OK) in the presence of ACLsTrond Myklebust2006-08-241-1/+8
| | | | | | | | | | | | | Currently, the access() call will return incorrect information on NFS if there exists an ACL that grants execute access to the user on a regular file. The reason the information is incorrect is that the VFS overrides this execute access in open_exec() by checking (inode->i_mode & 0111). This patch propagates the VFS execute bit check back into the generic permission() call. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> (cherry picked from 64cbae98848c4c99851cb0a405f0b4982cd76c1e commit)
* VFS: add lookup hint for network file systemsASANO Masahiro2006-08-241-0/+2
| | | | | | | | | | | | | | | | | | | | | | I'm trying to speeding up mkdir(2) for network file systems. A typical mkdir(2) calls two inode_operations: lookup and mkdir. The lookup operation would fail with ENOENT in common case. I think it is unnecessary because the subsequent mkdir operation can check it. In case of creat(2), lookup operation is called with the LOOKUP_CREATE flag, so individual filesystem can omit real lookup. e.g. nfs_lookup(). Here is a sample patch which uses LOOKUP_CREATE and O_EXCL on mkdir, symlink and mknod. This uses the gadget for creat(2). And here is the result of a benchmark on NFSv3. mkdir(2) 10,000 times: original 50.5 sec patched 29.0 sec Signed-off-by: ASANO Masahiro <masano@tnes.nec.co.jp> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> (cherry picked from fab7bf44449b29f9d5572a5dd8adcf7c91d5bf0f commit)
* [PATCH] don't bother with aux entires for dummy contextAl Viro2006-08-031-2/+2
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* [PATCH] fix missed create event for directory auditAmy Griffis2006-08-031-1/+1
| | | | | | | | | | When an object is created via a symlink into an audited directory, audit misses the event due to not having collected the inode data for the directory. Modify __audit_inode_child() to copy the parent inode data if a parent wasn't found in audit_names[]. Signed-off-by: Amy Griffis <amy.griffis@hp.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* [PATCH] fix faulty inode data collection for open() with O_CREATAmy Griffis2006-08-031-0/+2
| | | | | | | | | | | When the specified path is an existing file or when it is a symlink, audit collects the wrong inode number, which causes it to miss the open() event. Adding a second hook to the open() path fixes this. Also add audit_copy_inode() to consolidate some code. Signed-off-by: Amy Griffis <amy.griffis@hp.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* [PATCH] struct file leakageKirill Korotaev2006-07-151-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | 2.6.16 leaks like hell. While testing, I found massive leakage (reproduced in openvz) in: *filp *size-4096 And 1 object leaks in *size-32 *size-64 *size-128 It is the fix for the first one. filp leaks in the bowels of namei.c. Seems, size-4096 is file table leaking in expand_fdtables. I have no idea what are the rest and why they show only accompanying another leaks. Some debugging structs? [akpm@osdl.org, Trond: remove the IS_ERR() check] Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: Kirill Korotaev <dev@openvz.org> Cc: <stable@kernel.org> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] lockdep: annotate i_mutexIngo Molnar2006-07-041-10/+10
| | | | | | | | | | Teach special (recursive) locking code to the lock validator. Has no effect on non-lockdep kernels. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] Implement AT_SYMLINK_FOLLOW flag for linkatUlrich Drepper2006-06-251-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the linkat() syscall was added the flag parameter was added in the last minute but it wasn't used so far. The following patch should change that. My tests show that this is all that's needed. If OLDNAME is a symlink setting the flag causes linkat to follow the symlink and create a hardlink with the target. This is actually the behavior POSIX demands for link() as well but Linux wisely does not do this. With this flag (which will most likely be in the next POSIX revision) the programmer can choose the behavior, defaulting to the safe variant. As a side effect it is now possible to implement a POSIX-compliant link(2) function for those who are interested. touch file ln -s file symlink linkat(fd, "symlink", fd, "newlink", 0) -> newlink is hardlink of symlink linkat(fd, "symlink", fd, "newlink", AT_SYMLINK_FOLLOW) -> newlink is hardlink of file The value of AT_SYMLINK_FOLLOW is determined by the definition we already use in glibc. Signed-off-by: Ulrich Drepper <drepper@redhat.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] read_mapping_page for address spacePekka Enberg2006-06-231-2/+1
| | | | | | | | | | Add read_mapping_page() which is used for callers that pass mapping->a_ops->readpage as the filler for read_cache_page. This removes some duplication from filesystem code. Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] log more info for directory entry change eventsAmy Griffis2006-06-201-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | When an audit event involves changes to a directory entry, include a PATH record for the directory itself. A few other notable changes: - fixed audit_inode_child() hooks in fsnotify_move() - removed unused flags arg from audit_inode() - added audit log routines for logging a portion of a string Here's some sample output. before patch: type=SYSCALL msg=audit(1149821605.320:26): arch=40000003 syscall=39 success=yes exit=0 a0=bf8d3c7c a1=1ff a2=804e1b8 a3=bf8d3c7c items=1 ppid=739 pid=800 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=ttyS0 comm="mkdir" exe="/bin/mkdir" subj=root:system_r:unconfined_t:s0-s0:c0.c255 type=CWD msg=audit(1149821605.320:26): cwd="/root" type=PATH msg=audit(1149821605.320:26): item=0 name="foo" parent=164068 inode=164010 dev=03:00 mode=040755 ouid=0 ogid=0 rdev=00:00 obj=root:object_r:user_home_t:s0 after patch: type=SYSCALL msg=audit(1149822032.332:24): arch=40000003 syscall=39 success=yes exit=0 a0=bfdd9c7c a1=1ff a2=804e1b8 a3=bfdd9c7c items=2 ppid=714 pid=777 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=ttyS0 comm="mkdir" exe="/bin/mkdir" subj=root:system_r:unconfined_t:s0-s0:c0.c255 type=CWD msg=audit(1149822032.332:24): cwd="/root" type=PATH msg=audit(1149822032.332:24): item=0 name="/root" inode=164068 dev=03:00 mode=040750 ouid=0 ogid=0 rdev=00:00 obj=root:object_r:user_home_dir_t:s0 type=PATH msg=audit(1149822032.332:24): item=1 name="foo" inode=164010 dev=03:00 mode=040755 ouid=0 ogid=0 rdev=00:00 obj=root:object_r:user_home_t:s0 Signed-off-by: Amy Griffis <amy.griffis@hp.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* [PATCH] fs/namei.c: Call to file_permission() under a spinlock in ↵Trond Myklebust2006-06-051-9/+10
| | | | | | | | | | | | | | | | | | | | do_lookup_path() From: Trond Myklebust <Trond.Myklebust@netapp.com> We're presently running lock_kernel() under fs_lock via nfs's ->permission handler. That's a ranking bug and sometimes a sleep-in-spinlock bug. This problem was introduced in the openat() patchset. We should not need to hold the current->fs->lock for a codepath that doesn't use current->fs. [vsu@altlinux.ru: fix error path] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Al Viro <viro@ftp.linux.org.uk> Signed-off-by: Sergey Vlasov <vsu@altlinux.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] fs/namei.c: make lookup_hash() staticAdrian Bunk2006-03-311-2/+1
| | | | | | | | | As announced, lookup_hash() can now become static. Signed-off-by: Adrian Bunk <bunk@stusta.de> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] autofs4: nameidata needs to be up to date for follow_linkIan Kent2006-03-271-18/+21
| | | | | | | | | | | | | | | | | | | | | | | | In order to be able to trigger a mount using the follow_link inode method the nameidata struct that is passed in needs to have the vfsmount of the autofs trigger not its parent. During a path walk if an autofs trigger is mounted on a dentry, when the follow_link method is called, the nameidata struct contains the vfsmount and mountpoint dentry of the parent mount while the dentry that is passed in is the root of the autofs trigger mount. I believe it is impossible to get the vfsmount of the trigger mount, within the follow_link method, when only the parent vfsmount and the root dentry of the trigger mount are known. This patch updates the nameidata struct on entry to __do_follow_link if it detects that it is out of date. It moves the path_to_nameidata to above __do_follow_link to facilitate calling it from there. The dput_path is moved as well as that seemed sensible. No changes are made to these two functions. Signed-off-by: Ian Kent <raven@themaw.net> Cc: Al Viro <viro@ftp.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* Merge branch 'audit.b3' of ↵Linus Torvalds2006-03-251-5/+6
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current * 'audit.b3' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current: (22 commits) [PATCH] fix audit_init failure path [PATCH] EXPORT_SYMBOL patch for audit_log, audit_log_start, audit_log_end and audit_format [PATCH] sem2mutex: audit_netlink_sem [PATCH] simplify audit_free() locking [PATCH] Fix audit operators [PATCH] promiscuous mode [PATCH] Add tty to syscall audit records [PATCH] add/remove rule update [PATCH] audit string fields interface + consumer [PATCH] SE Linux audit events [PATCH] Minor cosmetic cleanups to the code moved into auditfilter.c [PATCH] Fix audit record filtering with !CONFIG_AUDITSYSCALL [PATCH] Fix IA64 success/failure indication in syscall auditing. [PATCH] Miscellaneous bug and warning fixes [PATCH] Capture selinux subject/object context information. [PATCH] Exclude messages by message type [PATCH] Collect more inode information during syscall processing. [PATCH] Pass dentry, not just name, in fsnotify creation hooks. [PATCH] Define new range of userspace messages. [PATCH] Filter rule comparators ... Fixed trivial conflict in security/selinux/hooks.c
| * [PATCH] Collect more inode information during syscall processing.Amy Griffis2006-03-201-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch augments the collection of inode info during syscall processing. It represents part of the functionality that was provided by the auditfs patch included in RHEL4. Specifically, it: - Collects information for target inodes created or removed during syscalls. Previous code only collects information for the target inode's parent. - Adds the audit_inode() hook to syscalls that operate on a file descriptor (e.g. fchown), enabling audit to do inode filtering for these calls. - Modifies filtering code to check audit context for either an inode # or a parent inode # matching a given rule. - Modifies logging to provide inode # for both parent and child. - Protect debug info from NULL audit_names.name. [AV: folded a later typo fix from the same author] Signed-off-by: Amy Griffis <amy.griffis@hp.com> Signed-off-by: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * [PATCH] Pass dentry, not just name, in fsnotify creation hooks.Amy Griffis2006-03-201-5/+5
| | | | | | | | | | | | | | | | The audit hooks (to be added shortly) will want to see dentry->d_inode too, not just the name. Signed-off-by: Amy Griffis <amy.griffis@hp.com> Signed-off-by: David Woodhouse <dwmw2@infradead.org>
* | [PATCH] Honour AOP_TRUNCATE_PAGE returns in page_symlinkNeilBrown2006-03-251-2/+14
| | | | | | | | | | | | | | | | | | As prepare_write, commit_write and readpage are allowed to return AOP_TRUNCATE_PAGE, page_symlink should respond to them. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] Missed error checking for intent's filp in open_namei().Oleg Drokin2006-03-251-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It seems there is error check missing in open_namei for errors returned through intent.open.file (from lookup_instantiate_filp). If there is plain open performed, then such a check done inside __path_lookup_intent_open called from path_lookup_open(), but when the open is performed with O_CREAT flag set, then __path_lookup_intent_open is only called with LOOKUP_PARENT set where no file opening can occur yet. Later on lookup_hash is called where exact opening might take place and intent.open.file may be filled. If it is filled with error value of some sort, then we get kernel attempting to dereference this error value as address (and corresponding oops) in nameidata_to_filp() called from filp_open(). While this is relatively simple to workaround in ->lookup() method by just checking lookup_instantiate_filp() return value and returning error as needed, this is not so easy in ->d_revalidate(), where we can only return "yes, dentry is valid" or "no, dentry is invalid, perform full lookup again", and just returning 0 on error would cause extra lookup (with potential extra costly RPCs). So in short, I believe that there should be no difference in error handling for opening a file and creating a file in open_namei() and propose this simple patch as a solution. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] sem2mutex: vfs_rename_mutexArjan van de Ven2006-03-231-6/+6
|/ | | | | | | | | | | | | | Semaphore to mutex conversion. The conversion was generated via scripts, and the result was validated automatically via a script as well. Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Al Viro <viro@ftp.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] ext3: ext3_symlink should use GFP_NOFS allocations insideKirill Korotaev2006-03-111-2/+11
| | | | | | | | | | | | | | | | This patch fixes illegal __GFP_FS allocation inside ext3 transaction in ext3_symlink(). Such allocation may re-enter ext3 code from try_to_free_pages. But JBD/ext3 code keeps a pointer to current journal handle in task_struct and, hence, is not reentrable. This bug led to "Assertion failure in journal_dirty_metadata()" messages. http://bugzilla.openvz.org/show_bug.cgi?id=115 Signed-off-by: Andrey Savochkin <saw@saw.sw.com.sg> Signed-off-by: Kirill Korotaev <dev@openvz.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>