path: root/fs
Commit message  [Author, Age, Files, Lines]
* vfs: Block mmapped writes while the fs is frozen  [Jan Kara, 2011-05-26, 1 file, -1/+23]
    We should not allow file modification via mmap while the filesystem is frozen. So block in block_page_mkwrite() while the filesystem is frozen. We cannot do the blocking wait in __block_page_mkwrite() since e.g. ext4 will want to call that function with transaction started in some cases and that would deadlock. But we can at least do the non-blocking reliable check in __block_page_mkwrite() which is the hardest part anyway.

    We have to check for frozen filesystem with the page marked dirty and under page lock with which we then return from ->page_mkwrite(). Only that way we cannot race with writeback done by freezing code - either we mark the page dirty after the writeback has started, see freezing in progress and block, or writeback will wait for our page lock which is released only when the fault is done and then writeback will writeout and writeprotect the page again.

    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* vfs: Create __block_page_mkwrite() helper passing error values back  [Jan Kara, 2011-05-26, 1 file, -17/+20]
    Create __block_page_mkwrite() helper which does everything block_page_mkwrite() does except that it passes back errors from __block_write_begin / block_commit_write calls.

    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* fs/namespace.c: bound mount propagation fix  [Roman Borisov, 2011-05-26, 1 file, -1/+1]
    This issue was discovered by users of busybox. The bug is relevant to busybox users; I don't know how it affects others. Apparently, mount is called with and without MS_SILENT, and this affects mount() behaviour. But MS_SILENT is only supposed to affect kernel logging verbosity.

    The following script was run in an empty test directory:

        mkdir -p mount.dir mount.shared1 mount.shared2
        touch mount.dir/a mount.dir/b
        mount -vv --bind mount.shared1 mount.shared1
        mount -vv --make-rshared mount.shared1
        mount -vv --bind mount.shared2 mount.shared2
        mount -vv --make-rshared mount.shared2
        mount -vv --bind mount.shared2 mount.shared1
        mount -vv --bind mount.dir mount.shared2
        ls -R mount.dir mount.shared1 mount.shared2
        umount mount.dir mount.shared1 mount.shared2 2>/dev/null
        umount mount.dir mount.shared1 mount.shared2 2>/dev/null
        umount mount.dir mount.shared1 mount.shared2 2>/dev/null
        rm -f mount.dir/a mount.dir/b mount.dir/c
        rmdir mount.dir mount.shared1 mount.shared2

    mount -vv was used to show the mount() call arguments and result. Output shows that the flag argument has the 0x00008000 = MS_SILENT bit set:

        mount: mount('mount.shared1','mount.shared1','(null)',0x00009000,'(null)'):0
        mount: mount('','mount.shared1','',0x0010c000,''):0
        mount: mount('mount.shared2','mount.shared2','(null)',0x00009000,'(null)'):0
        mount: mount('','mount.shared2','',0x0010c000,''):0
        mount: mount('mount.shared2','mount.shared1','(null)',0x00009000,'(null)'):0
        mount: mount('mount.dir','mount.shared2','(null)',0x00009000,'(null)'):0
        mount.dir:
        a b

        mount.shared1:

        mount.shared2:
        a b

    After adding the --loud option to remove the MS_SILENT bit from just one mount command:

        mkdir -p mount.dir mount.shared1 mount.shared2
        touch mount.dir/a mount.dir/b
        mount -vv --bind mount.shared1 mount.shared1 2>&1
        mount -vv --make-rshared mount.shared1 2>&1
        mount -vv --bind mount.shared2 mount.shared2 2>&1
        mount -vv --loud --make-rshared mount.shared2 2>&1 # <-HERE
        mount -vv --bind mount.shared2 mount.shared1 2>&1
        mount -vv --bind mount.dir mount.shared2 2>&1
        ls -R mount.dir mount.shared1 mount.shared2 2>&1
        umount mount.dir mount.shared1 mount.shared2 2>/dev/null
        umount mount.dir mount.shared1 mount.shared2 2>/dev/null
        umount mount.dir mount.shared1 mount.shared2 2>/dev/null
        rm -f mount.dir/a mount.dir/b mount.dir/c
        rmdir mount.dir mount.shared1 mount.shared2

    The result is different now - look closely at the mount.shared1 directory listing. Now it does show files 'a' and 'b':

        mount: mount('mount.shared1','mount.shared1','(null)',0x00009000,'(null)'):0
        mount: mount('','mount.shared1','',0x0010c000,''):0
        mount: mount('mount.shared2','mount.shared2','(null)',0x00009000,'(null)'):0
        mount: mount('','mount.shared2','',0x00104000,''):0
        mount: mount('mount.shared2','mount.shared1','(null)',0x00009000,'(null)'):0
        mount: mount('mount.dir','mount.shared2','(null)',0x00009000,'(null)'):0
        mount.dir:
        a b

        mount.shared1:
        a b

        mount.shared2:
        a b

    Analysis shows that the MS_SILENT flag, which is on by default in all busybox mount operations, reaches flags_to_propagation_type() and makes it return an error in the is_power_of_2() check, because the function expects exactly one bit to be set. As a result, busybox mount cannot be used with any of the --make-[r]shared, --make-[r]private etc. options. Moreover, the recently added flags_to_propagation_type() function doesn't allow operations such as --make-[r]private --make-[r]shared etc. when MS_SILENT is on.

    The idea of clearing the MS_SILENT flag came from Denys Vlasenko.

    Signed-off-by: Roman Borisov <ext-roman.borisov@nokia.com>
    Reported-by: Denys Vlasenko <vda.linux@googlemail.com>
    Cc: Chuck Ebbert <cebbert@redhat.com>
    Cc: Alexander Shishkin <virtuoso@slind.org>
    Cc: Al Viro <viro@zeniv.linux.org.uk>
    Cc: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
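To make the flag arithmetic in this message concrete, here is a minimal userspace sketch, not the kernel's actual code. The SK_ constants mirror the mount(2) flag values visible in the traces (0x4000 for MS_REC, 0x8000 for MS_SILENT, 0x100000 for MS_SHARED); the helper name sk_propagation_type() and its strip_silent parameter are hypothetical, introduced only to compare the two behaviours side by side.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values as seen in the mount -vv traces; the SK_ prefix marks them
 * as part of this standalone sketch, not the kernel's own definitions. */
#define SK_MS_REC     0x00004000
#define SK_MS_SILENT  0x00008000
#define SK_MS_SHARED  0x00100000

/* True when exactly one bit is set. */
static bool sk_is_power_of_2(unsigned long n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Hypothetical helper: return the single propagation flag in 'flags',
 * or 0 on error. 'strip_silent' selects whether MS_SILENT is cleared
 * before the single-bit check (the idea credited to Denys Vlasenko).
 */
static int sk_propagation_type(int flags, bool strip_silent)
{
	int type = flags & ~SK_MS_REC;

	if (strip_silent)
		type &= ~SK_MS_SILENT;
	/* Two or more remaining bits means the request is rejected. */
	if (!sk_is_power_of_2((unsigned long)type))
		return 0;
	return type;
}
```

With the value from the failing trace, sk_propagation_type(0x0010c000, false) returns 0 because both the shared and silent bits survive the mask, while passing true yields SK_MS_SHARED. The --loud value 0x00104000 passes either way.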
* exportfs: reallow building as a module  [Jonas Gorski, 2011-05-26, 1 file, -1/+1]
    Commit 990d6c2d7aee921e3bce22b2d6a750fd552262be ("vfs: Add name to file handle conversion support") changed EXPORTFS to be a bool. This was needed for earlier revisions of the original patch, but the actual commit put the code needing it into its own file that only gets compiled when FHANDLE is selected which in turn selects EXPORTFS. So EXPORTFS can be safely compiled as a module when not selecting FHANDLE.

    Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
    Acked-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
    Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* merge handle_reval_dot and nameidata_drop_rcu_last  [Al Viro, 2011-05-26, 1 file, -81/+40]
    new helper: complete_walk(). Done on successful completion of walk, drops out of RCU mode, does d_revalidate of final result if that hadn't been done already. handle_reval_dot() and nameidata_drop_rcu_last() subsumed into that one; callers converted to use of complete_walk().

    Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* consolidate nameidata_..._drop_rcu()  [Al Viro, 2011-05-26, 1 file, -105/+46]
    Merge these into a single function (unlazy_walk(nd, dentry)), kill ..._maybe variants

    Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6  [Linus Torvalds, 2011-04-28, 8 files, -110/+142]
    * 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
      nfs: don't lose MS_SYNCHRONOUS on remount of noac mount
      NFS: Return meaningful status from decode_secinfo()
      NFSv4: Ensure we request the ordinary fileid when doing readdirplus
      NFSv4: Ensure that clientid and session establishment can time out
      SUNRPC: Allow RPC calls to return ETIMEDOUT instead of EIO
      NFSv4.1: Don't loop forever in nfs4_proc_create_session
      NFSv4: Handle NFS4ERR_WRONGSEC outside of nfs4_handle_exception()
      NFSv4.1: Don't update sequence number if rpc_task is not sent
      NFSv4.1: Ensure state manager thread dies on last umount
      SUNRPC: Fix the SUNRPC Kerberos V RPCSEC_GSS module dependencies
      NFS: Use correct variable for page bounds checking
      NFS: don't negotiate when user specifies sec flavor
      NFS: Attempt mount with default sec flavor first
      NFS: flav_array honors NFS_MAX_SECFLAVORS
      NFS: Fix infinite loop in gss_create_upcall()
      Don't mark_inode_dirty_sync() while holding lock
      NFS: Get rid of pointless test in nfs_commit_done
      NFS: Remove unused argument from nfs_find_best_sec()
      NFS: Eliminate duplicate call to nfs_mark_request_dirty
      NFS: Remove dead code from nfs_fs_mount()
| * nfs: don't lose MS_SYNCHRONOUS on remount of noac mount  [Jeff Layton, 2011-04-27, 1 file, -0/+9]
      On a remount, the VFS layer will clear the MS_SYNCHRONOUS bit on the assumption that the flags on the mount syscall will have it set if the remounted fs is supposed to keep it.

      In the case of "noac" though, MS_SYNCHRONOUS is implied. A remount of such a mount will lose the MS_SYNCHRONOUS flag since "sync" isn't part of the mount options.

      Reported-by: Max Matveev <makc@redhat.com>
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Cc: stable@kernel.org
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * NFS: Return meaningful status from decode_secinfo()  [Bryan Schumaker, 2011-04-27, 1 file, -4/+7]
      When compiling, I was getting this warning:

          fs/nfs/nfs4xdr.c: In function ‘decode_secinfo’:
          fs/nfs/nfs4xdr.c:4839:6: warning: variable ‘status’ set but not used [-Wunused-but-set-variable]

      We were unconditionally returning 0 as long as there wasn't an error coming out of xdr_inline_decode(). We probably want to check the error status coming out of decode_op_hdr() and decode_secinfo_gss(), rather than assuming that everything is OK all the time.

      Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * NFSv4: Ensure we request the ordinary fileid when doing readdirplus  [Trond Myklebust, 2011-04-27, 1 file, -17/+14]
      When readdir() returns a directory entry for the root of a mounted filesystem, Linux follows the old convention of returning the inode number of the covered directory (despite newer versions of POSIX declaring that this is a bug). To ensure this continues to work, the NFSv4 readdir implementation requests the 'mounted-on-fileid' from the server. However, readdirplus also needs to instantiate an inode for this entry, and for that, we also need to request the real fileid as per this patch.

      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * NFSv4: Ensure that clientid and session establishment can time out  [Trond Myklebust, 2011-04-24, 2 files, -6/+8]
      The following patch ensures that we do not get permanently trapped in the RPC layer when trying to establish a new client id or session. This again ensures that the state manager can finish in a timely fashion when the last filesystem to reference the nfs_client exits.

      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * NFSv4.1: Don't loop forever in nfs4_proc_create_session  [Trond Myklebust, 2011-04-24, 3 files, -53/+39]
      If a server for some reason keeps sending NFS4ERR_DELAY errors, we can end up looping forever inside nfs4_proc_create_session, and so the usual mechanisms for detecting if the nfs_client is dead don't work.

      Fix this by ensuring that we loop inside the nfs4_state_manager thread instead.

      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * NFSv4: Handle NFS4ERR_WRONGSEC outside of nfs4_handle_exception()  [Bryan Schumaker, 2011-04-18, 1 file, -5/+23]
      I only want to try other secflavors during an initial mount if NFS4ERR_WRONGSEC is returned. nfs4_handle_exception() could potentially map other errors to EPERM, so we should handle this error specially for correctness.

      Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * NFSv4.1: Don't update sequence number if rpc_task is not sent  [Bryan Schumaker, 2011-04-18, 1 file, -2/+2]
      If we fail to contact the gss upcall program, then no message will be sent to the server. The client still updated the sequence number, however, and this led to NFS4ERR_SEQ_MISMATCH for the next several RPC calls.

      Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * NFSv4.1: Ensure state manager thread dies on last umount  [Trond Myklebust, 2011-04-16, 1 file, -2/+2]
      Currently, the state manager may continue to try recovering state forever even after the last filesystem to reference that nfs_client has umounted.

      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      Cc: stable@kernel.org
| * NFS: Use correct variable for page bounds checking  [Bryan Schumaker, 2011-04-13, 1 file, -4/+7]
      While decoding a secinfo reply, I store the list of supported sec flavors on a page accessible through res->flavors. Before reading each new flavor, I do some math to determine if there is enough space left on this page, and I break out of my read loop if there isn't. In order to perform this check correctly, I need to use the address of res->flavors, rather than the address of res.

      When this loop was broken early I lied to the caller and told them that the entire list had been decoded. This could lead to problems if the caller tries to use any of the garbage data claiming to be a valid sec flavor. I fixed this by using res->flavors->num_flavors as a counter, incrementing it every time a sec flavor is successfully decoded.

      Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * NFS: don't negotiate when user specifies sec flavor  [Bryan Schumaker, 2011-04-13, 2 files, -1/+3]
      We were always attempting sec flavor negotiation, even if the user told us a specific sec flavor to use. If that sec flavor fails, we should return an error rather than continuing with sec flavor negotiation.

      Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * NFS: Attempt mount with default sec flavor first  [Bryan Schumaker, 2011-04-13, 1 file, -8/+16]
      nfs4_lookup_root() is already configured to use either RPC_AUTH_UNIX or a user specified flavor (through -o sec=<whatever>). We should use this flavor first, and only attempt negotiation if it fails with -EPERM.

      Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * NFS: flav_array honors NFS_MAX_SECFLAVORS  [Bryan Schumaker, 2011-04-13, 1 file, -1/+1]
      NFS_MAX_SECFLAVORS should already take into account RPC_AUTH_UNIX and RPC_AUTH_NULL, so we don't need to set aside extra slots for them.

      Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * NFS: Fix infinite loop in gss_create_upcall()  [Bryan Schumaker, 2011-04-13, 1 file, -2/+3]
      There can be an infinite loop if gss_create_upcall() is called without the userspace program running. To prevent this, we return -EACCES if we notice that pipe_version hasn't changed (indicating that the pipe has not been opened).

      Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * Don't mark_inode_dirty_sync() while holding lock  [Weston Andros Adamson, 2011-04-13, 1 file, -1/+7]
      mark_inode_dirty_sync() grabs the same inode lock!

      Race conditions between holding the lock in pnfs_set_layoutcommit() and in mark_inode_dirty_sync() can result in a second call to pnfs_layoutcommit_inode(), but this will be a no-op, as NFS_INO_LAYOUTCOMMIT won't be set in the second call.

      Signed-off-by: Weston Andros Adamson <dros@netapp.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * NFS: Get rid of pointless test in nfs_commit_done  [Trond Myklebust, 2011-04-13, 1 file, -2/+1]
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * NFS: Remove unused argument from nfs_find_best_sec()  [Bryan Schumaker, 2011-04-13, 1 file, -2/+2]
      The inode was used in an earlier version of the code, but it isn't used anymore.

      Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * NFS: Eliminate duplicate call to nfs_mark_request_dirty  [Trond Myklebust, 2011-04-13, 1 file, -1/+0]
      We only need to call nfs_mark_request_dirty() once in nfs_writepage_setup().

      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * NFS: Remove dead code from nfs_fs_mount()  [Jesper Juhl, 2011-04-13, 1 file, -2/+1]
      In fs/nfs/super.c::nfs_fs_mount() we test for a NULL 'data':

          ...
          if (data == NULL || mntfh == NULL)
                  goto out_free_fh;
          ...

      and then further down in the function we test 'data' again:

          ...
          nfs_fscache_get_super_cookie(
                  s, data ? data->fscache_uniq : NULL, NULL);
          ...

      this second check is just dead code since there is no way 'data' could possibly be NULL here. We also rely on a non-NULL 'data' in more than one location between these two tests, further proving the point that the second test is bogus. This patch removes the dead code.

      Signed-off-by: Jesper Juhl <jj@chaosbits.net>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* | vfs: avoid large kmalloc()s for the fdtable  [Andrew Morton, 2011-04-28, 1 file, -7/+11]
    Azurit reports large increases in system time after 2.6.36 when running Apache. It was bisected down to a892e2d7dcdfa6c76e6 ("vfs: use kmalloc() to allocate fdmem if possible").

    That patch caused the vfs to use kmalloc() for very large allocations and this is causing excessive work (and presumably excessive reclaim) within the page allocator.

    Fix it by falling back to vmalloc() earlier - when the allocation attempt would have been considered "costly" by reclaim.

    Reported-by: azurIt <azurit@pobox.sk>
    Tested-by: azurIt <azurit@pobox.sk>
    Acked-by: Changli Gao <xiaosuo@gmail.com>
    Cc: Americo Wang <xiyou.wangcong@gmail.com>
    Cc: Jiri Slaby <jslaby@suse.cz>
    Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
    Cc: Mel Gorman <mel@csn.ul.ie>
    Cc: <stable@kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
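The size cutoff described in this message can be illustrated with a small userspace sketch. It is an assumption-laden model, not the patch itself: SK_PAGE_SIZE (4096) and SK_COSTLY_ORDER (3) are assumed values standing in for the kernel's page size and its "costly" allocation order, and sk_pick_allocator() is a hypothetical name. The real code would also fall back to vmalloc() if the kmalloc() attempt fails; that part is omitted here.

```c
#include <assert.h>
#include <stddef.h>

#define SK_PAGE_SIZE     4096u  /* assumed page size for this sketch */
#define SK_COSTLY_ORDER  3      /* assumed order above which reclaim treats a request as "costly" */

enum sk_allocator { SK_USE_KMALLOC, SK_USE_VMALLOC };

/*
 * Hypothetical helper: decide which allocator to try first for a
 * request of 'size' bytes. Requests larger than PAGE_SIZE << order
 * skip the physically contiguous allocator entirely.
 */
static enum sk_allocator sk_pick_allocator(size_t size)
{
	if (size <= ((size_t)SK_PAGE_SIZE << SK_COSTLY_ORDER))
		return SK_USE_KMALLOC;  /* small enough: contiguous allocation is cheap */
	return SK_USE_VMALLOC;          /* large: avoid heavy reclaim work */
}
```

Under these assumed values the cutoff is 32 KiB: a request of exactly 32768 bytes still goes to the kmalloc path, and one byte more goes to vmalloc.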
* | Revert wrong fixes for common misspellings  [Lucas De Marchi, 2011-04-27, 2 files, -2/+2]
    These changes were incorrectly fixed by codespell. They have now been manually corrected.

    Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable  [Linus Torvalds, 2011-04-26, 6 files, -15/+32]
    * git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
      Btrfs: cleanup error handling in inode.c
      Btrfs: put the right bio if we have an error
      Btrfs: free bitmaps properly when evicting the cache
      Btrfs: Free free_space item properly in btrfs_trim_block_group()
      btrfs: add missing spin_unlock to a rare exit path
      Btrfs: check return value of kmalloc()
      btrfs: fix wrong allocating flag when reading page
      Btrfs: fix missing mutex_unlock in btrfs_del_dir_entries_in_log()
| * | Btrfs: cleanup error handling in inode.c  [Tsutomu Itoh, 2011-04-26, 1 file, -6/+9]
      Error handling in several places is changed so that the error number is set only when an error occurs.

      Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
| * | Btrfs: put the right bio if we have an error  [Josef Bacik, 2011-04-26, 1 file, -1/+1]
      In btrfs_submit_direct_hook if the first btrfs_map_block fails we need to put the orig_bio, not bio.

      Signed-off-by: Josef Bacik <josef@redhat.com>
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
| * | Btrfs: free bitmaps properly when evicting the cache  [Josef Bacik, 2011-04-26, 1 file, -4/+7]
      If our space cache is wrong, we do the right thing and free up everything that we loaded, however we don't reset the total_bitmaps counter or the thresholds or anything. So in btrfs_remove_free_space_cache make sure to call free_bitmap() if it's a bitmap, this will keep us from panicking when we check to make sure we don't have too many bitmaps. Thanks,

      Signed-off-by: Josef Bacik <josef@redhat.com>
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
| * | Btrfs: Free free_space item properly in btrfs_trim_block_group()  [Li Zefan, 2011-04-26, 1 file, -1/+1]
      Since commit dc89e9824464e91fa0b06267864ceabe3186fd8b, we've changed to use a specific slab for allocation of free_space items.

      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
| * | btrfs: add missing spin_unlock to a rare exit path  [David Sterba, 2011-04-26, 1 file, -0/+1]
      Signed-off-by: David Sterba <dsterba@suse.cz>
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
| * | Btrfs: check return value of kmalloc()  [Tsutomu Itoh, 2011-04-26, 2 files, -0/+7]
      Checks on the return value of kmalloc() are added in several places.

      Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
| * | btrfs: fix wrong allocating flag when reading page  [Itaru Kitayama, 2011-04-26, 1 file, -1/+1]
      The space cache uses extent_readpages() to read free space information, so we cannot use the GFP_KERNEL flag to allocate memory, or it may lead to deadlock.

      Signed-off-by: Itaru Kitayama <kitayama@cl.bb4u.ne.jp>
      Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
| * | Btrfs: fix missing mutex_unlock in btrfs_del_dir_entries_in_log()  [Tsutomu Itoh, 2011-04-26, 1 file, -2/+5]
      The mutex must be unlocked before returning an error when btrfs_alloc_path() fails.

      Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
* | | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable  [Linus Torvalds, 2011-04-26, 1 file, -0/+10]
    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
      Btrfs: do some plugging in the submit_bio threads
| * | | Btrfs: do some plugging in the submit_bio threads  [Chris Mason, 2011-04-20, 1 file, -0/+10]
      The Btrfs submit bio threads have a small number of threads responsible for pushing down bios we've collected for a large number of devices.

      Since we do all the bios for a single device at once, we want to make sure we unplug and send down the bios for each device as we're done processing them.

      The new plugging API removed the btrfs code to unplug while processing bios, this adds it back with the new API.

      Signed-off-by: Chris Mason <chris.mason@oracle.com>
* | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6  [Linus Torvalds, 2011-04-26, 1 file, -2/+3]
    * git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
      CIFS: Fix memory over bound bug in cifs_parse_mount_options
| * | | | CIFS: Fix memory over bound bug in cifs_parse_mount_options  [Pavel Shilovsky, 2011-04-21, 1 file, -2/+3]
      While processing the password, we can run past the end of the options array if the character just after the array is a delimiter. The patch adds a check for reaching the end.

      Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru>
      Reviewed-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Steve French <sfrench@us.ibm.com>
* | | | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ecryptfs/ecryptfs-2.6  [Linus Torvalds, 2011-04-26, 7 files, -79/+128]
    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ecryptfs/ecryptfs-2.6:
      eCryptfs: Flush dirty pages in setattr
      eCryptfs: Handle failed metadata read in lookup
      eCryptfs: Add reference counting to lower files
      eCryptfs: dput dentries returned from dget_parent
      eCryptfs: Remove extra d_delete in ecryptfs_rmdir
| * | | | eCryptfs: Flush dirty pages in setattr  [Tyler Hicks, 2011-04-26, 1 file, -0/+6]
      After 57db4e8d73ef2b5e94a3f412108dff2576670a8a changed eCryptfs to write-back caching, eCryptfs page writeback updates the lower inode times due to the use of vfs_write() on the lower file.

      To preserve inode metadata changes, such as 'cp -p' does with utimensat(), we need to flush all dirty pages early in ecryptfs_setattr() so that the user-updated lower inode metadata isn't clobbered later in writeback.

      https://bugzilla.kernel.org/show_bug.cgi?id=33372

      Reported-by: Rocko <rockorequin@hotmail.com>
      Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
| * | | | eCryptfs: Handle failed metadata read in lookup  [Tyler Hicks, 2011-04-26, 4 files, -16/+28]
      When failing to read the lower file's crypto metadata during a lookup, eCryptfs must continue on without throwing an error. For example, there may be a plaintext file in the lower mount point that the user wants to delete through the eCryptfs mount.

      If an error is encountered while reading the metadata in lookup(), the eCryptfs inode's size could be incorrect. We must be sure to reread the plaintext inode size from the metadata when performing an open() or setattr(). The metadata is already being read in those paths, so this adds minimal performance overhead.

      This patch introduces a flag which will track whether or not the plaintext inode size has been read so that an incorrect i_size can be fixed in the open() or setattr() paths.

      https://bugs.launchpad.net/bugs/509180

      Cc: <stable@kernel.org>
      Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
| * | | | eCryptfs: Add reference counting to lower files  [Tyler Hicks, 2011-04-26, 6 files, -59/+92]
      For any given lower inode, eCryptfs keeps only one lower file open and multiplexes all eCryptfs file operations through that lower file. The lower file was considered "persistent" and stayed open from the first lookup through the lifetime of the inode.

      This patch keeps the notion of a single, per-inode lower file, but adds reference counting around the lower file so that it is closed when not currently in use. If the reference count is at 0 when an operation (such as open, create, etc.) needs to use the lower file, a new lower file is opened. Since the file is no longer persistent, all references to the term persistent file are changed to lower file.

      Locking is added around the sections of code that open the lower file and assign the pointer in the inode info, as well as the code that fputs the lower file when all eCryptfs users are done with it.

      This patch is needed to fix issues, when mounted on top of the NFSv3 client, where the lower file is left silly renamed until the eCryptfs inode is destroyed.

      Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
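The open-on-first-use, close-on-last-put lifecycle described in this message can be modelled in a few lines. This is a simplified sketch with hypothetical names (sk_inode_info, sk_get_lower_file(), sk_put_lower_file()): a boolean stands in for the lower file pointer, and the locking the message mentions is left out, so it is only valid single-threaded.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-inode state; a bool stands in for the lower file pointer. */
struct sk_inode_info {
	int  lower_file_count;  /* current users of the lower file */
	bool lower_file_open;   /* whether the lower file is currently open */
	int  opens;             /* how many times the lower file has been opened */
};

/* Take a reference; open the lower file only if nobody holds it yet.
 * Real code would do this under a lock. */
static void sk_get_lower_file(struct sk_inode_info *ii)
{
	if (ii->lower_file_count++ == 0) {
		ii->lower_file_open = true;
		ii->opens++;
	}
}

/* Drop a reference; close the lower file when the last user is done. */
static void sk_put_lower_file(struct sk_inode_info *ii)
{
	if (--ii->lower_file_count == 0)
		ii->lower_file_open = false;
}
```

Two overlapping users share one open; once both have put their reference the file is closed, and the next get opens it again. That is the property that keeps the lower file from lingering for the whole lifetime of the inode.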
| * | | | eCryptfs: dput dentries returned from dget_parent  [Tyler Hicks, 2011-04-26, 1 file, -2/+2]
      Call dput on the dentries previously returned by dget_parent() in ecryptfs_rename(). This is needed for supporting eCryptfs mounts on top of the NFSv3 client.

      Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
| * | | | eCryptfs: Remove extra d_delete in ecryptfs_rmdir  [Tyler Hicks, 2011-04-26, 1 file, -2/+0]
      vfs_rmdir() already calls d_delete() on the lower dentry. That was being duplicated in ecryptfs_rmdir() and caused a NULL pointer dereference when NFSv3 was the lower filesystem.

      Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
* | | | | add hlist_bl_lock/unlock helpers  [Christoph Hellwig, 2011-04-26, 2 files, -20/+8]
    Now that the whole dcache_hash_bucket crap is gone, go all the way and also remove the weird locking layering violations for locking the hash buckets. Add hlist_bl_lock/unlock helpers to move the locking into the list abstraction instead of requiring each caller to open code it. After all allowing for the bit locks is the whole point of these helpers over the plain hlist variant.

    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | Merge branch 'dcache-cleanup'  [Linus Torvalds, 2011-04-24, 1 file, -26/+16]
    * dcache-cleanup:
      vfs: get rid of insane dentry hashing rules
| * | | | vfs: get rid of insane dentry hashing rules  [Linus Torvalds, 2011-04-24, 1 file, -26/+16]
      The dentry hashing rules have been really quite complicated for a long while, in odd ways. That made functions like __d_drop() very fragile and non-obvious.

      In particular, whether a dentry was hashed or not was indicated with an explicit DCACHE_UNHASHED bit. That's despite the fact that the hash abstraction that the dentries use actually has a 'is this entry hashed or not' model (which is a simple test of the 'pprev' pointer).

      The reason that was done is because we used the normal 'is this entry unhashed' model to mark whether the dentry had _ever_ been hashed in the dentry hash tables, and that logic goes back many years (commit b3423415fbc2: "dcache: avoid RCU for never-hashed dentries").

      That, in turn, meant that __d_drop had totally different unhashing logic for the dentry hash table case and for the anonymous dcache case, because in order to use the "is this dentry hashed" logic as a flag for whether it had ever been on the RCU hash table, we had to unhash such a dentry differently so that we'd never think that it wasn't 'unhashed' and wouldn't be free'd correctly.

      That's just insane. It made the logic really hard to follow, when there were two different kinds of "unhashed" states, and one of them (the one that used "list_bl_unhashed()") really had nothing at all to do with being unhashed per se, but with a very subtle lifetime rule instead.

      So turn all of it around, and make it logical.

      Instead of having a DCACHE_UNHASHED bit in d_flags to indicate whether the dentry is on the hash chains or not, use the hash chain unhashed logic for that. Suddenly "d_unhashed()" just uses "list_bl_unhashed()", and everything makes sense.

      And for the lifetime rule, just use an explicit DENTRY_RCUACCESS bit. If we ever insert the dentry into the dentry hash table so that it is visible to RCU lookup, we mark it DENTRY_RCUACCESS to show that it now needs the RCU lifetime rules. Now suddenly that test at dentry free time makes sense too.

      And because unhashing now is sane and doesn't depend on where the dentry got unhashed from (because the dentry hash chain details don't have some subtle side effects), we can re-unify the __d_drop() logic and use common code for the unhashing.

      Also fix one more open-coded hash chain bit_spin_lock() that I missed in the previous chain locking cleanup commit.

      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
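The 'pprev' test this message relies on can be shown with a plain hash-chain node. The sketch below is a userspace model with hypothetical sk_ names; it follows the shape of the kernel's plain hlist rather than the bit-locked hlist_bl variant the dcache actually uses, and it leaves out locking and RCU entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* A chain node: 'pprev' points at whatever pointer currently points at us. */
struct sk_node { struct sk_node *next; struct sk_node **pprev; };
struct sk_head { struct sk_node *first; };

/* A node is unhashed exactly when nothing points at it. */
static bool sk_unhashed(const struct sk_node *n)
{
	return n->pprev == NULL;
}

/* Insert at the head of the chain. */
static void sk_add_head(struct sk_node *n, struct sk_head *h)
{
	n->next = h->first;
	if (h->first)
		h->first->pprev = &n->next;
	h->first = n;
	n->pprev = &h->first;
}

/* Unlink and reset, so sk_unhashed() is true again afterwards. */
static void sk_del_init(struct sk_node *n)
{
	if (sk_unhashed(n))
		return;
	*n->pprev = n->next;
	if (n->next)
		n->next->pprev = n->pprev;
	n->next = NULL;
	n->pprev = NULL;
}
```

Because membership is derived from the list pointers themselves, there is no separate "unhashed" bit that could drift out of sync with the chain. A lifetime property such as "was ever visible to RCU lookup" would then be tracked by its own independent flag.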
* | | | | Merge branch 'for-linus' of git://git.infradead.org/ubifs-2.6  [Linus Torvalds, 2011-04-24, 2 files, -8/+47]
    * 'for-linus' of git://git.infradead.org/ubifs-2.6:
      UBIFS: fix master node recovery
      UBIFS: fix false assertion warning in case of I/O failures
      UBIFS: fix false space checking failure