summaryrefslogtreecommitdiffstats
path: root/fs/nfs (follow)
Commit message (Collapse)AuthorAgeFilesLines
* NFSv4: Fix up delegation callbacksTrond Myklebust2008-12-231-19/+38
| | | | | | | Currently, the callback server is listening on IPv6 if it is enabled. This means that IPv4 addresses will always be mapped. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Return unreferenced delegations more promptlyTrond Myklebust2008-12-234-13/+49
| | | | | | | | | If the client is not using a delegation, the right thing to do is to return it as soon as possible. This helps reduce the amount of state the server has to track, as well as reducing the potential for conflicts with other clients. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Clean up the asynchronous delegation returnTrond Myklebust2008-12-231-54/+19
| | | | | | | Reuse the state management thread in order to return delegations when we get a callback. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Clean up nfs_expire_all_delegations()Trond Myklebust2008-12-234-34/+16
| | | | | | | Let the actual delegreturn stuff be run in the state manager thread rather than allocating a separate kthread. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Fix a BAD_SEQUENCEID condition.Trond Myklebust2008-12-231-2/+0
| | | | | | | | | | We really shouldn't be resetting the sequence ids when doing state expiration recovery, since we don't know if the server still remembers our previous state owners. There are servers out there that do attempt to preserve client state even if the lease has expired. Such a server would only release that state if a conflicting OPEN request occurs. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Don't exit the state management if there are still tasks to doTrond Myklebust2008-12-231-2/+6
| | | | | | Fix up a potential race... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Rename the state reclaimer threadTrond Myklebust2008-12-234-25/+32
| | | | | | It is really a more general purpose state management thread at this point. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Clean up NFS4ERR_CB_PATH_DOWN error management...Trond Myklebust2008-12-234-10/+16
| | | | | | | Add a delegation cleanup phase to the state management loop, and do the NFS4ERR_CB_PATH_DOWN recovery there. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Clean up the support for returning multiple delegationsTrond Myklebust2008-12-233-55/+52
| | | | | | | | | Add a flag to mark delegations as requiring return, then run a garbage collector. In the future, this will allow for more flexible delegation management, where delegations may be marked for return if it turns out that they are not being referenced. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Add recovery for individual stateidsTrond Myklebust2008-12-233-10/+31
| | | | | | | | | | | | | | | NFSv4 defines a number of state errors which the client does not currently handle. Among those we should worry about are: NFS4ERR_ADMIN_REVOKED - the server's administrator revoked our locks and/or delegations. NFS4ERR_BAD_STATEID - the client and server are out of sync, possibly due to a delegation return racing with an OPEN request. NFS4ERR_OPENMODE - the client attempted to do something not sanctioned by the open mode of the stateid. Should normally just occur as a result of a delegation return race. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Remove nfs_client->cl_semTrond Myklebust2008-12-235-45/+1
| | | | | | | | | | Now that we're using the flags to indicate state that needs to be recovered, as well as having implemented proper refcounting and spinlocking on the state and open_owners, we can get rid of nfs_client->cl_sem. The only remaining case that was dubious was the file locking, and that case is now covered by the nfsi->rwsem. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Ensure that file unlock requests don't conflict with state recoveryTrond Myklebust2008-12-232-10/+20
| | | | | | | | | | The unlock path is currently failing to take the nfs_client->cl_sem read lock, and hence the recovery path may see locks disappear from underneath it. Also ensure that it takes the nfs_inode->rwsem read lock so that it there is no conflict with delegation recalls. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Remove the unnecessary argument to nfs4_wait_clnt_recover()Trond Myklebust2008-12-231-76/+75
| | | | | | | ...and move some code around in order to clear out an unnecessary forward declaration. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Ensure that nfs4_reclaim_open_state() doesn't depend on cl_semTrond Myklebust2008-12-231-1/+10
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Add a recovery marking scheme for state ownersTrond Myklebust2008-12-233-4/+26
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Don't tell server we rebooted when not necessaryTrond Myklebust2008-12-231-11/+11
| | | | | | Instead of doing a full setclientid, try doing a RENEW call first. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Remove redundant RENEW calls if we know the lease has expiredTrond Myklebust2008-12-233-19/+32
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Fix state recovery when the client runs over the grace periodTrond Myklebust2008-12-233-49/+155
| | | | | | | | | If the client for some reason is not able to recover all its state within the time allotted for the grace period, and the server reboots again, the client is not allowed to recover the state that was 'lost' using reboot recovery. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Callers to nfs4_get_renew_cred() need to hold nfs_client->cl_lockTrond Myklebust2008-12-233-7/+17
| | | | | | Ditto for nfs4_get_setclientid_cred(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Clean up for the state loss reclaimerTrond Myklebust2008-12-231-49/+81
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Use atomic bitops when changing struct nfs_delegation->flagsTrond Myklebust2008-12-233-9/+9
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Fix up the dereferencing of delegation->inodeTrond Myklebust2008-12-231-8/+31
| | | | | | | | Without an extra lock, we cannot just assume that the delegation->inode is valid when we're traversing the rcu-protected nfs_client lists. Use the delegation->lock to ensure that it is truly valid. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Fix up another delegation related raceTrond Myklebust2008-12-233-28/+60
| | | | | | | | | When we can update_open_stateid(), we need to be certain that we don't race with a delegation return. While we could do this by grabbing the nfs_client->cl_lock, a dedicated spin lock in the delegation structure will scale better. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NLM: allow lockd requests from an unprivileged portChuck Lever2008-12-231-0/+2
| | | | | | | | | | If the admin has specified the "noresvport" option for an NFS mount point, the kernel's NFS client uses an unprivileged source port for the main NFS transport. The kernel's lockd client should use an unprivileged port in this case as well. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: "[no]resvport" mount option changes mountd client tooChuck Lever2008-12-233-1/+5
| | | | | | | | | | If the admin has specified the "noresvport" option for an NFS mount point, the kernel's NFS client uses an unprivileged source port for the main NFS transport. The kernel's mountd client should use an unprivileged port in this case as well. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: add "[no]resvport" mount optionChuck Lever2008-12-232-5/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The standard default security setting for NFS is AUTH_SYS. An NFS client connects to NFS servers via a privileged source port and a fixed standard destination port (2049). The client sends raw uid and gid numbers to identify users making NFS requests, and the server assumes an appropriate authority on the client has vetted these values because the source port is privileged. On Linux, by default in-kernel RPC services use a privileged port in the range between 650 and 1023 to avoid using source ports of well- known IP services. Using such a small range limits the number of NFS mount points and the number of unique NFS servers to which a client can connect concurrently. An NFS client can use unprivileged source ports to expand the range of source port numbers, allowing more concurrent server connections and more NFS mount points. Servers must explicitly allow NFS connections from unprivileged ports for this to work. In the past, bumping the value of the sunrpc.max_resvport sysctl on the client would permit the NFS client to use unprivileged ports. Bumping this setting also changes the maximum port number used by other in-kernel RPC services, some of which still required a port number less than 1023. This is exacerbated by the way source port numbers are chosen by the Linux RPC client, which starts at the top of the range and works downwards. It means that bumping the maximum means all RPC services requesting a source port will likely get an unprivileged port instead of a privileged one. Changing this setting effects all NFS mount points on a client. A sysadmin could not selectively choose which mount points would use non-privileged ports and which could not. Lastly, this mechanism of expanding the limit on the number of NFS mount points was entirely undocumented. To address the need for the NFS client to use a large range of source ports without interfering with the activity of other in-kernel RPC services, we introduce a new NFS mount option. This option explicitly tells only the NFS client to use a non-privileged source port when communicating with the NFS server for one specific mount point. This new mount option is called "resvport," like the similar NFS mount option on FreeBSD and Mac OS X. A sister patch for nfs-utils will be submitted that documents this new option in nfs(5). The default setting for this new mount option requires the NFS client to use a privileged port, as before. Explicitly specifying the "noresvport" mount option allows the NFS client to use an unprivileged source port for this mount point when connecting to the NFS server port. This mount option is supported only for text-based NFS mounts. [ Sidebar: it is widely known that security mechanisms based on the use of privileged source ports are ineffective. However, the NFS client can combine the use of unprivileged ports with the use of secure authentication mechanisms, such as Kerberos. This allows a large number of connections and mount points while ensuring a useful level of security. Eventually we may change the default setting for this option depending on the security flavor used for the mount. For example, if the mount is using only AUTH_SYS, then the default setting will be "resvport;" if the mount is using a strong security flavor such as krb5, the default setting will be "noresvport." ] Signed-off-by: Chuck Lever <chuck.lever@oracle.com> [Trond.Myklebust@netapp.com: Fixed a bug whereby nfs4_init_client() was being called with incorrect arguments.] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: move nfs_server flag initializationChuck Lever2008-12-231-8/+8
| | | | | | | | | | | | | Make it possible for the NFSv4 mount set up logic to pass mount option flags down the stack to nfs_create_rpc_client(). This is immediately useful if we want NFS mount options to modulate settings of the underlying RPC transport, but it may be useful at some later point if other parts of the NFSv4 mount initialization logic want to know what the mount options are. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: expand flags passed to nfs_create_rpc_client()Chuck Lever2008-12-231-4/+8
| | | | | | | | | | | | | The nfs_create_rpc_client() function sets up an RPC client for an NFS mount point. Add an option that allows it to set up an RPC transport from an unprivileged port. Instead of having nfs_create_rpc_client()'s callers retain local knowledge about how to set up an RPC client, create a couple of flag arguments to control the use of RPC_CLNT_CREATE flags. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: introduce nfs_mount_info struct for calling nfs_mount()Chuck Lever2008-12-234-40/+49
| | | | | | | | Clean up: convert nfs_mount() to take a single data structure argument to make it simpler to add more arguments. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Move declaration of nfs_mount() to fs/nfs/internal.hChuck Lever2008-12-232-0/+6
| | | | | | | | Clean up: The nfs_mount() function is not to be used outside of the NFS client. Move its public declaration to fs/nfs/internal.h. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: rename nfs_path variableChuck Lever2008-12-231-5/+5
| | | | | | | | | | Clean up: I'm about to move the declaration of nfs_mount into fs/nfs/internal.h and include it in fs/nfs/nfsroot.c. There's a conflicting definition of nfs_path in fs/nfs/internal.h and fs/nfs/nfsroot.c, so rename the private one. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* nfs: remove redundant tests on reading new pagesWu Fengguang2008-12-231-6/+0
| | | | | | | | | | | | | | | | | | | | | aops->readpages() and its NFS helper readpage_async_filler() will only be called to do readahead I/O for newly allocated pages. So it's not necessary to test for the always 0 dirty/uptodate page flags. The removal of nfs_wb_page() call also fixes a readahead bug: the NFS readahead has been synchronous since 2.6.23, because that call will clear PG_readahead, which is the reminder for asynchronous readahead. More background: the PG_readahead page flag is shared with PG_reclaim, one for read path and the other for write path. clear_page_dirty_for_io() unconditionally clears PG_readahead to prevent possible readahead residuals, assuming itself to be always called in the write path. However, NFS is one and the only exception in that it _always_ calls clear_page_dirty_for_io() in the read path, i.e. for readpages()/readpage(). Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Wu Fengguang <wfg@linux.intel.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Convert nfs_attr_generation_counter into an atomic_longTrond Myklebust2008-10-281-8/+3
| | | | | | | | | The most important property we need from nfs_attr_generation_counter is monotonicity, which is not guaranteed by the current system of smp memory barriers. We should convert it to an atomic_long_t, and drop the memory barriers. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* Switch to a valid email address...Alan Cox2008-10-272-2/+2
| | | | | Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* [PATCH] move executable checking into ->permission()Miklos Szeredi2008-10-231-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | For execute permission on a regular files we need to check if file has any execute bits at all, regardless of capabilites. This check is normally performed by generic_permission() but was also added to the case when the filesystem defines its own ->permission() method. In the latter case the filesystem should be responsible for performing this check. Move the check from inode_permission() inside filesystems which are not calling generic_permission(). Create a helper function execute_ok() that returns true if the inode is a directory or if any execute bits are present in i_mode. Also fix up the following code: - coda control file is never executable - sysctl files are never executable - hfs_permission seems broken on MAY_EXEC, remove - hfsplus_permission is eqivalent to generic_permission(), remove Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
* [PATCH] switch all filesystems over to d_obtain_aliasChristoph Hellwig2008-10-231-8/+6
| | | | | | | Switch all users of d_alloc_anon to d_obtain_alias. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* [PATCH] make O_EXCL in nd->intent.flags visible in nd->flagsAl Viro2008-10-231-4/+2
| | | | | | | New flag: LOOKUP_EXCL. Set before doing the final step of pathname resolution on the paths that have LOOKUP_CREATE and O_EXCL. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* Merge git://git.linux-nfs.org/projects/trondmy/nfs-2.6Linus Torvalds2008-10-203-8/+21
|\ | | | | | | | | | | | | * git://git.linux-nfs.org/projects/trondmy/nfs-2.6: NFS: use correct fs type for v4 submounts and referrals Make nfs_file_cred more robust. NFS: Enable NFSv4 callback server to listen on AF_INET6 sockets
| * NFS: use correct fs type for v4 submounts and referralsAndy Adamson2008-10-171-2/+2
| | | | | | | | | | Signed-off-by: Andy Adamson<andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * Make nfs_file_cred more robust.Neil Brown2008-10-171-2/+4
| | | | | | | | | | | | | | | | | | As not all files have an associated open_context (e.g. device special files), it is safest to test for the existence of the open context before de-referencing it. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * NFS: Enable NFSv4 callback server to listen on AF_INET6 socketsChuck Lever2008-10-171-4/+15
| | | | | | | | | | | | | | | | Allow the NFS callback server to listen for requests via an AF_INET6 or AF_INET socket when IPv6 support is present in the kernel. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* | vmscan: split LRU lists into anon & file setsRik van Riel2008-10-201-1/+1
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Split the LRU lists in two, one set for pages that are backed by real file systems ("file") and one for pages that are backed by memory and swap ("anon"). The latter includes tmpfs. The advantage of doing this is that the VM will not have to scan over lots of anonymous pages (which we generally do not want to swap out), just to find the page cache pages that it should evict. This patch has the infrastructure and a basic policy to balance how much we scan the anon lists and how much we scan the file lists. The big policy changes are in separate patches. [lee.schermerhorn@hp.com: collect lru meminfo statistics from correct offset] [kosaki.motohiro@jp.fujitsu.com: prevent incorrect oom under split_lru] [kosaki.motohiro@jp.fujitsu.com: fix pagevec_move_tail() doesn't treat unevictable page] [hugh@veritas.com: memcg swapbacked pages active] [hugh@veritas.com: splitlru: BDI_CAP_SWAP_BACKED] [akpm@linux-foundation.org: fix /proc/vmstat units] [nishimura@mxp.nes.nec.co.jp: memcg: fix handling of shmem migration] [kosaki.motohiro@jp.fujitsu.com: adjust Quicklists field of /proc/meminfo] [kosaki.motohiro@jp.fujitsu.com: fix style issue of get_scan_ratio()] Signed-off-by: Rik van Riel <riel@redhat.com> Signed-off-by: Lee Schermerhorn <Lee.Schermerhorn@hp.com> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'next'Trond Myklebust2008-10-1514-211/+321
|\
| * NFS: Fix a resolution problem with nfs_inode->cache_change_attributeTrond Myklebust2008-10-152-2/+1
| | | | | | | | | | | | | | | | | | | | | | The cache_change_attribute is used to decide whether or not a directory has changed, in which case we may need to look it up again. Again, the use of 'jiffies' leads to an issue of resolution. Once again, the fix is to change nfs_inode->cache_change_attribute, and just make it a simple counter. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * NFS: Fix the resolution problem with nfs_inode_attrs_need_update()Trond Myklebust2008-10-152-11/+40
| | | | | | | | | | | | | | | | | | | | | | | | It appears that 'jiffies' timestamps do not have high enough resolution for nfs_inode_attrs_need_update(). One problem is that a GETATTR can be launched within < 1 jiffy of the last operation that updated the attribute. Another problem is that RPC calls can take < 1 jiffy to execute. We can fix this by switching the variables to use a simple global counter that gets incremented every time we start another GETATTR call. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * NFS: Changes to inode->i_nlinks must set the NFS_INO_INVALID_ATTR flagTrond Myklebust2008-10-151-0/+3
| | | | | | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * NFS: fix nfs_parse_ip_address() corner caseChuck Lever2008-10-101-11/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Bruce observed that nfs_parse_ip_address() will successfully parse an IPv6 address that looks like this: "::1%" A scope delimiter is present, but there is no scope ID following it. This is harmless, as it would simply set the scope ID to zero. However, in some cases we would like to flag this as an improperly formed address. We are now also careful to reject addresses where garbage follows the address (up to the length of the string), instead of ignoring the non-address characters; and where the scope ID is nonsense (not a valid device name, but also not numeric). Before, both of these cases would result in a harmless zero scope ID. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * NFS: Cleanup nfs_set_portJ. Bruce Fields2008-10-101-10/+9
| | | | | | | | | | Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * NFS: Fix attribute updatesTrond Myklebust2008-10-091-9/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes a regression seen when running the Connectathon testsuite against an ext3 filesystem. The reason was that the inode was constantly being marked as 'just updated' by the jiffy wraparound test. This again meant that newer GETATTR calls were failing to pass the nfs_inode_attrs_need_update() test unless the changes caused a ctime update on the server, since they were perceived as having been started before the latest inode update. Given that nfs_inode_attrs_need_update() already checks for wraparound of nfsi->last_updated, we can drop the buggy "protection" in nfs_update_inode(). Also make a slight micro-optimisation of nfs_inode_attrs_need_update(): we are more often going to see time_after(fattr->time_start, nfsi->last_updated) be true, rather than seeing an update of ctime/size, so put that test first to ensure that we optimise away the ctime/size tests. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * NFS: Don't use range_cyclic for data integrity syncsTrond Myklebust2008-10-081-1/+2
| | | | | | | | | | | | | | It is more efficient to write linearly starting from the beginning of the file. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>