path: root/fs/nfsd/xdr4.h
Commit message | Author | Age | Files | Lines
* nfsd: implement pNFS operations | Christoph Hellwig | 2015-02-02 | 1 | -0/+59
  Add support for the GETDEVICEINFO, LAYOUTGET, LAYOUTCOMMIT and LAYOUTRETURN NFSv4.1 operations, as well as backing code to manage outstanding layouts and devices.

  Layout management is very straightforward, with a nfs4_layout_stateid structure that extends nfs4_stid to manage layout stateids as the top-level structure. It is linked into the nfs4_file and nfs4_client structures like the other stateids, and contains a linked list of layouts that hang off the stateid. The actual layout operations are implemented in layout drivers that are not part of this commit, but will be added later.

  The worst part of this commit is the management of the pNFS device IDs, which suffers from a specification that is not sanely implementable: the device IDs are global and not bound to an export, are too small to store the fsid portion of a file handle, and must never be reused. As we still need to perform all export authentication and validation checks on a device ID passed to GETDEVICEINFO, we are caught between a rock and a hard place. To work around this issue we add a new hash that maps from a 64-bit integer to a fsid so that we can look up the export to authenticate against it, a 32-bit integer as a generation that we can bump when changing the device, and a currently unused 32-bit integer that could be used in the future to handle more than a single device per export. Entries in this hash table are never deleted, as we can't reuse the IDs anyway, and they would have a severe lifetime problem because Linux export structures are temporary structures that can go away under load.

  Parts of the XDR data, structures and marshaling/unmarshaling code, as well as many concepts, are derived from the old pNFS server implementation from Andy Adamson, Benny Halevy, Dean Hildebrand, Marc Eshel, Fred Isaman, Mike Sager, Ricardo Labiaga and many others.

  Signed-off-by: Christoph Hellwig <hch@lst.de>
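  The device-ID workaround described in this commit message can be pictured with a small user-space sketch. It is not the kernel code: every name here (`sketch_deviceid`, `sketch_map_entry`, `sketch_map_add`, `sketch_map_find`) is hypothetical, and the fixed-size fsid buffer is an assumption made to keep the example short. It only illustrates the idea of a never-shrinking map from a 64-bit index to an fsid, with the index carried inside a 16-byte device ID next to a generation counter and a spare field.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_BUCKETS  64
#define SKETCH_FSID_MAX 32

/* Hypothetical 16-byte device ID: map index, generation, spare field. */
struct sketch_deviceid {
	uint64_t fsid_idx;    /* key into the index -> fsid map */
	uint32_t generation;  /* bumped when the device changes */
	uint32_t pad;         /* currently unused */
};

struct sketch_map_entry {
	struct sketch_map_entry *next;
	uint64_t idx;
	size_t fsid_len;
	unsigned char fsid[SKETCH_FSID_MAX];
};

static struct sketch_map_entry *sketch_buckets[SKETCH_BUCKETS];
static uint64_t sketch_next_idx = 1;  /* 0 is reserved for "no entry" */

/* Look up an entry by index; NULL if the index was never handed out. */
static const struct sketch_map_entry *sketch_map_find(uint64_t idx)
{
	const struct sketch_map_entry *e;

	for (e = sketch_buckets[idx % SKETCH_BUCKETS]; e; e = e->next)
		if (e->idx == idx)
			return e;
	return NULL;
}

/*
 * Return the index for this fsid, allocating one on first use (0 on failure).
 * Entries are never freed, so an index is never reused.
 */
static uint64_t sketch_map_add(const void *fsid, size_t len)
{
	struct sketch_map_entry *e;
	size_t b;

	assert(fsid != NULL);
	if (len == 0 || len > SKETCH_FSID_MAX)
		return 0;
	/* Linear scan for an existing mapping; fine for a sketch. */
	for (b = 0; b < SKETCH_BUCKETS; b++)
		for (e = sketch_buckets[b]; e; e = e->next)
			if (e->fsid_len == len && memcmp(e->fsid, fsid, len) == 0)
				return e->idx;
	e = calloc(1, sizeof(*e));
	if (!e)
		return 0;
	e->idx = sketch_next_idx++;
	e->fsid_len = len;
	memcpy(e->fsid, fsid, len);
	b = e->idx % SKETCH_BUCKETS;
	e->next = sketch_buckets[b];
	sketch_buckets[b] = e;
	return e->idx;
}
```

  A GETDEVICEINFO handler in this model would take `fsid_idx` from the device ID, call `sketch_map_find()`, and use the recovered fsid to locate the export before running the usual authentication checks.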
* nfsd: Add DEALLOCATE support | Anna Schumaker | 2014-11-07 | 1 | -0/+1
  DEALLOCATE only returns a status value, meaning we can use the noop() xdr encoder to reply to the client.

  Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
  Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* nfsd: Add ALLOCATE support | Anna Schumaker | 2014-11-07 | 1 | -0/+8
  The ALLOCATE operation is used to preallocate space in a file. I can do this by using vfs_fallocate() to do the actual preallocation.

  ALLOCATE only returns a status indicator, so we don't need to write a special encode() function.

  Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
  Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* NFSD: Implement SEEK | Anna Schumaker | 2014-09-29 | 1 | -0/+14
  This patch adds server support for the NFS v4.2 operation SEEK, which returns the position of the next hole or data segment in a file.

  Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
  Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* nfsd: Add a mutex to protect the NFSv4.0 open owner replay cache | Jeff Layton | 2014-07-31 | 1 | -1/+4
  We don't want to rely on the client_mutex for protection in the case of NFSv4 open owners. Instead, we add a mutex that will only be taken for NFSv4.0 state mutating operations, and that will be released once the entire compound is done.

  Also, ensure that nfsd4_cstate_assign_replay/nfsd4_cstate_clear_replay take a reference to the stateowner when they are using it for NFSv4.0 open and lock replay caching.

  Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
  Signed-off-by: Jeff Layton <jlayton@primarydata.com>
  Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* nfsd: Allow struct nfsd4_compound_state to cache the nfs4_client | Jeff Layton | 2014-07-10 | 1 | -0/+1
  We want to use the nfsd4_compound_state to cache the nfs4_client in order to optimise away extra lookups of the clid.

  In the v4.0 case, we use this to ensure that we only have to look up the client at most once per compound for each call into lookup_clientid. For v4.1+ we set the pointer in the cstate during SEQUENCE processing so we should never need to do a search for it.

  Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
  Reviewed-by: Christoph Hellwig <hch@lst.de>
  Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* nfsd: Cleanup nfs4svc_encode_compoundres | Trond Myklebust | 2014-07-08 | 1 | -1/+1
  Move the slot return, put session etc into a helper in fs/nfsd/nfs4state.c instead of open coding in nfs4svc_encode_compoundres.

  Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
  Reviewed-by: Christoph Hellwig <hch@lst.de>
  Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* nfsd4: replace defer_free by svcxdr_tmpalloc | J. Bruce Fields | 2014-07-08 | 1 | -4/+9
  Avoid an extra allocation for the tmpbuf struct itself, and stop ignoring some allocation failures.

  Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* nfsd4: remove unused defer_free argument | J. Bruce Fields | 2014-07-08 | 1 | -1/+0
  28e05dd8457c "knfsd: nfsd4: represent nfsv4 acl with array instead of linked list" removed the last user that wanted a custom free function.

  Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* nfsd4: rename cr_linkname->cr_data | J. Bruce Fields | 2014-07-08 | 1 | -4/+4
  The name of a link is currently stored in cr_name and cr_namelen, and the content in cr_linkname and cr_linklen. That's confusing.

  Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* nfsd4: allow large readdirs | J. Bruce Fields | 2014-05-30 | 1 | -3/+2
  Currently we limit readdir results to a single page. This can result in a performance regression compared to NFSv3 when reading large directories.

  Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* nfsd4: more precise nfsd4_max_reply | J. Bruce Fields | 2014-05-30 | 1 | -0/+1
  It will turn out to be useful to have a more accurate estimate of reply size; so, piggyback on the existing op reply-size estimators.

  Also move nfsd4_max_reply to nfs4proc.c to get easier access to struct nfsd4_operation and friends. (Thanks to Christoph Hellwig for pointing out that simplification.)

  Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* nfsd4: convert 4.1 replay encoding | J. Bruce Fields | 2014-05-30 | 1 | -1/+1
  Limits on maxresp_sz mean that we only ever need to replay rpc's that are contained entirely in the head.

  The one exception is very small zero-copy reads. That's an odd corner case as clients wouldn't normally ask those to be cached.

  In any case, this seems a little more robust.

  Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* nfsd4: teach encoders to handle reserve_space failures | J. Bruce Fields | 2014-05-30 | 1 | -1/+1
  We've tried to prevent running out of space with COMPOUND_SLACK_SPACE and special checking in those operations (getattr) whose result can vary enormously. However:

  - COMPOUND_SLACK_SPACE may be difficult to maintain as we add more protocol.
  - BUG_ON or page faulting on failure seems overly fragile.
  - Especially in the 4.1 case, we prefer not to fail compounds just because the returned result came *close* to session limits. (Though perfect enforcement here may be difficult.)
  - I'd prefer encoding to be uniform for all encoders instead of having special exceptions for encoders containing, for example, attributes.

  Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* nfsd4: fix encoding of out-of-space replies | J. Bruce Fields | 2014-05-27 | 1 | -0/+2
  If nfsd4_check_resp_size() returns an error then we should really be truncating the reply here, otherwise we may leave extra garbage at the end of the rpc reply.

  Also add a warning to catch any cases where our reply-size estimates may be wrong in the case of a non-idempotent operation.

  Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* nfsd4: tweak nfsd4_encode_getattr to take xdr_stream | J. Bruce Fields | 2014-05-23 | 1 | -3/+4
  Just change the nfsd4_encode_getattr api. Not changing any code or adding any new functionality yet.

  Reviewed-by: Christoph Hellwig <hch@lst.de>
  Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* nfsd4: embed xdr_stream in nfsd4_compoundres | J. Bruce Fields | 2014-05-23 | 1 | -3/+1
  This is a mechanical transformation with no change in behavior.

  Reviewed-by: Christoph Hellwig <hch@lst.de>
  Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* nfsd4: nfsd4_replay_cache_entry should be static | J. Bruce Fields | 2014-03-29 | 1 | -2/+0
  This isn't actually used anywhere else.

  Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* Define op_iattr for nfsd4_open instead of using macro | Kinglong Mee | 2014-01-06 | 1 | -2/+1
  Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
  Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* nfsd: using nfsd4_encode_noop for encoding destroy_session/free_stateid | Kinglong Mee | 2014-01-03 | 1 | -1/+0
  Get rid of the extra code, using nfsd4_encode_noop for encoding destroy_session and free_stateid. Also delete the unused argument (fr_status) in nfsd4_free_stateid.

  Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
  Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* NFSD: Server implementation of MAC Labeling | David Quigley | 2013-05-15 | 1 | -0/+4
  Implement labeled NFS on the server: encoding and decoding, and writing and reading, of file labels.

  Enabled with CONFIG_NFSD_V4_SECURITY_LABEL.

  Signed-off-by: Matthew N. Dodd <Matthew.Dodd@sparta.com>
  Signed-off-by: Miguel Rodel Felipe <Rodel_FM@dsi.a-star.edu.sg>
  Signed-off-by: Phua Eu Gene <PHUA_Eu_Gene@dsi.a-star.edu.sg>
  Signed-off-by: Khin Mi Mi Aung <Mi_Mi_AUNG@dsi.a-star.edu.sg>
  Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* nfsd4: cleanup handling of nfsv4.0 closed stateid's | J. Bruce Fields | 2013-04-08 | 1 | -0/+1
  Closed stateid's are kept around a little while to handle close replays in the 4.0 case. So we stash them in the last-used stateid in the oo_last_closed_stateid field of the open owner. We can free that in encode_seqid_op_tail once the seqid on the open owner is next incremented. But we don't want to do that on the close itself; so we set the NFS4_OO_PURGE_CLOSE flag on the open owner, skip freeing it the first time through encode_seqid_op_tail, then when we see that flag set next time we free it.

  This is unnecessarily baroque.

  Instead, just move the logic that increments the seqid out of the xdr code and into the operation code itself.

  The justification given for the current placement is that we need to wait till the last minute to be sure we know whether the status is a sequence-id-mutating error or not, but examination of the code shows that can't actually happen.

  Reported-by: Yanchuan Nian <ycnian@gmail.com>
  Tested-by: Yanchuan Nian <ycnian@gmail.com>
  Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* nfsd: remove unused macro in nfsv4 | Yanchuan Nian | 2013-04-03 | 1 | -1/+0
  lk_rflags is never used anywhere, and rflags is not defined in struct nfsd4_lock.

  Signed-off-by: Yanchuan Nian <ycnian@gmail.com>
  Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* nfsd4: handle seqid-mutating open errors from xdr decoding | J. Bruce Fields | 2013-04-03 | 1 | -0/+1
  If a client sets an owner (or group_owner or acl) attribute on open for create, and the mapping of that owner to an id fails, then we return BAD_OWNER. But BAD_OWNER is a seqid-mutating error, so we can't shortcut the open processing in that case: we have to at least look up the owner so we can find the seqid to bump.

  Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* nfsd4: simplify nfsd4_encode_fattr interface slightly | J. Bruce Fields | 2013-01-24 | 1 | -1/+1
  It seems slightly simpler to make nfsd4_encode_fattr rather than its callers responsible for advancing the write pointer on success.

  (Also: the count == 0 check in the verify case looks superfluous. Running out of buffer space is really the only reason fattr encoding should fail with eresource.)

  Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* nfsd4: disable zero-copy on non-final read ops | J. Bruce Fields | 2012-12-17 | 1 | -0/+8
  To ensure ordering of read data with any following operations, turn off zero copy if the read is not the final operation in the compound.

  Signed-off-by: J. Bruce Fields <bfields@redhat.com>
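  The rule in this commit message reduces to a single predicate. The sketch below is only a model of that rule; `sketch_opnum` and `sketch_may_zero_copy` are made-up names, and the real server tracks this decision in its own structures rather than by re-scanning the operation array.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical operation codes for a decoded compound. */
enum sketch_opnum { SKETCH_OP_PUTFH, SKETCH_OP_READ, SKETCH_OP_GETATTR };

/*
 * A READ may use zero-copy only when it is the last operation in the
 * compound, so that nothing encoded after it has to be ordered against
 * the read data.
 */
static bool sketch_may_zero_copy(const enum sketch_opnum *ops, int nops, int i)
{
	assert(i >= 0 && i < nops);
	return ops[i] == SKETCH_OP_READ && i == nops - 1;
}
```

  For a compound of PUTFH, READ, GETATTR the READ at index 1 would fall back to copying, while PUTFH, READ would keep zero-copy for the final READ.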
* nfsd4: delay filling in write iovec array till after xdr decoding | J. Bruce Fields | 2012-11-26 | 1 | -1/+0
  Our server rejects compounds containing more than one write operation. It's unclear whether this is really permitted by the spec; with 4.0, it's possibly OK, with 4.1 (which has clearer limits on compound parameters), it's probably not OK. No client that we're aware of has ever done this, but in theory it could be useful.

  The source of the limitation: we need an array of iovecs to pass to the write operation. In the worst case that array of iovecs could have hundreds of elements (the maximum rwsize divided by the page size), so it's too big to put on the stack, or in each compound op. So we instead keep a single such array in the compound argument.

  We fill in that array at the time we decode the xdr operation.

  But we decode every op in the compound before executing any of them. So once we've used that array we can't decode another write.

  If we instead delay filling in that array till the time we actually perform the write, we can reuse it.

  Another option might be to switch to decoding compound ops one at a time. I considered doing that, but it has a number of other side effects, and I'd rather fix just this one problem for now.

  Signed-off-by: J. Bruce Fields <bfields@redhat.com>
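  The approach described in this commit message can be sketched in user space as follows. This is an illustration under stated assumptions, not the kernel code: `sketch_write_args`, `sketch_fill_write_vec` and `SKETCH_PAGE_SIZE` are hypothetical names, and the payload is assumed to sit in one head buffer followed by whole pages. Decoding only records where the data lives; the shared iovec array is filled in when the write is executed, so a later write in the same compound can reuse it.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

#define SKETCH_PAGE_SIZE 4096

/* Hypothetical decoded WRITE arguments: data locations only, no iovec. */
struct sketch_write_args {
	void *head_base;   /* payload bytes in the first buffer */
	size_t head_len;
	void **pages;      /* pages holding the rest of the payload */
	size_t buflen;     /* total payload length in bytes */
};

/*
 * Fill the shared iovec array at execution time.
 * Returns the number of entries used, or -1 if vec is too small.
 */
static int sketch_fill_write_vec(const struct sketch_write_args *w,
				 struct iovec *vec, int maxvec)
{
	size_t left = w->buflen;
	size_t chunk = left < w->head_len ? left : w->head_len;
	int n = 0, p = 0;

	assert(w != NULL && vec != NULL);
	if (maxvec < 1)
		return -1;
	vec[n].iov_base = w->head_base;
	vec[n].iov_len = chunk;
	n++;
	left -= chunk;
	while (left > 0) {
		if (n >= maxvec)
			return -1;
		chunk = left < SKETCH_PAGE_SIZE ? left : SKETCH_PAGE_SIZE;
		vec[n].iov_base = w->pages[p++];
		vec[n].iov_len = chunk;
		n++;
		left -= chunk;
	}
	return n;
}
```

  Because nothing is written into `vec` during decoding, two writes in one compound can call `sketch_fill_write_vec()` on the same array one after the other.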
* nfsd4: move more write parameters into xdr argument | J. Bruce Fields | 2012-11-26 | 1 | -0/+2
  In preparation for moving some of this elsewhere.

  Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* nfsd: use service net instead of hard-coded init_net | Stanislav Kinsbursky | 2012-11-15 | 1 | -1/+1
  This patch replaces init_net by SVC_NET() where possible, and also passes proper context to nested functions where required.

  Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
  Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* nfsd4: implement backchannel_ctl operation | J. Bruce Fields | 2012-11-08 | 1 | -0/+2
  This operation is mandatory for servers to implement.

  Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* nfsd4: int/__be32 fixes | J. Bruce Fields | 2012-06-01 | 1 | -3/+3
  In each of these cases there's a simple unambiguous correct choice, and no actual bug.

  Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* NFSD: Fix nfs4_verifier memory alignment | Chuck Lever | 2012-03-20 | 1 | -2/+2
  Clean up due to code review.

  The nfs4_verifier's data field is not guaranteed to be u32-aligned. Casting an array of chars to a u32 * is considered generally hazardous.

  We can fix most of this by using a __be32 array to generate the verifier's contents and then byte-copying it into the verifier field.

  However, there is one spot where there is a backwards compatibility constraint: the do_nfsd_create() call expects a verifier which is 32-bit aligned. Fix this spot by forcing the alignment of the create verifier in the nfsd4_open args structure.

  Also, sizeof(nfs4_verifier) is the size of the in-core verifier data structure, but NFS4_VERIFIER_SIZE is the number of octets in an XDR'd verifier. The two are not interchangeable, even if they happen to have the same value.

  Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
  Signed-off-by: J. Bruce Fields <bfields@redhat.com>
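  The "generate in an aligned array, then byte-copy" pattern from this commit message looks roughly like the sketch below. It is a user-space illustration only: `sketch_verifier`, `sketch_to_be32` and `sketch_gen_verifier` are hypothetical names, `uint32_t` stands in for the kernel's `__be32`, and the two input words are arbitrary values chosen for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_VERIFIER_SIZE 8  /* octets in an XDR-encoded verifier */

/* A char array carries no guarantee of 32-bit alignment. */
struct sketch_verifier {
	char data[SKETCH_VERIFIER_SIZE];
};

/* Return v laid out in big-endian byte order, whatever the host order is. */
static uint32_t sketch_to_be32(uint32_t v)
{
	unsigned char b[4] = {
		(unsigned char)(v >> 24), (unsigned char)(v >> 16),
		(unsigned char)(v >> 8), (unsigned char)v
	};
	uint32_t out;

	memcpy(&out, b, sizeof(out));
	return out;
}

/*
 * Build the contents in a properly aligned array of 32-bit words, then
 * byte-copy into the verifier instead of casting verf->data to a
 * uint32_t pointer.
 */
static void sketch_gen_verifier(struct sketch_verifier *verf,
				uint32_t word0, uint32_t word1)
{
	uint32_t words[2];

	words[0] = sketch_to_be32(word0);
	words[1] = sketch_to_be32(word1);
	assert(sizeof(words) == SKETCH_VERIFIER_SIZE);
	memcpy(verf->data, words, sizeof(words));
}
```

  The `memcpy()` into `verf->data` is valid for any alignment of the destination, which is the point of the change; the cast it replaces would only be safe if the char array happened to be 32-bit aligned.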
* nfsd41: implement NFS4_SHARE_WANT_NO_DELEG, NFS4_OPEN_DELEGATE_NONE_EXT, why_no_deleg | Benny Halevy | 2012-02-18 | 1 | -0/+1
  Respect client request for not getting a delegation in NFSv4.1. Appropriately return delegation "type" NFS4_OPEN_DELEGATE_NONE_EXT and WND4_NOT_WANTED reason.

  [nfsd41: add missing break when encoding op_why_no_deleg]
  Signed-off-by: Benny Halevy <bhalevy@tonian.com>
  Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* NFSD: Clean up the test_stateid function | Bryan Schumaker | 2012-02-18 | 1 | -2/+7
  When I initially wrote it, I didn't understand how lists worked so I wrote something that didn't use them. I think making a list of stateids to test is a more straightforward implementation, especially compared to decoding stateids while simultaneously encoding a reply to the client.

  Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
  Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* nfsd41: split out share_access want and signal flags while decoding | Benny Halevy | 2012-02-18 | 1 | -2/+4
  Signed-off-by: Benny Halevy <bhalevy@tonian.com>
  Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* nfsd41: use current stateid by value | Tigran Mkrtchyan | 2012-02-15 | 1 | -2/+11
  Signed-off-by: Tigran Mkrtchyan <kofemann@gmail.com>
  Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* nfsd41: save and restore current stateid with current fh | Tigran Mkrtchyan | 2012-02-15 | 1 | -0/+1
  Signed-off-by: Tigran Mkrtchyan <kofemann@gmail.com>
  Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* nfsd41: handle current stateid in open and close | Tigran Mkrtchyan | 2012-02-15 | 1 | -0/+1
  Signed-off-by: Tigran Mkrtchyan <kofemann@gmail.com>
  Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* nfsd4: rearrange struct nfsd4_slot | J. Bruce Fields | 2012-02-14 | 1 | -1/+2
  Combine two booleans into a single flag field, move the smaller fields to the end.

  (In practice this doesn't make the struct any smaller. But we'll be adding another flag here soon.)

  Remove some debugging code that doesn't look useful, while we're in the neighborhood.

  Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* nfs41: implement DESTROY_CLIENTID operation | Mi Jinlong | 2011-10-24 | 1 | -0/+5
  Implement the DESTROY_CLIENTID operation according to RFC 5661, section 18.50.

  Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com>
  Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* nfsd4: warn on open failure after create | J. Bruce Fields | 2011-10-17 | 1 | -1/+2
  If we create the object and then return failure to the client, we're left with an unexpected file in the filesystem.

  I'm trying to eliminate such cases but am not 100% sure I have, so an assertion might be helpful for now.

  Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* nfsd4: preallocate open stateid in process_open1() | J. Bruce Fields | 2011-10-17 | 1 | -0/+1
  As with the nfs4_file, we'd prefer to find out about any failure before creating a new file rather than after.

  Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* nfsd4: preallocate nfs4_file in process_open1() | J. Bruce Fields | 2011-10-17 | 1 | -0/+1
  Creating a new file is an irrevocable step: once it's visible in the filesystem, other processes may have seen it and done something with it, and unlinking it wouldn't simply undo the effects of the create.

  Therefore, in the case where OPEN creates a new file, we shouldn't do the create until we know that the rest of the OPEN processing will succeed.

  For example, we should preallocate a struct file in case we need it, instead of waiting to allocate it till process_open2(), which is already too late.

  Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* nfsd4: clean up open owners on OPEN failure | J. Bruce Fields | 2011-10-17 | 1 | -0/+1
  If process_open1() creates a new open owner, but the open later fails, the current code will leave the open owner around. It won't be on the close_lru list, and the client isn't expected to send a CLOSE, so it will hang around as long as the client does.

  Similarly, if process_open1() removes an existing open owner from the close lru, anticipating that an open owner that previously had no associated stateid's now will, but the open subsequently fails, then we'll again be left with the same leak.

  Fix both problems.

  Reported-by: Bryan Schumaker <bjschuma@netapp.com>
  Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* nfsd4: move name-length checks to xdr | J. Bruce Fields | 2011-10-11 | 1 | -2/+1
  Again, these checks are better in the xdr code.

  Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* nfsd4: assume test_stateid always has session | J. Bruce Fields | 2011-09-26 | 1 | -1/+0
  Test_stateid is 4.1-only and only allowed after a sequence operation, so this check is unnecessary.

  Cc: Bryan Schumaker <bjschuma@netapp.com>
  Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* nfsd41: try to check reply size before operation | Mi Jinlong | 2011-09-16 | 1 | -0/+1
  To check the size of the reply before calling an operation, we need to try to get the maximum size of the operation's reply.

  v3: using new method as Bruce said,

  "we could handle operations in two different ways:

  - For operations that actually change something (write, rename, open, close, ...), do it the way we're doing it now: be very careful to estimate the size of the response before even processing the operation.
  - For operations that don't change anything (read, getattr, ...) just go ahead and do the operation. If you realize after the fact that the response is too large, then return the error at that point.

  So we'd add another flag to op_flags: say, OP_MODIFIES_SOMETHING. And for operations with OP_MODIFIES_SOMETHING set, we'd do the first thing. For operations without it set, we'd do the second."

  Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com>
  [bfields@redhat.com: crash, don't attempt to handle, undefined op_rsize_bop]
  Signed-off-by: J. Bruce Fields <bfields@redhat.com>
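  The two-way handling quoted in this commit message can be modelled with a flag and two checks. The sketch below assumes nothing about the real nfsd structures: `sketch_op`, `sketch_check_before`, `sketch_check_after` and the `SKETCH_` constants are hypothetical, and only the flag name OP_MODIFIES_SOMETHING is taken from the message.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_OP_MODIFIES_SOMETHING 0x1

enum sketch_status { SKETCH_OK = 0, SKETCH_REP_TOO_BIG = 1 };

/* Hypothetical per-operation description. */
struct sketch_op {
	unsigned int flags;
	size_t max_reply;  /* worst-case estimate of the encoded reply */
};

/*
 * Before executing: refuse a state-changing operation whose worst-case
 * reply might not fit, since its effects could not be undone afterwards.
 */
static enum sketch_status sketch_check_before(const struct sketch_op *op,
					      size_t space_left)
{
	assert(op != NULL);
	if ((op->flags & SKETCH_OP_MODIFIES_SOMETHING) &&
	    op->max_reply > space_left)
		return SKETCH_REP_TOO_BIG;
	return SKETCH_OK;
}

/*
 * After executing: for operations that change nothing, it is enough to
 * compare the actual reply size and fail at that point if it is too big.
 */
static enum sketch_status sketch_check_after(size_t actual_reply,
					     size_t space_left)
{
	return actual_reply > space_left ? SKETCH_REP_TOO_BIG : SKETCH_OK;
}
```

  The asymmetry is deliberate: a read that turns out too large can simply be reported as an error, whereas a write or rename that has already happened cannot be taken back.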
* nfsd4: split stateowners into open and lockowners | J. Bruce Fields | 2011-09-07 | 1 | -1/+1
  The stateowner has some fields that only make sense for openowners, and some that only make sense for lockowners, and I find it a lot clearer if those are separated out.

  Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* nfsd4: eliminate unused lt_stateowner | J. Bruce Fields | 2011-09-01 | 1 | -1/+0
  This is used only as a local variable.

  Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* nfsd4: drop most stateowner refcounting | J. Bruce Fields | 2011-09-01 | 1 | -1/+1
  Maybe we'll bring it back some day, but we don't have much real use for it now.

  Signed-off-by: J. Bruce Fields <bfields@redhat.com>