summaryrefslogtreecommitdiffstats
path: root/fs (follow)
Commit message (Collapse)AuthorAgeFilesLines
...
| * | | | | | | | eCryptfs: Accept one offset parameter in page offset crypto functionsTyler Hicks2013-06-081-26/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is no longer a need to accept different offset values for the source and destination pages when encrypting/decrypting an extent in an eCryptfs page. The two offsets can be collapsed into a single parameter. Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
| * | | | | | | | eCryptfs: Simplify lower file offset calculationTyler Hicks2013-06-081-11/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that lower filesystem IO operations occur for complete PAGE_CACHE_SIZE bytes, the calculation for converting an eCryptfs extent index into a lower file offset can be simplified. Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
| * | | | | | | | eCryptfs: Read/write entire page during page IOTyler Hicks2013-06-081-35/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When reading and writing encrypted pages, perform IO using the entire page all at once rather than 4096 bytes at a time. This only affects architectures where PAGE_CACHE_SIZE is larger than 4096 bytes. Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
| * | | | | | | | eCryptfs: Use entire helper page during page crypto operationsTyler Hicks2013-06-081-13/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When encrypting eCryptfs pages and decrypting pages from the lower filesystem, utilize the entire helper page rather than only the first 4096 bytes. This only affects architectures where PAGE_CACHE_SIZE is larger than 4096 bytes. Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
| * | | | | | | | eCryptfs: Cocci spatch "memdup.spatch"Thomas Meyer2013-06-081-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Thomas Meyer <thomas@m3y3r.de> Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
* | | | | | | | | Merge tag 'jfs-3.11' of git://github.com/kleikamp/linux-shaggyLinus Torvalds2013-07-1112-154/+128
|\ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull jfs update from Dave Kleikamp: "A couple cleanups to JFS for 3.11" * tag 'jfs-3.11' of git://github.com/kleikamp/linux-shaggy: jfs: Update jfs_error jfs: fix sparse warning in fs/jfs/xattr.c
| * | | | | | | | | jfs: Update jfs_errorJoe Perches2013-06-0512-152/+126
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use a more current logging style. Add __printf format and argument verification. Remove embedded function names from formats. Add %pf, __builtin_return_address(0) to jfs_error. Add newlines to formats for kernel style consistency. (One format already had an erroneous newline) Coalesce formats and align arguments. Object size reduced ~1KiB. $ size fs/jfs/built-in.o* text data bss dec hex filename 201891 35488 63936 301315 49903 fs/jfs/built-in.o.new 202821 35488 64192 302501 49da5 fs/jfs/built-in.o.old Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
| * | | | | | | | | jfs: fix sparse warning in fs/jfs/xattr.cDave Kleikamp2013-06-051-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | CHECK fs/jfs/xattr.c fs/jfs/xattr.c:1092:5: warning: symbol 'jfs_initxattrs' was not declared. Should it be static? Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
* | | | | | | | | | Merge branch 'for-3.11' of git://linux-nfs.org/~bfields/linuxLinus Torvalds2013-07-1110-103/+413
|\ \ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull nfsd changes from Bruce Fields: "Changes this time include: - 4.1 enabled on the server by default: the last 4.1-specific issues I know of are fixed, so we're not going to find the rest of the bugs without more exposure. - Experimental support for NFSv4.2 MAC Labeling (to allow running selinux over NFS), from Dave Quigley. - Fixes for some delicate cache/upcall races that could cause rare server hangs; thanks to Neil Brown and Bodo Stroesser for extreme debugging persistence. - Fixes for some bugs found at the recent NFS bakeathon, mostly v4 and v4.1-specific, but also a generic bug handling fragmented rpc calls" * 'for-3.11' of git://linux-nfs.org/~bfields/linux: (31 commits) nfsd4: support minorversion 1 by default nfsd4: allow destroy_session over destroyed session svcrpc: fix failures to handle -1 uid's sunrpc: Don't schedule an upcall on a replaced cache entry. net/sunrpc: xpt_auth_cache should be ignored when expired. sunrpc/cache: ensure items removed from cache do not have pending upcalls. sunrpc/cache: use cache_fresh_unlocked consistently and correctly. sunrpc/cache: remove races with queuing an upcall. nfsd4: return delegation immediately if lease fails nfsd4: do not throw away 4.1 lock state on last unlock nfsd4: delegation-based open reclaims should bypass permissions svcrpc: don't error out on small tcp fragment svcrpc: fix handling of too-short rpc's nfsd4: minor read_buf cleanup nfsd4: fix decoding of compounds across page boundaries nfsd4: clean up nfs4_open_delegation NFSD: Don't give out read delegations on creates nfsd4: allow client to send no cb_sec flavors nfsd4: fail attempts to request gss on the backchannel nfsd4: implement minimal SP4_MACH_CRED ...
| * | | | | | | | | | nfsd4: support minorversion 1 by defaultJ. Bruce Fields2013-07-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We now have minimal minorversion 1 support; turn it on by default. This can still be turned off with "echo -4.1 >/proc/fs/nfsd/versions". Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | | | | | | | | nfsd4: allow destroy_session over destroyed sessionJ. Bruce Fields2013-07-092-10/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | RFC 5661 allows a client to destroy a session using a compound associated with the destroyed session, as long as the DESTROY_SESSION op is the last op of the compound. We attempt to allow this, but testing against a Solaris client (which does destroy sessions in this way) showed that we were failing the DESTROY_SESSION with NFS4ERR_DELAY, because we assumed the reference count on the session (held by us) represented another rpc in progress over this session. Fix this by noting that in this case the expected reference count is 1, not 0. Also, note as long as the session holds a reference to the compound we're destroying, we can't free it here--instead, delay the free till the final put in nfs4svc_encode_compoundres. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | | | | | | | | nfsd4: return delegation immediately if lease failsJ. Bruce Fields2013-07-011-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This case shouldn't happen--the administrator shouldn't really allow other applications access to the export until clients have had the chance to reclaim their state--but if it does then we should set the "return this lease immediately" bit on the reply. That still leaves some small races, but it's the best the protocol allows us to do in the case a lease is ripped out from under us.... Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | | | | | | | | nfsd4: do not throw away 4.1 lock state on last unlockJ. Bruce Fields2013-07-011-8/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit eb2099f31b0f090684a64ef8df44a30ff7c45fc2 "nfsd4: release lockowners on last unlock in 4.1 case". Trond identified language in rfc 5661 section 8.2.4 which forbids this behavior: Stateids associated with byte-range locks are an exception. They remain valid even if a LOCKU frees all remaining locks, so long as the open file with which they are associated remains open, unless the client frees the stateids via the FREE_STATEID operation. And bakeathon 2013 testing found a 4.1 freebsd client was getting an incorrect BAD_STATEID return from a FREE_STATEID in the above situation and then failing. The spec language honestly was probably a mistake but at this point with implementations already following it we're probably stuck with that. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | | | | | | | | nfsd4: delegation-based open reclaims should bypass permissionsJ. Bruce Fields2013-07-011-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We saw a v4.0 client's create fail as follows: - open create succeeds and gets a read delegation - client attempts to set mode on new file, gets DELAY while server recalls delegation. - client attempts a CLAIM_DELEGATE_CUR open using the delegation, gets error because of new file mode. This probably can't happen on a recent kernel since we're no longer giving out delegations on create opens. Nevertheless, it's a bug--reclaim opens should bypass permission checks. Reported-by: Steve Dickson <steved@redhat.com> Reported-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | | | | | | | | nfsd4: minor read_buf cleanupJ. Bruce Fields2013-07-011-10/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The code to step to the next page seems reasonably self-contained. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | | | | | | | | nfsd4: fix decoding of compounds across page boundariesJ. Bruce Fields2013-07-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A freebsd NFSv4.0 client was getting rare IO errors expanding a tarball. A network trace showed the server returning BAD_XDR on the final getattr of a getattr+write+getattr compound. The final getattr started on a page boundary. I believe the Linux client ignores errors on the post-write getattr, and that that's why we haven't seen this before. Cc: stable@vger.kernel.org Reported-by: Rick Macklem <rmacklem@uoguelph.ca> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | | | | | | | | nfsd4: clean up nfs4_open_delegationJ. Bruce Fields2013-07-011-44/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The nfs4_open_delegation logic is unecessarily baroque. Also stop pretending we support write delegations in several places. Some day we will support write delegations, but when that happens adding back in these flag parameters will be the easy part. For now they're just confusing. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | | | | | | | | NFSD: Don't give out read delegations on createsSteve Dickson2013-07-011-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When an exclusive create is done with the mode bits set (aka open(testfile, O_CREAT | O_EXCL, 0777)) this causes a OPEN op followed by a SETATTR op. When a read delegation is given in the OPEN, it causes the SETATTR to delay with EAGAIN until the delegation is recalled. This patch caused exclusive creates to give out a write delegation (which turn into no delegation) which allows the SETATTR seamlessly succeed. Signed-off-by: Steve Dickson <steved@redhat.com> [bfields: do this for any CREATE, not just exclusive; comment] Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | | | | | | | | nfsd4: allow client to send no cb_sec flavorsJ. Bruce Fields2013-07-011-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In testing I notice that some of the pynfs tests forget to send any cb_sec flavors, and that we haven't necessarily errored out in that case before. I'll fix pynfs, but am also inclined to default to trying AUTH_NONE in that case in case this is something clients actually do. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | | | | | | | | nfsd4: fail attempts to request gss on the backchannelJ. Bruce Fields2013-07-011-0/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We don't support gss on the backchannel. We should state that fact up front rather than just letting things continue and later making the client try to figure out why the backchannel isn't working. Trond suggested instead returning NFS4ERR_NOENT. I think it would be tricky for the client to distinguish between the case "I don't support gss on the backchannel" and "I can't find that in my cache, please create another context and try that instead", and I'd prefer something that currently doesn't have any other meaning for this operation, hence the (somewhat arbitrary) NFS4ERR_ENCR_ALG_UNSUPP. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | | | | | | | | nfsd4: implement minimal SP4_MACH_CREDJ. Bruce Fields2013-07-013-12/+87
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Do a minimal SP4_MACH_CRED implementation suggested by Trond, ignoring the client-provided spo_must_* arrays and just enforcing credential checks for the minimum required operations. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | | | | | | | | svcrpc: store gss mech in svc_credJ. Bruce Fields2013-07-011-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Store a pointer to the gss mechanism used in the rq_cred and cl_cred. This will make it easier to enforce SP4_MACH_CRED, which needs to compare the mechanism used on the exchange_id with that used on protected operations. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | | | | | | | | nfsd: avoid undefined signed overflowJim Rees2013-05-211-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In C, signed integer overflow results in undefined behavior, but unsigned overflow wraps around. So do the subtraction first, then cast to signed. Reported-by: Joakim Tjernlund <joakim.tjernlund@transmode.se> Signed-off-by: Jim Rees <rees@umich.edu> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | | | | | | | | nfsd4: fix compile in !CONFIG_NFSD_V4_SECURITY_LABEL caseJ. Bruce Fields2013-05-151-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | | | | | | | | NFSD: Server implementation of MAC LabelingDavid Quigley2013-05-157-10/+207
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implement labeled NFS on the server: encoding and decoding, and writing and reading, of file labels. Enabled with CONFIG_NFSD_V4_SECURITY_LABEL. Signed-off-by: Matthew N. Dodd <Matthew.Dodd@sparta.com> Signed-off-by: Miguel Rodel Felipe <Rodel_FM@dsi.a-star.edu.sg> Signed-off-by: Phua Eu Gene <PHUA_Eu_Gene@dsi.a-star.edu.sg> Signed-off-by: Khin Mi Mi Aung <Mi_Mi_AUNG@dsi.a-star.edu.sg> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | | | | | | | | NFSDv4.2: Add NFS v4.2 support to the NFS serverSteve Dickson2013-05-132-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This enables NFSv4.2 support for the server. To enable this code do the following: echo "+4.2" >/proc/fs/nfsd/versions after the nfsd kernel module is loaded. On its own this does nothing except allow the server to respond to compounds with minorversion set to 2. All the new NFSv4.2 features are optional, so this is perfectly legal. Signed-off-by: Steve Dickson <steved@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | | | | | | | | nfsd4: store correct client minorversion for >=4.2J. Bruce Fields2013-05-131-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This code assumes that any client using exchange_id is using NFSv4.1, but with the introduction of 4.2 that will no longer true. This main effect of this is that client callbacks will use the same minorversion as that used on the exchange_id. Note that clients are forbidden from mixing 4.1 and 4.2 compounds. (See rfc 5661, section 2.7, #13: "A client MUST NOT attempt to use a stateid, filehandle, or similar returned object from the COMPOUND procedure with minor version X for another COMPOUND procedure with minor version Y, where X != Y.") However, we do not currently attempt to enforce this except in the case of mixing zero minor version with non-zero minor versions. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | | | | | | | | nfsd: get rid of the unused functions in vfsZhao Hongjiang2013-05-131-5/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The fh_lock_parent(), nfsd_truncate(), nfsd_notify_change() and nfsd_sync_dir() fuctions are neither implemented nor used, just remove them. Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* | | | | | | | | | | mm: remove free_area_cacheMichel Lespinasse2013-07-112-4/+0
| |_|_|_|_|_|_|_|/ / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since all architectures have been converted to use vm_unmapped_area(), there is no remaining use for the free_area_cache. Signed-off-by: Michel Lespinasse <walken@google.com> Acked-by: Rik van Riel <riel@redhat.com> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David Howells <dhowells@redhat.com> Cc: Helge Deller <deller@gmx.de> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Richard Henderson <rth@twiddle.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | | | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds2013-07-101-5/+57
|\ \ \ \ \ \ \ \ \ \ | |_|_|_|_|/ / / / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking updates from David Miller: "This is a re-do of the net-next pull request for the current merge window. The only difference from the one I made the other day is that this has Eliezer's interface renames and the timeout handling changes made based upon your feedback, as well as a few bug fixes that have trickeled in. Highlights: 1) Low latency device polling, eliminating the cost of interrupt handling and context switches. Allows direct polling of a network device from socket operations, such as recvmsg() and poll(). Currently ixgbe, mlx4, and bnx2x support this feature. Full high level description, performance numbers, and design in commit 0a4db187a999 ("Merge branch 'll_poll'") From Eliezer Tamir. 2) With the routing cache removed, ip_check_mc_rcu() gets exercised more than ever before in the case where we have lots of multicast addresses. Use a hash table instead of a simple linked list, from Eric Dumazet. 3) Add driver for Atheros CQA98xx 802.11ac wireless devices, from Bartosz Markowski, Janusz Dziedzic, Kalle Valo, Marek Kwaczynski, Marek Puzyniak, Michal Kazior, and Sujith Manoharan. 4) Support reporting the TUN device persist flag to userspace, from Pavel Emelyanov. 5) Allow controlling network device VF link state using netlink, from Rony Efraim. 6) Support GRE tunneling in openvswitch, from Pravin B Shelar. 7) Adjust SOCK_MIN_RCVBUF and SOCK_MIN_SNDBUF for modern times, from Daniel Borkmann and Eric Dumazet. 8) Allow controlling of TCP quickack behavior on a per-route basis, from Cong Wang. 9) Several bug fixes and improvements to vxlan from Stephen Hemminger, Pravin B Shelar, and Mike Rapoport. In particular, support receiving on multiple UDP ports. 10) Major cleanups, particular in the area of debugging and cookie lifetime handline, to the SCTP protocol code. From Daniel Borkmann. 11) Allow packets to cross network namespaces when traversing tunnel devices. From Nicolas Dichtel. 12) Allow monitoring netlink traffic via AF_PACKET sockets, in a manner akin to how we monitor real network traffic via ptype_all. From Daniel Borkmann. 13) Several bug fixes and improvements for the new alx device driver, from Johannes Berg. 14) Fix scalability issues in the netem packet scheduler's time queue, by using an rbtree. From Eric Dumazet. 15) Several bug fixes in TCP loss recovery handling, from Yuchung Cheng. 16) Add support for GSO segmentation of MPLS packets, from Simon Horman. 17) Make network notifiers have a real data type for the opaque pointer that's passed into them. Use this to properly handle network device flag changes in arp_netdev_event(). From Jiri Pirko and Timo Teräs. 18) Convert several drivers over to module_pci_driver(), from Peter Huewe. 19) tcp_fixup_rcvbuf() can loop 500 times over loopback, just use a O(1) calculation instead. From Eric Dumazet. 20) Support setting of explicit tunnel peer addresses in ipv6, just like ipv4. From Nicolas Dichtel. 21) Protect x86 BPF JIT against spraying attacks, from Eric Dumazet. 22) Prevent a single high rate flow from overruning an individual cpu during RX packet processing via selective flow shedding. From Willem de Bruijn. 23) Don't use spinlocks in TCP md5 signing fast paths, from Eric Dumazet. 24) Don't just drop GSO packets which are above the TBF scheduler's burst limit, chop them up so they are in-bounds instead. Also from Eric Dumazet. 25) VLAN offloads are missed when configured on top of a bridge, fix from Vlad Yasevich. 26) Support IPV6 in ping sockets. From Lorenzo Colitti. 27) Receive flow steering targets should be updated at poll() time too, from David Majnemer. 28) Fix several corner case regressions in PMTU/redirect handling due to the routing cache removal, from Timo Teräs. 29) We have to be mindful of ipv4 mapped ipv6 sockets in upd_v6_push_pending_frames(). From Hannes Frederic Sowa. 30) Fix L2TP sequence number handling bugs, from James Chapman." * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1214 commits) drivers/net: caif: fix wrong rtnl_is_locked() usage drivers/net: enic: release rtnl_lock on error-path vhost-net: fix use-after-free in vhost_net_flush net: mv643xx_eth: do not use port number as platform device id net: sctp: confirm route during forward progress virtio_net: fix race in RX VQ processing virtio: support unlocked queue poll net/cadence/macb: fix bug/typo in extracting gem_irq_read_clear bit Documentation: Fix references to defunct linux-net@vger.kernel.org net/fs: change busy poll time accounting net: rename low latency sockets functions to busy poll bridge: fix some kernel warning in multicast timer sfc: Fix memory leak when discarding scattered packets sit: fix tunnel update via netlink dt:net:stmmac: Add dt specific phy reset callback support. dt:net:stmmac: Add support to dwmac version 3.610 and 3.710 dt:net:stmmac: Allocate platform data only if its NULL. net:stmmac: fix memleak in the open method ipv6: rt6_check_neigh should successfully verify neigh if no NUD information are available net: ipv6: fix wrong ping_v6_sendmsg return value ...
| * | | | | | | | | net/fs: change busy poll time accountingEliezer Tamir2013-07-091-11/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Suggested by Linus: Changed time accounting for busy-poll: - Make it microsecond based. - Use unsigned longs. - Revert back to use time_after instead of time_in_range. Reorder poll/select busy loop conditions: - Clear busy_flag after one time we can't busy-poll. - Only init busy_end if we actually are going to busy-poll. Added one more missing need_resched() test. Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | | net: rename low latency sockets functions to busy pollEliezer Tamir2013-07-091-23/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rename functions in include/net/ll_poll.h to busy wait. Clarify documentation about expected power use increase. Rename POLL_LL to POLL_BUSY_LOOP. Add need_resched() testing to poll/select busy loops. Note, that in select and poll can_busy_poll is dynamic and is updated continuously to reflect the existence of supported sockets with valid queue information. Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2013-07-036-47/+97
| |\ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: drivers/net/ethernet/freescale/fec_main.c drivers/net/ethernet/renesas/sh_eth.c net/ipv4/gre.c The GRE conflict is between a bug fix (kfree_skb --> kfree_skb_list) and the splitting of the gre.c code into seperate files. The FEC conflict was two sets of changes adding ethtool support code in an "!CONFIG_M5272" CPP protected block. Finally the sh_eth.c conflict was between one commit add bits set in the .eesr_err_check mask whilst another commit removed the .tx_error_check member and assignments. Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | | | net: convert lls to use time_in_range()Eliezer Tamir2013-07-031-4/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Time in range will fail safely if we move to a different cpu with an extremely large clock skew. Add time_in_range64() and convert lls to use it. changelog: v2 - fixed double call to sched_clock in can_poll_ll - fixed checkpatchisms Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | | | net: avoid calling sched_clock when LLS is offEliezer Tamir2013-07-011-4/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change Low Latency Sockets code for select and poll so that when LLS is disabled sched_clock() is never called. Also, avoid sending POLL_LL to sockets if disabled. Reported-by: Andi Kleen <andi@firstfloor.org> Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | | | net: poll/select low latency socket supportEliezer Tamir2013-06-261-5/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | select/poll busy-poll support. Split sysctl value into two separate ones, one for read and one for poll. updated Documentation/sysctl/net.txt Add a new poll flag POLL_LL. When this flag is set, sock_poll will call sk_poll_ll if possible. sock_poll sets this flag in its return value to indicate to select/poll when a socket that can busy poll is found. When poll/select have nothing to report, call the low-level sock_poll again until we are out of time or we find something. Once the system call finds something, it stops setting POLL_LL, so it can return the result to the user ASAP. Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | | | | | | Merge branch 'akpm' (updates from Andrew Morton)Linus Torvalds2013-07-098-89/+119
|\ \ \ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge second patch-bomb from Andrew Morton: - misc fixes - audit stuff - fanotify/inotify/dnotify things - most of the rest of MM. The new cache shrinker code from Glauber and Dave Chinner probably isn't quite stabilized yet. - ptrace - ipc - partitions - reboot cleanups - add LZ4 decompressor, use it for kernel compression * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (118 commits) lib/scatterlist: error handling in __sg_alloc_table() scsi_debug: fix do_device_access() with wrap around range crypto: talitos: use sg_pcopy_to_buffer() lib/scatterlist: introduce sg_pcopy_from_buffer() and sg_pcopy_to_buffer() lib/scatterlist: factor out sg_miter_get_next_page() from sg_miter_next() crypto: add lz4 Cryptographic API lib: add lz4 compressor module arm: add support for LZ4-compressed kernel lib: add support for LZ4-compressed kernel decompressor: add LZ4 decompressor module lib: add weak clz/ctz functions reboot: move arch/x86 reboot= handling to generic kernel reboot: arm: change reboot_mode to use enum reboot_mode reboot: arm: prepare reboot_mode for moving to generic kernel code reboot: arm: remove unused restart_mode fields from some arm subarchs reboot: unicore32: prepare reboot_mode for moving to generic kernel code reboot: x86: prepare reboot_mode for moving to generic kernel code reboot: checkpatch.pl the new kernel/reboot.c file reboot: move shutdown/reboot related functions to kernel/reboot.c reboot: remove -stable friendly PF_THREAD_BOUND define ...
| * | | | | | | | | | | fatfs: add FAT_IOCTL_GET_VOLUME_IDMike Lockwood2013-07-093-0/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch, originally from Android kernel, adds vfat ioctl command FAT_IOCTL_GET_VOLUME_ID, with this command we can get the vfat volume ID using following code: ioctl(fd, FAT_IOCTL_GET_VOLUME_ID, &volume_ID) This patch is a modified version of the patch by Mike Lockwood, with changes from Dmitry Pervushin, who noticed the original patch makes some volume IDs abiguous with error returns: for example, if volume id is 0xFFFFFDAD, that matches -ENOIOCTLCMD, we get "FFFFFFFF" from the user space. So add a parameter to ioctl to get the correct volume ID. Android uses vfat volume ID to identify different sd card, when a new sd card is inserted to device, android can scan the media on it and pop up new contents. Signed-off-by: Bintian Wang <bintian.wang@linaro.org> Cc: dmitry pervushin <dpervushin@gmail.com> Cc: Mike Lockwood <lockwood@android.com> Cc: Colin Cross <ccross@android.com> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: John Stultz <john.stultz@linaro.org> Cc: Sean McNeil <sean@mcneil.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | | | | | | | | | ncpfs: fix error return code in ncp_parse_options()Wei Yongjun2013-07-091-3/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix to return -EINVAL from the option parse error handling case instead of 0, as done elsewhere in this function. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Cc: Petr Vandrovec <petr@vandrovec.name> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | | | | | | | | | mm/writeback: don't check force_wait to handle bdi->work_listWanpeng Li2013-07-091-8/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After commit 839a8e8660b6 ("writeback: replace custom worker pool implementation with unbound workqueue"), bdi_writeback_workfn runs off bdi_writeback->dwork, on each execution, it processes bdi->work_list and reschedules if there are more things to do instead of flush any work that race with us existing. It is unecessary to check force_wait in wb_do_writeback since it is always 0 after the mentioned commit. This patch remove the force_wait in wb_do_writeback. Signed-off-by: Wanpeng Li <liwanp@linux.vnet.ibm.com> Reviewed-by: Tejun Heo <tj@kernel.org> Reviewed-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | | | | | | | | | fs/fs-writeback.c: : make wb_do_writeback() as staticHaicheng Li2013-07-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It's not used globally and could be static. Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com> Cc: Jan Kara <jack@suse.cz> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | | | | | | | | | fsnotify: update comments concerning locking schemeLino Sanfilippo2013-07-091-28/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There have been changes in the locking scheme of fsnotify but the comments in the source code have not been updated yet. This patch corrects this. Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de> Cc: Eric Paris <eparis@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | | | | | | | | | inotify: fix race when adding a new watchLino Sanfilippo2013-07-091-9/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In inotify_new_watch() the number of watches for a group is compared against the max number of allowed watches and increased afterwards. The check and incrementation is not done atomically, so it is possible for multiple concurrent threads to pass the check and increment the number of marks above the allowed max. This patch uses an inotify groups mark_lock to ensure that both check and incrementation are done atomic. Furthermore we dont have to worry about the race that allows a concurrent thread to add a watch just after inotify_update_existing_watch() returned with -ENOENT anymore, since this is also synchronized by the groups mark mutex now. Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de> Cc: Eric Paris <eparis@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | | | | | | | | | dnotify: replace dnotify_mark_mutex with mark mutex of dnotify_groupLino Sanfilippo2013-07-091-12/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is no need to use a special mutex to protect against the fcntl/close race (see dnotify.c for a description of this race). Instead the dnotify_groups mark mutex can be used. Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de> Cc: Eric Paris <eparis@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | | | | | | | | | fanotify: put duplicate code for adding vfsmount/inode marks into an own ↵Lino Sanfilippo2013-07-091-36/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | function The code under the groups mark_mutex in fanotify_add_inode_mark() and fanotify_add_vfsmount_mark() is almost identical. So put it into a seperate function. Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de> Cc: Eric Paris <eparis@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | | | | | | | | | fanotify: fix races when adding/removing marksLino Sanfilippo2013-07-091-12/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For both adding an event to an existing mark and destroying a mark we first have to find it via fsnotify_find_[inode|vfsmount]_mark(). But getting the mark and adding an event (or destroying it) is not done atomically. This opens a race where a thread is about to destroy a mark while another thread still finds the same mark and adds an event to its mask although it will be destroyed. Another race exists concerning the excess of a groups number of marks limit: When a mark is added the number of group marks is checked against the max number of marks per group and increased afterwards. Since check and increment is also not done atomically, this may result in 2 or more processes passing the check at the same time and increasing the number of group marks above the allowed limit. With this patch both races are avoided by doing the concerning operations with the groups mark mutex locked. Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de> Cc: Eric Paris <eparis@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | | | | | | | | | fanotify: info leak in copy_event_to_user()Dan Carpenter2013-07-091-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The ->reserved field isn't cleared so we leak one byte of stack information to userspace. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Eric Paris <eparis@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | | | | | | | Merge branch 'for-linus' of ↵Linus Torvalds2013-07-0910-124/+149
|\ \ \ \ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull Ceph updates from Sage Weil: "There is some follow-on RBD cleanup after the last window's code drop, a series from Yan fixing multi-mds behavior in cephfs, and then a sprinkling of bug fixes all around. Some warnings, sleeping while atomic, a null dereference, and cleanups" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (36 commits) libceph: fix invalid unsigned->signed conversion for timespec encoding libceph: call r_unsafe_callback when unsafe reply is received ceph: fix race between cap issue and revoke ceph: fix cap revoke race ceph: fix pending vmtruncate race ceph: avoid accessing invalid memory libceph: Fix NULL pointer dereference in auth client code ceph: Reconstruct the func ceph_reserve_caps. ceph: Free mdsc if alloc mdsc->mdsmap failed. ceph: remove sb_start/end_write in ceph_aio_write. ceph: avoid meaningless calling ceph_caps_revoking if sync_mode == WB_SYNC_ALL. ceph: fix sleeping function called from invalid context. ceph: move inode to proper flushing list when auth MDS changes rbd: fix a couple warnings ceph: clear migrate seq when MDS restarts ceph: check migrate seq before changing auth cap ceph: fix race between page writeback and truncate ceph: reset iov_len when discarding cap release messages ceph: fix cap release race libceph: fix truncate size calculation ...
| * | | | | | | | | | | | ceph: fix race between cap issue and revokeYan, Zheng2013-07-041-9/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If we receive new caps from the auth MDS and the non-auth MDS is revoking the newly issued caps, we should release the caps from the non-auth MDS. The scenario is filelock's state changes from SYNC to LOCK. Non-auth MDS revokes Fc cap, the client gets Fc cap from the auth MDS at the same time. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Reviewed-by: Sage Weil <sage@inktank.com>
| * | | | | | | | | | | | ceph: fix cap revoke raceYan, Zheng2013-07-041-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If caps are been revoking by the auth MDS, don't consider them as issued even they are still issued by non-auth MDS. The non-auth MDS should also be revoking/exporting these caps, the client just hasn't received the cap revoke/export message. The race I encountered is: When caps are exporting to new MDS, the client receives cap import message and cap revoke message from the new MDS, then receives cap export message from the old MDS. When the client receives cap revoke message from the new MDS, the revoking caps are still issued by the old MDS, so the client does nothing. Later when the cap export message is received, the client removes the caps issued by the old MDS. (Another way to fix the race is calling ceph_check_caps() in handle_cap_export()) Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Reviewed-by: Sage Weil <sage@inktank.com>