linux - linux

	Commit message (Collapse)	Author	Age	Files	Lines
*	Merge tag 'afs-next-20171113' of ↵	Linus Torvalds	2017-11-16	47	-5324/+6769
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull AFS updates from David Howells: "kAFS filesystem driver overhaul. The major points of the overhaul are: (1) Preliminary groundwork is laid for supporting network-namespacing of kAFS. The remainder of the namespacing work requires some way to pass namespace information to submounts triggered by an automount. This requires something like the mount overhaul that's in progress. (2) sockaddr_rxrpc is used in preference to in_addr for holding addresses internally and add support for talking to the YFS VL server. With this, kAFS can do everything over IPv6 as well as IPv4 if it's talking to servers that support it. (3) Callback handling is overhauled to be generally passive rather than active. 'Callbacks' are promises by the server to tell us about data and metadata changes. Callbacks are now checked when we next touch an inode rather than actively going and looking for it where possible. (4) File access permit caching is overhauled to store the caching information per-inode rather than per-directory, shared over subordinate files. Whilst older AFS servers only allow ACLs on directories (shared to the files in that directory), newer AFS servers break that restriction. To improve memory usage and to make it easier to do mass-key removal, permit combinations are cached and shared. (5) Cell database management is overhauled to allow lighter locks to be used and to make cell records autonomous state machines that look after getting their own DNS records and cleaning themselves up, in particular preventing races in acquiring and relinquishing the fscache token for the cell. (6) Volume caching is overhauled. The afs_vlocation record is got rid of to simplify things and the superblock is now keyed on the cell and the numeric volume ID only. The volume record is tied to a superblock and normal superblock management is used to mediate the lifetime of the volume fscache token. (7) File server record caching is overhauled to make server records independent of cells and volumes. A server can be in multiple cells (in such a case, the administrator must make sure that the VL services for all cells correctly reflect the volumes shared between those cells). Server records are now indexed using the UUID of the server rather than the address since a server can have multiple addresses. (8) File server rotation is overhauled to handle VMOVED, VBUSY (and similar), VOFFLINE and VNOVOL indications and to handle rotation both of servers and addresses of those servers. The rotation will also wait and retry if the server says it is busy. (9) Data writeback is overhauled. Each inode no longer stores a list of modified sections tagged with the key that authorised it in favour of noting the modified region of a page in page->private and storing a list of keys that made modifications in the inode. This simplifies things and allows other keys to be used to actually write to the server if a key that made a modification becomes useless. (10) Writable mmap() is implemented. This allows a kernel to be build entirely on AFS. Note that Pre AFS-3.4 servers are no longer supported, though this can be added back if necessary (AFS-3.4 was released in 1998)" * tag 'afs-next-20171113' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: (35 commits) afs: Protect call->state changes against signals afs: Trace page dirty/clean afs: Implement shared-writeable mmap afs: Get rid of the afs_writeback record afs: Introduce a file-private data record afs: Use a dynamic port if 7001 is in use afs: Fix directory read/modify race afs: Trace the sending of pages afs: Trace the initiation and completion of client calls afs: Fix documentation on # vs % prefix in mount source specification afs: Fix total-length calculation for multiple-page send afs: Only progress call state at end of Tx phase from rxrpc callback afs: Make use of the YFS service upgrade to fully support IPv6 afs: Overhaul volume and server record caching and fileserver rotation afs: Move server rotation code into its own file afs: Add an address list concept afs: Overhaul cell database management afs: Overhaul permit caching afs: Overhaul the callback handling afs: Rename struct afs_call server member to cm_server ...
\| *	afs: Protect call->state changes against signals	David Howells	2017-11-13	4	-69/+150
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Protect call->state changes against the call being prematurely terminated due to a signal. What can happen is that a signal causes afs_wait_for_call_to_complete() to abort an afs_call because it's not yet complete whilst afs_deliver_to_call() is delivering data to that call. If the data delivery causes the state to change, this may overwrite the state of the afs_call, making it not-yet-complete again - but no further notifications will be forthcoming from AF_RXRPC as the rxrpc call has been aborted and completed, so kAFS will just hang in various places waiting for that call or on page bits that need clearing by that call. A tracepoint to monitor call state changes is also provided. Signed-off-by: David Howells <dhowells@redhat.com>
\| *	afs: Trace page dirty/clean	David Howells	2017-11-13	3	-13/+70
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add a trace event that logs the dirtying and cleaning of pages attached to AFS inodes. Signed-off-by: David Howells <dhowells@redhat.com>
\| *	afs: Implement shared-writeable mmap	David Howells	2017-11-13	3	-9/+54
\| \| \| \| \| \| \| \| \| \| \| \|	Implement shared-writeable mmap for AFS. Signed-off-by: David Howells <dhowells@redhat.com>
\| *	afs: Get rid of the afs_writeback record	David Howells	2017-11-13	7	-395/+412
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Get rid of the afs_writeback record that kAFS is using to match keys with writes made by that key. Instead, keep a list of keys that have a file open for writing and/or sync'ing and iterate through those. Signed-off-by: David Howells <dhowells@redhat.com>
\| *	afs: Introduce a file-private data record	David Howells	2017-11-13	6	-20/+51
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Introduce a file-private data record for kAFS and put the key into it rather than storing the key in file->private_data. Signed-off-by: David Howells <dhowells@redhat.com>
\| *	afs: Use a dynamic port if 7001 is in use	Marc Dionne	2017-11-13	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It is not required that the afs client operate on port 7001. The port could be in use because another kernel or userspace client has already bound to it. If the port is in use, just fallback to using a dynamic port. Signed-off-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com>
\| *	afs: Fix directory read/modify race	David Howells	2017-11-13	4	-8/+46
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Because parsing of the directory wasn't being done under any sort of lock, the pages holding the directory content can get invalidated whilst the parsing is ongoing. Further, the directory page check function gets called outside of the page lock, so if the page gets cleared or updated, this may return reports of bad magic numbers in the directory page. Also, the directory may change size whilst checking and parsing are ongoing, so more care needs to be taken here. Fix this by: (1) Perform the page check from the page filling function before we set PageUptodate and drop the page lock. (2) Check for the file having shrunk and the page having been abandoned before checking the page contents. (3) Lock the page whilst parsing it for the directory iterator. Whilst we're at it, add a tracepoint to report check failure. Signed-off-by: David Howells <dhowells@redhat.com>
\| *	afs: Trace the sending of pages	David Howells	2017-11-13	2	-0/+64
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add a pair of tracepoints to log the sending of pages for an FS.StoreData or FS.StoreData64 operation. Tracepoint afs_send_pages notes each set of pages added to the operation. There may be several of these per operation as we get up at most 8 contiguous pages in one go because the bvec we're using is on the stack. Tracepoint afs_sent_pages notes the end of adding data from a whole run of pages to the operation and the completion of the request phase. Signed-off-by: David Howells <dhowells@redhat.com>
\| *	afs: Trace the initiation and completion of client calls	David Howells	2017-11-13	5	-20/+233
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add tracepoints to trace the initiation and completion of client calls within the kafs filesystem. The afs_make_vl_call tracepoint watches calls to the volume location database server. The afs_make_fs_call tracepoint watches calls to the file server. The afs_call_done tracepoint watches for call completion. Signed-off-by: David Howells <dhowells@redhat.com>
\| *	afs: Fix documentation on # vs % prefix in mount source specification	David Howells	2017-11-13	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The documentation that describes the #-prefix and the %-prefix used when specifying the source to mount is has the descriptions the wrong way round. Switch them over. Reported-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com>
\| *	afs: Fix total-length calculation for multiple-page send	David Howells	2017-11-13	1	-2/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix the total-length calculation in afs_make_call() when the operation being dispatched has data from a series of pages attached. Despite the patched code looking like that it should reduce mathematically to the current code, it doesn't because the 32-bit unsigned arithmetic being used to calculate the page-offset-difference doesn't correctly extend to a 64-bit value when the result is effectively negative. Without this, some FS.StoreData operations that span multiple pages fail, reporting too little or too much data. Signed-off-by: David Howells <dhowells@redhat.com>
\| *	afs: Only progress call state at end of Tx phase from rxrpc callback	David Howells	2017-11-13	1	-9/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Only progress the AFS call state at the end of Tx phase from the callback passed to rxrpc_kernel_send_data() rather than setting it before the last data send call. Signed-off-by: David Howells <dhowells@redhat.com>
\| *	afs: Make use of the YFS service upgrade to fully support IPv6	David Howells	2017-11-13	6	-10/+428
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	YFS VL servers offer an upgraded Volume Location service that can return IPv6 addresses to fileservers and volume servers in addition to IPv4 addresses using the YFSVL.GetEndpoints operation which we should use if it's available. To this end: (1) Make rxrpc_kernel_recv_data() return the call's current service ID so that the caller can detect service upgrade and see what the service was upgraded to. (2) When we see a VL server address we haven't seen before, send a VL.GetCapabilities operation to it with the service upgrade bit set. If we get an upgrade to the YFS VL service, change the service ID in the address list for that address to use the upgraded service and set a flag to note that this appears to be a YFS-compatible server. (3) If, when a server's addresses are being looked up, we note that we previously detected a YFS-compatible server, then send the YFSVL.GetEndpoints operation rather than VL.GetAddrsU. (4) Build a fileserver address list from the reply of YFSVL.GetEndpoints, including both IPv4 and IPv6 addresses. Volume server addresses are discarded. (5) The address list is sorted by address and port now, instead of just address. This allows multiple servers on the same host sitting on different ports. Signed-off-by: David Howells <dhowells@redhat.com>
\| *	afs: Overhaul volume and server record caching and fileserver rotation	David Howells	2017-11-13	26	-2575/+2795
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The current code assumes that volumes and servers are per-cell and are never shared, but this is not enforced, and, indeed, public cells do exist that are aliases of each other. Further, an organisation can, say, set up a public cell and a private cell with overlapping, but not identical, sets of servers. The difference is purely in the database attached to the VL servers. The current code will malfunction if it sees a server in two cells as it assumes global address -> server record mappings and that each server is in just one cell. Further, each server may have multiple addresses - and may have addresses of different families (IPv4 and IPv6, say). To this end, the following structural changes are made: (1) Server record management is overhauled: (a) Server records are made independent of cell. The namespace keeps track of them, volume records have lists of them and each vnode has a server on which its callback interest currently resides. (b) The cell record no longer keeps a list of servers known to be in that cell. (c) The server records are now kept in a flat list because there's no single address to sort on. (d) Server records are now keyed by their UUID within the namespace. (e) The addresses for a server are obtained with the VL.GetAddrsU rather than with VL.GetEntryByName, using the server's UUID as a parameter. (f) Cached server records are garbage collected after a period of non-use and are counted out of existence before purging is allowed to complete. This protects the work functions against rmmod. (g) The servers list is now in /proc/fs/afs/servers. (2) Volume record management is overhauled: (a) An RCU-replaceable server list is introduced. This tracks both servers and their coresponding callback interests. (b) The superblock is now keyed on cell record and numeric volume ID. (c) The volume record is now tied to the superblock which mounts it, and is activated when mounted and deactivated when unmounted. This makes it easier to handle the cache cookie without causing a double-use in fscache. (d) The volume record is loaded from the VLDB using VL.GetEntryByNameU to get the server UUID list. (e) The volume name is updated if it is seen to have changed when the volume is updated (the update is keyed on the volume ID). (3) The vlocation record is got rid of and VLDB records are no longer cached. Sufficient information is stored in the volume record, though an update to a volume record is now no longer shared between related volumes (volumes come in bundles of three: R/W, R/O and backup). and the following procedural changes are made: (1) The fileserver cursor introduced previously is now fleshed out and used to iterate over fileservers and their addresses. (2) Volume status is checked during iteration, and the server list is replaced if a change is detected. (3) Server status is checked during iteration, and the address list is replaced if a change is detected. (4) The abort code is saved into the address list cursor and -ECONNABORTED returned in afs_make_call() if a remote abort happened rather than translating the abort into an error message. This allows actions to be taken depending on the abort code more easily. (a) If a VMOVED abort is seen then this is handled by rechecking the volume and restarting the iteration. (b) If a VBUSY, VRESTARTING or VSALVAGING abort is seen then this is handled by sleeping for a short period and retrying and/or trying other servers that might serve that volume. A message is also displayed once until the condition has cleared. (c) If a VOFFLINE abort is seen, then this is handled as VBUSY for the moment. (d) If a VNOVOL abort is seen, the volume is rechecked in the VLDB to see if it has been deleted; if not, the fileserver is probably indicating that the volume couldn't be attached and needs salvaging. (e) If statfs() sees one of these aborts, it does not sleep, but rather returns an error, so as not to block the umount program. (5) The fileserver iteration functions in vnode.c are now merged into their callers and more heavily macroised around the cursor. vnode.c is removed. (6) Operations on a particular vnode are serialised on that vnode because the server will lock that vnode whilst it operates on it, so a second op sent will just have to wait. (7) Fileservers are probed with FS.GetCapabilities before being used. This is where service upgrade will be done. (8) A callback interest on a fileserver is set up before an FS operation is performed and passed through to afs_make_call() so that it can be set on the vnode if the operation returns a callback. The callback interest is passed through to afs_iget() also so that it can be set there too. In general, record updating is done on an as-needed basis when we try to access servers, volumes or vnodes rather than offloading it to work items and special threads. Notes: (1) Pre AFS-3.4 servers are no longer supported, though this can be added back if necessary (AFS-3.4 was released in 1998). (2) VBUSY is retried forever for the moment at intervals of 1s. (3) /proc/fs/afs/<cell>/servers no longer exists. Signed-off-by: David Howells <dhowells@redhat.com>
\| *	afs: Move server rotation code into its own file	David Howells	2017-11-13	3	-250/+255
\| \| \| \| \| \| \| \| \| \| \| \|	Move server rotation code into its own file. Signed-off-by: David Howells <dhowells@redhat.com>
\| *	afs: Add an address list concept	David Howells	2017-11-13	12	-561/+844
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add an RCU replaceable address list structure to hold a list of server addresses. The list also holds the To this end: (1) A cell's VL server address list can be loaded directly via insmod or echo to /proc/fs/afs/cells or dynamically from a DNS query for AFSDB or SRV records. (2) Anyone wanting to use a cell's VL server address must wait until the cell record comes online and has tried to obtain some addresses. (3) An FS server's address list, for the moment, has a single entry that is the key to the server list. This will change in the future when a server is instead keyed on its UUID and the VL.GetAddrsU operation is used. (4) An 'address cursor' concept is introduced to handle iteration through the address list. This is passed to the afs_make_call() as, in the future, stuff (such as abort code) that doesn't outlast the call will be returned in it. In the future, we might want to annotate the list with information about how each address fares. We might then want to propagate such annotations over address list replacement. Whilst we're at it, we allow IPv6 addresses to be specified in colon-delimited lists by enclosing them in square brackets. Signed-off-by: David Howells <dhowells@redhat.com>
\| *	afs: Overhaul cell database management	David Howells	2017-11-13	6	-317/+704
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Overhaul the way that the in-kernel AFS client keeps track of cells in the following manner: (1) Cells are now held in an rbtree to make walking them quicker and RCU managed (though this is probably overkill). (2) Cells now have a manager work item that: (A) Looks after fetching and refreshing the VL server list. (B) Manages cell record lifetime, including initialising and destruction. (B) Manages cell record caching whereby threads are kept around for a certain time after last use and then destroyed. (C) Manages the FS-Cache index cookie for a cell. It is not permitted for a cookie to be in use twice, so we have to be careful to not allow a new cell record to exist at the same time as an old record of the same name. (3) Each AFS network namespace is given a manager work item that manages the cells within it, maintaining a single timer to prod cells into updating their DNS records. This uses the reduce_timer() facility to make the timer expire at the soonest timed event that needs happening. (4) When a module is being unloaded, cells and cell managers are now counted out using dec_after_work() to make sure the module text is pinned until after the data structures have been cleaned up. (5) Each cell's VL server list is now protected by a seqlock rather than a semaphore. Signed-off-by: David Howells <dhowells@redhat.com>
\| *	afs: Overhaul permit caching	David Howells	2017-11-13	9	-187/+244
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Overhaul permit caching in AFS by making it per-vnode and sharing permit lists where possible. When most of the fileserver operations are called, they return a status structure indicating the (revised) details of the vnode or vnodes involved in the operation. This includes the access mark derived from the ACL (named CallerAccess in the protocol definition file). This is cacheable and if the ACL changes, the server will tell us that it is breaking the callback promise, at which point we can discard the currently cached permits. With this patch, the afs_permits structure has, at the end, an array of { key, CallerAccess } elements, sorted by key pointer. This is then cached in a hash table so that it can be shared between vnodes with the same access permits. Permit lists can only be shared if they contain the exact same set of key->CallerAccess mappings. Note that that table is global rather than being per-net_ns. If the keys in a permit list cross net_ns boundaries, there is no problem sharing the cached permits, since the permits are just integer masks. Since permit lists pin keys, the permit cache also makes it easier for a future patch to find all occurrences of a key and remove them by means of setting the afs_permits::invalidated flag and then clearing the appropriate key pointer. In such an event, memory barriers will need adding. Lastly, the permit caching is skipped if the server has sent either a vnode-specific or an entire-server callback since the start of the operation. Signed-off-by: David Howells <dhowells@redhat.com>
\| *	afs: Overhaul the callback handling	David Howells	2017-11-13	14	-794/+443
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Overhaul the AFS callback handling by the following means: (1) Don't give up callback promises on vnodes that we are no longer using, rather let them just expire on the server or let the server break them. This is actually more efficient for the server as the callback lookup is expensive if there are lots of extant callbacks. (2) Only give up the callback promises we have from a server when the server record is destroyed. Then we can just give up all the callback promises on it in one go. (3) Servers can end up being shared between cells if cells are aliased, so don't add all the vnodes being backed by a particular server into a big FID-indexed tree on that server as there may be duplicates. Instead have each volume instance (~= superblock) register an interest in a server as it starts to make use of it and use this to allow the processor for callbacks from the server to find the superblock and thence the inode corresponding to the FID being broken by means of ilookup_nowait(). (4) Rather than iterating over the entire callback list when a mass-break comes in from the server, maintain a counter of mass-breaks in afs_server (cb_seq) and make afs_validate() check it against the copy in afs_vnode. It would be nice not to have to take a read_lock whilst doing this, but that's tricky without using RCU. (5) Save a ref on the fileserver we're using for a call in the afs_call struct so that we can access its cb_s_break during call decoding. (6) Write-lock around callback and status storage in a vnode and read-lock around getattr so that we don't see the status mid-update. This has the following consequences: (1) Data invalidation isn't seen until someone calls afs_validate() on a vnode. Unfortunately, we need to use a key to query the server, but getting one from a background thread is tricky without caching loads of keys all over the place. (2) Mass invalidation isn't seen until someone calls afs_validate(). (3) Callback breaking is going to hit the inode_hash_lock quite a bit. Could this be replaced with rcu_read_lock() since inodes are destroyed under RCU conditions. Signed-off-by: David Howells <dhowells@redhat.com>
\| *	afs: Rename struct afs_call server member to cm_server	David Howells	2017-11-13	3	-11/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Rename the server member of struct afs_call to cm_server as we're only going to be using it for incoming calls for the Cache Manager service. This makes it easier to differentiate from the pointer to the target server for the client, which will point to a different structure to allow for callback handling. Signed-off-by: David Howells <dhowells@redhat.com>
\| *	afs: Fix the afs_uuid struct to make the char-sized fields signed	David Howells	2017-11-13	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In AFS's encoding of a UUID, the eight 'char' fields are all signed, so represent them with __s8 rather than __u8. This makes the compiler sign-extend them correctly when XDR-encoding them. Signed-off-by: David Howells <dhowells@redhat.com>
\| *	afs: Connect up the CB.ProbeUuid	David Howells	2017-11-13	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The handler for the CB.ProbeUuid operation in the cache manager is implemented, but isn't listed in the switch-statement of operation selection, so won't be used. Fix this by adding it. Signed-off-by: David Howells <dhowells@redhat.com>
\| *	afs: Potentially return call->reply[0] from afs_make_call()	David Howells	2017-11-13	2	-11/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If call->ret_reply0 is set, return call->reply[0] on success. Change the return type of afs_make_call() to long so that this can be passed back without bit loss and then cast to a pointer if required. Signed-off-by: David Howells <dhowells@redhat.com>
\| *	afs: Condense afs_call's reply{,2,3,4} into an array	David Howells	2017-11-13	5	-77/+74
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Condense struct afs_call's reply anchor members - reply{,2,3,4} - into an array. Signed-off-by: David Howells <dhowells@redhat.com>
\| *	afs: Consolidate abort_to_error translators	David Howells	2017-11-13	6	-77/+36
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The AFS abort code space is shared across all services, so there's no need for separate abort_to_error translators for each service. Consolidate them into a single function and remove the function pointers for them. Signed-off-by: David Howells <dhowells@redhat.com>
\| *	afs: Allow IPv6 address specification of VL servers	David Howells	2017-11-13	5	-26/+36
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Allow VL server specifications to be given IPv6 addresses as well as IPv4 addresses, for example as: echo add foo.org 1111:2222:3333:0:4444:5555:6666:7777 >/proc/fs/afs/cells Note that ':' is the expected separator for separating IPv4 addresses, but if a ',' is detected or no '.' is detected in the string, the delimiter is switched to ','. This also works with DNS AFSDB or SRV record strings fetched by upcall from userspace. Signed-off-by: David Howells <dhowells@redhat.com>
\| *	afs: Keep and pass sockaddr_rxrpc addresses rather than in_addr	David Howells	2017-11-13	10	-130/+85
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Keep and pass sockaddr_rxrpc addresses around rather than keeping and passing in_addr addresses to allow for the use of IPv6 and non-standard port numbers in future. This also allows the port and service_id fields to be removed from the afs_call struct. Signed-off-by: David Howells <dhowells@redhat.com>
\| *	afs: Update the cache index structure	David Howells	2017-11-13	4	-251/+50
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Update the cache index structure in the following ways: (1) Don't use the volume name followed by the volume type as levels in the cache index. Volumes can be renamed. Use the volume ID instead. (2) Don't store the VLDB data for a volume in the tree. If the volume database should be cached locally, then it should be done in a separate tree. (3) Expand the volume ID stored in the cache to 64 bits. (4) Expand the file/vnode ID stored in the cache to 96 bits. (5) Increment the cache structure version number to 1. Signed-off-by: David Howells <dhowells@redhat.com>
\| *	afs: Add some protocol defs	David Howells	2017-11-13	3	-9/+35
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add some protocol definitions, including max field lengths, flag defs, an XDR-encoded UUID def, more VL operation IDs and more fileserver abort codes. Signed-off-by: David Howells <dhowells@redhat.com>
\| *	afs: Push the net ns pointer to more places	David Howells	2017-11-13	11	-56/+56
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Push the network namespace pointer to more places in AFS, including the afs_server structure (which doesn't hold a ref on the netns). In particular, afs_put_cell() now takes requires a net ns parameter so that it can safely alter the netns after decrementing the cell usage count - the cell will be deallocated by a background thread after being cached for a period, which means that it's not safe to access it after reducing its usage count. Signed-off-by: David Howells <dhowells@redhat.com>
\| *	afs: Note the cell in the superblock info also	David Howells	2017-11-13	2	-24/+48
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Keep a reference to the cell in the superblock info structure in addition to the volume and net pointers. This will make it easier to clean up in a future patch in which afs_put_volume() will need the cell pointer. Whilst we're at it, make the cell and volume getting functions return a pointer to the object got to make the call sites look neater. Signed-off-by: David Howells <dhowells@redhat.com>
\| *	afs: Fix server reaping	David Howells	2017-11-13	3	-10/+57
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix server reaping and make sure it's all done before we start trying to purge cells, given that servers currently pin cells. Signed-off-by: David Howells <dhowells@redhat.com>
\| *	afs: Close the rxrpc socket only after purging the servers	David Howells	2017-11-13	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Close the rxrpc socket only after we've purged the server records (and also cell and volume records which might refer to servers) so that we can give up the callbacks on each server. Signed-off-by: David Howells <dhowells@redhat.com>
\| *	afs: Lay the groundwork for supporting network namespaces	David Howells	2017-11-13	16	-492/+603
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Lay the groundwork for supporting network namespaces (netns) to the AFS filesystem by moving various global features to a network-namespace struct (afs_net) and providing an instance of this as a temporary global variable that everything uses via accessor functions for the moment. The following changes have been made: (1) Store the netns in the superblock info. This will be obtained from the mounter's nsproxy on a manual mount and inherited from the parent superblock on an automount. (2) The cell list is made per-netns. It can be viewed through /proc/net/afs/cells and also be modified by writing commands to that file. (3) The local workstation cell is set per-ns in /proc/net/afs/rootcell. This is unset by default. (4) The 'rootcell' module parameter, which sets a cell and VL server list modifies the init net namespace, thereby allowing an AFS root fs to be theoretically used. (5) The volume location lists and the file lock manager are made per-netns. (6) The AF_RXRPC socket and associated I/O bits are made per-ns. The various workqueues remain global for the moment. Changes still to be made: (1) /proc/fs/afs/ should be moved to /proc/net/afs/ and a symlink emplaced from the old name. (2) A per-netns subsys needs to be registered for AFS into which it can store its per-netns data. (3) Rather than the AF_RXRPC socket being opened on module init, it needs to be opened on the creation of a superblock in that netns. (4) The socket needs to be closed when the last superblock using it is destroyed and all outstanding client calls on it have been completed. This prevents a reference loop on the namespace. (5) It is possible that several namespaces will want to use AFS, in which case each one will need its own UDP port. These can either be set through /proc/net/afs/cm_port or the kernel can pick one at random. The init_ns gets 7001 by default. Other issues that need resolving: (1) The DNS keyring needs net-namespacing. (2) Where do upcalls go (eg. DNS request-key upcall)? (3) Need something like open_socket_in_file_ns() syscall so that AFS command line tools attempting to operate on an AFS file/volume have their RPC calls go to the right place. Signed-off-by: David Howells <dhowells@redhat.com>
\| *	Pass mode to wait_on_atomic_t() action funcs and provide default actions	David Howells	2017-11-13	14	-98/+37
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Make wait_on_atomic_t() pass the TASK_* mode onto its action function as an extra argument and make it 'unsigned int throughout. Also, consolidate a bunch of identical action functions into a default function that can do the appropriate thing for the mode. Also, change the argument name in the bit_wait*() function declarations to reflect the fact that it's the mode and not the bit number. [Peter Z gives this a grudging ACK, but thinks that the whole atomic_t wait should be done differently, though he's not immediately sure as to how] Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Peter Zijlstra <peterz@infradead.org> cc: Ingo Molnar <mingo@kernel.org>
\| *	Merge remote-tracking branch 'tip/timers/core' into afs-next	David Howells	2017-11-13	251	-1860/+1810
\| \|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	These AFS patches need the timer_reduce() patch from timers/core. Signed-off-by: David Howells <dhowells@redhat.com>
* \| \	Merge tag 'pinctrl-v4.15-1' of ↵	Linus Torvalds	2017-11-16	96	-3161/+6241
\|\ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull pin control updates from Linus Walleij: "This is the bulk of pin control changes for the v4.15 kernel cycle: Core: - The pin control Kconfig entry PINCTRL is now turned into a menuconfig option. This obviously has the implication of making the subsystem menu visible in menuconfig. This is happening because of two things: (a) Intel have started to deploy and depend on pin controllers in a way that is affecting users directly. This happens on the highly integrated laptop chipsets named after geographical places: baytrail, broxton, cannonlake, cedarfork, cherryview, denverton, geminilake, lewisburg, merrifield, sunrisepoint... It started a while back and now it is ever more evident that this is crucial infrastructure for x86 laptops and not an embedded obscurity anymore. Users need to be aware. (b) Pin control expanders on I2C and SPI that are arch-agnostic. Currently Semtech SX150X and Microchip MCP28x08 but more are expected. Users will have to be able to configure these in directly for their set-up. - Just go and select GPIOLIB now that we made sure that GPIOLIB is a very vanilla subsystem. Do not depend on it, if we need it, select it. - Exposing the pin control subsystem in menuconfig uncovered a bunch of obscure bugs that are now hopefully fixed, all more or less pertaining to Blackfin. - Unified namespace for cross-calls between pin control and GPIO. - New support for clock skew/delay generic DT bindings and generic pin config options for this. - Minor documentation improvements. Various: - The Renesas SH-PFC pin controller has evolved a lot. It seems Renesas are churning out new SoCs by the minute. - A bunch of non-critical fixes for the Rockchip driver. - Improve the use of library functions instead of open coding. - Support the MCP28018 variant in the MCP28x08 driver. - Static constifying" * tag 'pinctrl-v4.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (91 commits) pinctrl: gemini: Fix missing pad descriptions pinctrl: Add some depends on HAS_IOMEM pinctrl: samsung/s3c24xx: add CONFIG_OF dependency pinctrl: gemini: Fix GMAC groups pinctrl: qcom: spmi-gpio: Add pmi8994 gpio support pinctrl: ti-iodelay: remove redundant unused variable dev pinctrl: max77620: Use common error handling code in max77620_pinconf_set() pinctrl: gemini: Implement clock skew/delay config pinctrl: gemini: Use generic DT parser pinctrl: Add skew-delay pin config and bindings pinctrl: armada-37xx: Add edge both type gpio irq support pinctrl: uniphier: remove eMMC hardware reset pin-mux pinctrl: rockchip: Add iomux-route switching support for rk3288 pinctrl: intel: Add Intel Cedar Fork PCH pin controller support pinctrl: intel: Make offset to interrupt status register configurable pinctrl: sunxi: Enforce the strict mode by default pinctrl: sunxi: Disable strict mode for old pinctrl drivers pinctrl: sunxi: Introduce the strict flag pinctrl: sh-pfc: Save/restore registers for PSCI system suspend pinctrl: sh-pfc: r8a7796: Use generic IOCTRL register description ...
\| * \| \|	pinctrl: gemini: Fix missing pad descriptions	Linus Walleij	2017-11-13	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A pretty clever static checker found a bug in my patch: I added more bits to a bitmask but didn't extend the array indexed to the same bitmask. Fixes: 756a024f3983 ("pinctrl: gemini: Fix GMAC groups") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
\| * \| \|	pinctrl: Add some depends on HAS_IOMEM	Linus Walleij	2017-11-13	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some compilation fallout from UM Linux (which does not have IOMEM) makes it necessary to depend on HAS_IOMEM for drivers that doesn't have other factors restricting their selection. Cc: Phil Reid <preid@electromag.com.au> Reviewed-by: Sebastian Reichel <sebastian.reichel@collabora.co.uk> Reported-by: R. Daneel Olivaw <kbuild-all@01.org> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
\| * \| \|	pinctrl: samsung/s3c24xx: add CONFIG_OF dependency	Arnd Bergmann	2017-11-10	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The driver fails to build without CONFIG_OF: drivers/pinctrl/samsung/pinctrl-samsung.c: In function 'samsung_gpiolib_register': drivers/pinctrl/samsung/pinctrl-samsung.c:936:5: error: 'struct gpio_chip' has no member named 'of_node' This configuration is now possible since we can now select the PINCTRL subsystem on S3C24xx machines other than the device tree based ones. Fixes: d219b924611a ("pinctrl: change Kconfig PINCTRL variable to a menuconfig") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
\| * \| \|	Merge branch 'gpio-irqchip-rework' of /home/linus/linux-gpio into devel	Linus Walleij	2017-11-09	88	-516/+3159
\| \|\ \ \
\| * \| \| \|	pinctrl: gemini: Fix GMAC groups	Linus Walleij	2017-11-08	1	-25/+54
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The GMII groups need to be split across GMAC0 and GMAC1 since GMAC0 is always available but GMAC1 masks GPIO2 lines 0-7 so we might want just one interface out. Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
\| * \| \| \|	pinctrl: qcom: spmi-gpio: Add pmi8994 gpio support	Rajendra Nayak	2017-11-08	2	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Update the binding and driver for pmi8994-gpios Signed-off-by: Rajendra Nayak <rnayak@codeaurora.org> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
\| * \| \| \|	pinctrl: ti-iodelay: remove redundant unused variable dev	Colin Ian King	2017-11-08	1	-4/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The pointer dev is being assigned but is never used, hence it is redundant and can be removed. Cleans up clang warnings: drivers/pinctrl/ti/pinctrl-ti-iodelay.c:582:2: warning: Value stored to 'dev' is never read drivers/pinctrl/ti/pinctrl-ti-iodelay.c:701:2: warning: Value stored to 'dev' is never read Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
\| * \| \| \|	pinctrl: max77620: Use common error handling code in max77620_pinconf_set()	Markus Elfring	2017-11-08	1	-10/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Add a jump target so that a specific error message is stored only once at the end of this function implementation. * Replace two calls of the function "dev_err" by goto statements. * Adjust two condition checks. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
\| * \| \| \|	pinctrl: gemini: Implement clock skew/delay config	Linus Walleij	2017-11-08	2	-6/+182
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This enabled pin config on the Gemini driver and implements pin skew/delay so that the ethernet pins clocking can be properly configured. Acked-by: Hans Ulli Kroll <ulli.kroll@googlemail.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
\| * \| \| \|	pinctrl: gemini: Use generic DT parser	Linus Walleij	2017-11-08	2	-62/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We can just use the generic Device Tree parser code in this driver and save some code. Acked-by: Hans Ulli Kroll <ulli.kroll@googlemail.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
\| * \| \| \|	pinctrl: Add skew-delay pin config and bindings	Linus Walleij	2017-11-08	3	-0/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some pin controllers (such as the Gemini) can control the expected clock skew and output delay on certain pins with a sub-nanosecond granularity. This is typically done by shunting in a number of double inverters in front of or behind the pin. Make it possible to configure this with a generic binding. Cc: devicetree@vger.kernel.org Acked-by: Rob Herring <robh@kernel.org> Acked-by: Hans Ulli Kroll <ulli.kroll@googlemail.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
\| * \| \| \|	pinctrl: armada-37xx: Add edge both type gpio irq support	Ken Ma	2017-10-31	1	-0/+64
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Current edge both type gpio irqs which need to swap polarity in each interrupt are not supported, this patch adds edge both type gpio irq support. Signed-off-by: Ken Ma <make@marvell.com> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>