| Commit message (Collapse) | Author | Age | Files | Lines |
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm
Pull dlm updates from David Teigland:
"This set includes a couple minor cleanups, and dropping the
interruptible from a wait_event that waits for an event from the
userspace cluster management"
* tag 'dlm-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm:
dlm: remove BUG() before panic()
dlm: Switch to using wait_event()
fs:dlm:remove unneeded semicolon in rcom.c
dlm: user: Replace zero-length array with flexible-array member
dlm: dlm_internal: Replace zero-length array with flexible-array member
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Building a kernel with clang sometimes fails with an objtool error in dlm:
fs/dlm/lock.o: warning: objtool: revert_lock_pc()+0xbd: can't find jump dest instruction at .text+0xd7fc
The problem is that BUG() never returns and the compiler knows
that anything after it is unreachable, however the panic still
emits some code that does not get fully eliminated.
Having both BUG() and panic() is really pointless as the BUG()
kills the current process and the subsequent panic() never hits.
In most cases, we probably don't really want either and should
replace the DLM_ASSERT() statements with WARN_ON(), as has
been done for some of them.
Remove the BUG() here so the user at least sees the panic message
and we can reliably build randconfig kernels.
Fixes: e7fd41792fc0 ("[DLM] The core of the DLM for GFS2/CLVM")
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: clang-built-linux@googlegroups.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David Teigland <teigland@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
We saw an issue in a production server on a customer deployment where
DLM 4.0.7 gets "stuck" and unable to join new lockspaces.
There is no useful response for the dlm in do_event() if
wait_event_interruptible() is interrupted, so switch to
wait_event().
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Signed-off-by: David Teigland <teigland@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| | |
Fix the following coccicheck warning:
fs/dlm/rcom.c:566:2-3: Unneeded semicolon
Signed-off-by: Wu Bo <wubo40@huawei.com>
Signed-off-by: David Teigland <teigland@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The current codebase makes use of the zero-length array language
extension to the C90 standard, but the preferred mechanism to declare
variable-length types such as these ones is a flexible array member[1][2],
introduced in C99:
struct foo {
int stuff;
struct boo array[];
};
By making use of the mechanism above, we will get a compiler warning
in case the flexible array does not occur last in the structure, which
will help us prevent some kind of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on.
Also, notice that, dynamic memory allocations won't be affected by
this change:
"Flexible array members have incomplete type, and so the sizeof operator
may not be applied. As a quirk of the original implementation of
zero-length arrays, sizeof evaluates to zero."[1]
This issue was found with the help of Coccinelle.
[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David Teigland <teigland@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The current codebase makes use of the zero-length array language
extension to the C90 standard, but the preferred mechanism to declare
variable-length types such as these ones is a flexible array member[1][2],
introduced in C99:
struct foo {
int stuff;
struct boo array[];
};
By making use of the mechanism above, we will get a compiler warning
in case the flexible array does not occur last in the structure, which
will help us prevent some kind of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on.
Also, notice that, dynamic memory allocations won't be affected by
this change:
"Flexible array members have incomplete type, and so the sizeof operator
may not be applied. As a quirk of the original implementation of
zero-length arrays, sizeof evaluates to zero."[1]
This issue was found with the help of Coccinelle.
[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David Teigland <teigland@redhat.com>
|
|\ \
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Pull cifs updates from Steve French:
"22 changesets, 2 for stable.
Includes big performance improvement for large i/o when using
multichannel, also includes DFS fixes"
* tag '5.8-rc-smb3-fixes-part-1' of git://git.samba.org/sfrench/cifs-2.6: (22 commits)
cifs: update internal module version number
cifs: multichannel: try to rebind when reconnecting a channel
cifs: multichannel: use pointer for binding channel
smb3: remove static checker warning
cifs: multichannel: move channel selection above transport layer
cifs: multichannel: always zero struct cifs_io_parms
cifs: dump Security Type info in DebugData
smb3: fix incorrect number of credits when ioctl MaxOutputResponse > 64K
smb3: default to minimum of two channels when multichannel specified
cifs: multichannel: move channel selection in function
cifs: fix minor typos in comments and log messages
smb3: minor update to compression header definitions
cifs: minor fix to two debug messages
cifs: Standardize logging output
smb3: Add new parm "nodelete"
cifs: move some variables off the stack in smb2_ioctl_query_info
cifs: reduce stack use in smb2_compound_op
cifs: get rid of unused parameter in reconn_setup_dfs_targets()
cifs: handle hostnames that resolve to same ip in failover
cifs: set up next DFS target before generic_ip_connect()
...
|
| | |
| | |
| | |
| | |
| | |
| | | |
To 2.27
Signed-off-by: Steve French <stfrench@microsoft.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
first steps in trying to make channels properly reconnect.
* add cifs_ses_find_chan() function to find the enclosing cifs_chan
struct it belongs to
* while we have the session lock and are redoing negprot and
sess.setup in smb2_reconnect() redo the binding of channels.
Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Add a cifs_chan pointer in struct cifs_ses that points to the channel
currently being bound if ses->binding is true.
Previously it was always the channel past the established count.
This will make reconnecting (and rebinding) a channel easier later on.
Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Remove static checker warning pointed out by Dan Carpenter:
The patch feeaec621c09: "cifs: multichannel: move channel selection
above transport layer" from Apr 24, 2020, leads to the following
static checker warning:
fs/cifs/smb2pdu.c:149 smb2_hdr_assemble()
error: we previously assumed 'tcon->ses' could be null (see line 133)
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
CC: Aurelien Aptel <aptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Move the channel (TCP_Server_Info*) selection from the tranport
layer to higher in the call stack so that:
- credit handling is done with the server that will actually be used
to send.
* ->wait_mtu_credit
* ->set_credits / set_credits
* ->add_credits / add_credits
* add_credits_and_wake_if
- potential reconnection (smb2_reconnect) done when initializing a
request is checked and done with the server that will actually be
used to send.
To do this:
- remove the cifs_pick_channel() call out of compound_send_recv()
- select channel and pass it down by adding a cifs_pick_channel(ses)
call in:
- smb311_posix_mkdir
- SMB2_open
- SMB2_ioctl
- __SMB2_close
- query_info
- SMB2_change_notify
- SMB2_flush
- smb2_async_readv (if none provided in context param)
- SMB2_read (if none provided in context param)
- smb2_async_writev (if none provided in context param)
- SMB2_write (if none provided in context param)
- SMB2_query_directory
- send_set_info
- SMB2_oplock_break
- SMB311_posix_qfs_info
- SMB2_QFS_info
- SMB2_QFS_attr
- smb2_lockv
- SMB2_lease_break
- smb2_compound_op
- smb2_set_ea
- smb2_ioctl_query_info
- smb2_query_dir_first
- smb2_query_info_comound
- smb2_query_symlink
- cifs_writepages
- cifs_write_from_iter
- cifs_send_async_read
- cifs_read
- cifs_readpages
- add TCP_Server_Info *server param argument to:
- cifs_send_recv
- compound_send_recv
- SMB2_open_init
- SMB2_query_info_init
- SMB2_set_info_init
- SMB2_close_init
- SMB2_ioctl_init
- smb2_iotcl_req_init
- SMB2_query_directory_init
- SMB2_notify_init
- SMB2_flush_init
- build_qfs_info_req
- smb2_hdr_assemble
- smb2_reconnect
- fill_small_buf
- smb2_plain_req_init
- __smb2_plain_req_init
The read/write codepath is different than the rest as it is using
pages, io iterators and async calls. To deal with those we add a
server pointer in the cifs_writedata/cifs_readdata/cifs_io_parms
context struct and set it in:
- cifs_writepages (wdata)
- cifs_write_from_iter (wdata)
- cifs_readpages (rdata)
- cifs_send_async_read (rdata)
The [rw]data->server pointer is eventually copied to
cifs_io_parms->server to pass it down to SMB2_read/SMB2_write.
If SMB2_read/SMB2_write is called from a different place that doesn't
set the server field it will pick a channel.
Some places do not pick a channel and just use ses->server or
cifs_ses_server(ses). All cifs_ses_server(ses) calls are in codepaths
involving negprot/sess.setup.
- SMB2_negotiate (binding channel)
- SMB2_sess_alloc_buffer (binding channel)
- SMB2_echo (uses provided one)
- SMB2_logoff (uses master)
- SMB2_tdis (uses master)
(list not exhaustive)
Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
SMB2_read/SMB2_write check and use cifs_io_parms->server, which might
be uninitialized memory.
This change makes all callers zero-initialize the struct.
Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Currently the end user is unaware with what sec type the
cifs share is mounted if no sec=<type> option is parsed.
With this patch one can easily check from DebugData.
Example:
1) Name: x.x.x.x Uses: 1 Capability: 0x8001f3fc Session Status: 1 Security type: RawNTLMSSP
Signed-off-by: Kenneth D'souza <kdsouza@redhat.com>
Signed-off-by: Roberto Bergantinos Corpas <rbergant@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Acked-by: Aurelien Aptel <aaptel@suse.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
We were not checking to see if ioctl requests asked for more than
64K (ie when CIFSMaxBufSize was > 64K) so when setting larger
CIFSMaxBufSize then ioctls would fail with invalid parameter errors.
When requests ask for more than 64K in MaxOutputResponse then we
need to ask for more than 1 credit.
Signed-off-by: Steve French <stfrench@microsoft.com>
CC: Stable <stable@vger.kernel.org>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
When "multichannel" is specified on mount, make sure to default to
at least two channels.
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This commit moves channel picking code in separate function.
Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Fix four minor typos in comments and log messages
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
MS-SMB2 specification was updated in March. Make minor additions
and corrections to compression related definitions in smb2pdu.h
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Joe Perches pointed out that we were missing a newline
at the end of two debug messages
Reported-by: Joe Perches <joe@perches.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Use pr_fmt to standardize all logging for fs/cifs.
Some logging output had no CIFS: specific prefix.
Now all output has one of three prefixes:
o CIFS:
o CIFS: VFS:
o Root-CIFS:
Miscellanea:
o Convert printks to pr_<level>
o Neaten macro definitions
o Remove embedded CIFS: prefixes from formats
o Convert "illegal" to "invalid"
o Coalesce formats
o Add missing '\n' format terminations
o Consolidate multiple cifs_dbg continuations into single calls
o More consistent use of upper case first word output logging
o Multiline statement argument alignment and wrapping
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
In order to handle workloads where it is important to make sure that
a buggy app did not delete content on the drive, the new mount option
"nodelete" allows standard permission checks on the server to work,
but prevents on the client any attempts to unlink a file or delete
a directory on that mount point. This can be helpful when running
a little understood app on a network mount that contains important
content that should not be deleted.
Signed-off-by: Steve French <stfrench@microsoft.com>
CC: Stable <stable@vger.kernel.org>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Move some large data structures off the stack and into dynamically
allocated memory in the function smb2_ioctl_query_info
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Move a lot of structures and arrays off the stack and into a dynamically
allocated structure instead.
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The target iterator parameter "it" is not used in
reconn_setup_dfs_targets(), so just remove it.
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
In order to support reconnect to hostnames that resolve to same ip
address, besides relying on the currently set hostname to match DFS
targets, attempt to resolve the targets and then match their addresses
with the reconnected server ip address.
For instance, if we have two hostnames "FOO" and "BAR", and both
resolve to the same ip address, we would be able to handle failover in
DFS paths like
\\FOO\dfs\link1 -> [ \BAZ\share2 (*), \BAR\share1 ]
\\FOO\dfs\link2 -> [ \BAZ\share2 (*), \FOO\share1 ]
so when "BAZ" is no longer accessible, link1 and link2 would get
reconnected despite having different target hostnames.
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
If we mount a very specific DFS link
\\FS0.FOO.COM\dfs\link -> \FS0\share1, \FS1\share2
where its target list contains NB names ("FS0" & "FS1") rather than
FQDN ones ("FS0.FOO.COM" & "FS1.FOO.COM"), we end up connecting to
\FOO\share1 but server->hostname will have "FOO.COM". The reason is
because both "FS0" and "FS0.FOO.COM" resolve to same IP address and
they share same TCP server connection, but "FS0.FOO.COM" was the first
hostname set -- which is OK.
However, if the echo thread timeouts and we still have a good
connection to "FS0", in cifs_reconnect()
rc = generic_ip_connect(server) -> success
if (rc) {
...
reconn_inval_dfs_target(server, cifs_sb, &tgt_list,
&tgt_it);
...
}
...
it successfully reconnects to "FS0" server but does not set up next
DFS target - which should be the same target server "\FS0\share1" -
and server->hostname remains set to "FS0.FOO.COM" rather than "FS0",
as reconn_inval_dfs_target() would have it set to "FS0" if called
earlier.
Finally, in __smb2_reconnect(), the reconnect of tcons would fail
because tcon->ses->server->hostname (FS0.FOO.COM) does not match DFS
target's hostname (FS0).
Fix that by calling reconn_inval_dfs_target() before
generic_ip_connect() so server->hostname will get updated correctly
prior to reconnecting its tcons in __smb2_reconnect().
With "cifs: handle hostnames that resolve to same ip in failover"
patch
- The above problem would not occur.
- We could save an DNS query to find out that they both resolve to
the same ip address.
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The variable rc is being initialized with a value that is never read
and it is being updated later with a new value. The initialization is
redundant and can be removed.
Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The "nolease" mount option is only supported for SMB2+ mounts.
Fail with appropriate error message if vers=1.0 option is passed.
Signed-off-by: Kenneth D'souza <kdsouza@redhat.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|\ \ \
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
Pull AFS updates from David Howells:
"There's some core VFS changes which affect a couple of filesystems:
- Make the inode hash table RCU safe and providing some RCU-safe
accessor functions. The search can then be done without taking the
inode_hash_lock. Care must be taken because the object may be being
deleted and no wait is made.
- Allow iunique() to avoid taking the inode_hash_lock.
- Allow AFS's callback processing to avoid taking the inode_hash_lock
when using the inode table to find an inode to notify.
- Improve Ext4's time updating. Konstantin Khlebnikov said "For now,
I've plugged this issue with try-lock in ext4 lazy time update.
This solution is much better."
Then there's a set of changes to make a number of improvements to the
AFS driver:
- Improve callback (ie. third party change notification) processing
by:
(a) Relying more on the fact we're doing this under RCU and by
using fewer locks. This makes use of the RCU-based inode
searching outlined above.
(b) Moving to keeping volumes in a tree indexed by volume ID
rather than a flat list.
(c) Making the server and volume records logically part of the
cell. This means that a server record now points directly at
the cell and the tree of volumes is there. This removes an N:M
mapping table, simplifying things.
- Improve keeping NAT or firewall channels open for the server
callbacks to reach the client by actively polling the fileserver on
a timed basis, instead of only doing it when we have an operation
to process.
- Improving detection of delayed or lost callbacks by including the
parent directory in the list of file IDs to be queried when doing a
bulk status fetch from lookup. We can then check to see if our copy
of the directory has changed under us without us getting notified.
- Determine aliasing of cells (such as a cell that is pointed to be a
DNS alias). This allows us to avoid having ambiguity due to
apparently different cells using the same volume and file servers.
- Improve the fileserver rotation to do more probing when it detects
that all of the addresses to a server are listed as non-responsive.
It's possible that an address that previously stopped responding
has become responsive again.
Beyond that, lay some foundations for making some calls asynchronous:
- Turn the fileserver cursor struct into a general operation struct
and hang the parameters off of that rather than keeping them in
local variables and hang results off of that rather than the call
struct.
- Implement some general operation handling code and simplify the
callers of operations that affect a volume or a volume component
(such as a file). Most of the operation is now done by core code.
- Operations are supplied with a table of operations to issue
different variants of RPCs and to manage the completion, where all
the required data is held in the operation object, thereby allowing
these to be called from a workqueue.
- Put the standard "if (begin), while(select), call op, end" sequence
into a canned function that just emulates the current behaviour for
now.
There are also some fixes interspersed:
- Don't let the EACCES from ICMP6 mapping reach the user as such,
since it's confusing as to whether it's a filesystem error. Convert
it to EHOSTUNREACH.
- Don't use the epoch value acquired through probing a server. If we
have two servers with the same UUID but in different cells, it's
hard to draw conclusions from them having different epoch values.
- Don't interpret the argument to the CB.ProbeUuid RPC as a
fileserver UUID and look up a fileserver from it.
- Deal with servers in different cells having the same UUIDs. In the
event that a CB.InitCallBackState3 RPC is received, we have to
break the callback promises for every server record matching that
UUID.
- Don't let afs_statfs return values that go below 0.
- Don't use running fileserver probe state to make server selection
and address selection decisions on. Only make decisions on final
state as the running state is cleared at the start of probing"
Acked-by: Al Viro <viro@zeniv.linux.org.uk> (fs/inode.c part)
* tag 'afs-next-20200604' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: (27 commits)
afs: Adjust the fileserver rotation algorithm to reprobe/retry more quickly
afs: Show more a bit more server state in /proc/net/afs/servers
afs: Don't use probe running state to make decisions outside probe code
afs: Fix afs_statfs() to not let the values go below zero
afs: Fix the by-UUID server tree to allow servers with the same UUID
afs: Reorganise volume and server trees to be rooted on the cell
afs: Add a tracepoint to track the lifetime of the afs_volume struct
afs: Detect cell aliases 3 - YFS Cells with a canonical cell name op
afs: Detect cell aliases 2 - Cells with no root volumes
afs: Detect cell aliases 1 - Cells with root volumes
afs: Implement client support for the YFSVL.GetCellName RPC op
afs: Retain more of the VLDB record for alias detection
afs: Fix handling of CB.ProbeUuid cache manager op
afs: Don't get epoch from a server because it may be ambiguous
afs: Build an abstraction around an "operation" concept
afs: Rename struct afs_fs_cursor to afs_operation
afs: Remove the error argument from afs_protocol_error()
afs: Set error flag rather than return error from file status decode
afs: Make callback processing more efficient.
afs: Show more information in /proc/net/afs/servers
...
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Adjust the fileserver rotation algorithm so that if we've tried all the
addresses on a server (cumulatively over multiple operations) until we've
run out of untried addresses, immediately reprobe all that server's
interfaces and retry the op at least once before we move onto the next
server.
Signed-off-by: David Howells <dhowells@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Display more information about the state of a server record, including the
flags, rtt and break counter plus the probe state for each server in
/proc/net/afs/servers.
Rearrange the server flags a bit to make them easier to read at a glance in
the proc file.
Signed-off-by: David Howells <dhowells@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Don't use the running state for fileserver probes to make decisions about
which server to use as the state is cleared at the start of a probe and
also intermediate values might be misleading.
Instead, add a separate 'latest known' rtt in the afs_server struct and a
flag to indicate if the server is known to be responding and update these
as and when we know what to change them to.
Signed-off-by: David Howells <dhowells@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Fix afs_statfs() so that the value for f_bavail and f_bfree don't go
"negative" if the number of blocks in use by a volume exceeds the max quota
for that volume.
Signed-off-by: David Howells <dhowells@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Whilst it shouldn't happen, it is possible for multiple fileservers to
share a UUID, particularly if an entire cell has been duplicated, UUIDs and
all. In such a case, it's not necessarily possible to map the effect of
the CB.InitCallBackState3 incoming RPC to a specific server unambiguously
by UUID and thus to a specific cell.
Indeed, there's a problem whereby multiple server records may need to
occupy the same spot in the rb_tree rooted in the afs_net struct.
Fix this by allowing servers to form a list, with the head of the list in
the tree. When the front entry in the list is removed, the second in the
list just replaces it. afs_init_callback_state() then just goes down the
line, poking each server in the list.
This means that some servers will be unnecessarily poked, unfortunately.
An alternative would be to route by call parameters.
Reported-by: Jeffrey Altman <jaltman@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Fixes: d2ddc776a458 ("afs: Overhaul volume and server record caching and fileserver rotation")
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Reorganise afs_volume objects such that they're in a tree keyed on volume
ID, rooted at on an afs_cell object rather than being in multiple trees,
each of which is rooted on an afs_server object.
afs_server structs become per-cell and acquire a pointer to the cell.
The process of breaking a callback then starts with finding the server by
its network address, following that to the cell and then looking up each
volume ID in the volume tree.
This is simpler than the afs_vol_interest/afs_cb_interest N:M mapping web
and allows those structs and the code for maintaining them to be simplified
or removed.
It does make a couple of things a bit more tricky, though:
(1) Operations now start with a volume, not a server, so there can be more
than one answer as to whether or not the server we'll end up using
supports the FS.InlineBulkStatus RPC.
(2) CB RPC operations that specify the server UUID. There's still a tree
of servers by UUID on the afs_net struct, but the UUIDs in it aren't
guaranteed unique.
Signed-off-by: David Howells <dhowells@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Add a tracepoint to track the lifetime of the afs_volume struct.
Signed-off-by: David Howells <dhowells@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
YFS Volume Location servers have an operation by which the cell name may be
queried. Use this to find out what a YFS server thinks the canonical cell
name should be.
Signed-off-by: David Howells <dhowells@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Implement the second phase of cell alias detection. This part handles
alias detection for cells that don't have root.cell volumes and so we have
to find some other volume or fileserver to query.
We take the first volume from each such cell and attempt to look it up in
the new cell. If found, we compare the records, if they are the same, we
judge the cell names to be aliases.
Signed-off-by: David Howells <dhowells@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Put in the first phase of cell alias detection. This part handles alias
detection for cells that have root.cell volumes (which is expected to be
likely).
When a cell becomes newly active, it is probed for its root.cell volume,
and if it has one, this volume is compared against other root.cell volumes
to find out if the list of fileserver UUIDs have any in common - and if
that's the case, do the address lists of those fileservers have any
addresses in common. If they do, the new cell is adjudged to be an alias
of the old cell and the old cell is used instead.
Comparing is aided by the server list in struct afs_server_list being
sorted in UUID order and the addresses in the fileserver address lists
being sorted in address order.
The cell then retains the afs_volume object for the root.cell volume, even
if it's not mounted for future alias checking.
This necessary because:
(1) Whilst fileservers have UUIDs that are meant to be globally unique, in
practice they are not because cells get cloned without changing the
UUIDs - so afs_server records need to be per cell.
(2) Sometimes the DNS is used to make cell aliases - but if we don't know
they're the same, we may end up with multiple superblocks and multiple
afs_server records for the same thing, impairing our ability to
deliver callback notifications of third party changes
(3) The fileserver RPC API doesn't contain the cell name, so it can't tell
us which cell it's notifying and can't see that a change made to to
one cell should notify the same client that's also accessed as the
other cell.
Reported-by: Jeffrey Altman <jaltman@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Implement client support for the YFSVL.GetCellName RPC operation by which
YFS permits the canonical cell name to be queried from a VL server.
Signed-off-by: David Howells <dhowells@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Save more bits from the volume location database record obtained for a
server so that we can use this information in cell alias detection.
Signed-off-by: David Howells <dhowells@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The AFS filesystem driver is handling the CB.ProbeUuid request incorrectly.
The UUID presented in the request is that of the cache manager, not the
fileserver, so afs_deliver_cb_probe_uuid() shouldn't be using that UUID to
look up the server.
Fix this by looking up the server by address instead.
Signed-off-by: David Howells <dhowells@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Don't get the epoch from a server, particularly one that we're looking up
by UUID, as UUIDs may be ambiguous and may map to more than one server - so
we can't draw any conclusions from it.
Reported-by: Jeffrey Altman <jaltman@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Turn the afs_operation struct into the main way that most fileserver
operations are managed. Various things are added to the struct, including
the following:
(1) All the parameters and results of the relevant operations are moved
into it, removing corresponding fields from the afs_call struct.
afs_call gets a pointer to the op.
(2) The target volume is made the main focus of the operation, rather than
the target vnode(s), and a bunch of op->vnode->volume are made
op->volume instead.
(3) Two vnode records are defined (op->file[]) for the vnode(s) involved
in most operations. The vnode record (struct afs_vnode_param)
contains:
- The vnode pointer.
- The fid of the vnode to be included in the parameters or that was
returned in the reply (eg. FS.MakeDir).
- The status and callback information that may be returned in the
reply about the vnode.
- Callback break and data version tracking for detecting
simultaneous third-parth changes.
(4) Pointers to dentries to be updated with new inodes.
(5) An operations table pointer. The table includes pointers to functions
for issuing AFS and YFS-variant RPCs, handling the success and abort
of an operation and handling post-I/O-lock local editing of a
directory.
To make this work, the following function restructuring is made:
(A) The rotation loop that issues calls to fileservers that can be found
in each function that wants to issue an RPC (such as afs_mkdir()) is
extracted out into common code, in a new file called fs_operation.c.
(B) The rotation loops, such as the one in afs_mkdir(), are replaced with
a much smaller piece of code that allocates an operation, sets the
parameters and then calls out to the common code to do the actual
work.
(C) The code for handling the success and failure of an operation are
moved into operation functions (as (5) above) and these are called
from the core code at appropriate times.
(D) The pseudo inode getting stuff used by the dynamic root code is moved
over into dynroot.c.
(E) struct afs_iget_data is absorbed into the operation struct and
afs_iget() expects to be given an op pointer and a vnode record.
(F) Point (E) doesn't work for the root dir of a volume, but we know the
FID in advance (it's always vnode 1, unique 1), so a separate inode
getter, afs_root_iget(), is provided to special-case that.
(G) The inode status init/update functions now also take an op and a vnode
record.
(H) The RPC marshalling functions now, for the most part, just take an
afs_operation struct as their only argument. All the data they need
is held there. The result delivery functions write their answers
there as well.
(I) The call is attached to the operation and then the operation core does
the waiting.
And then the new operation code is, for the moment, made to just initialise
the operation, get the appropriate vnode I/O locks and do the same rotation
loop as before.
This lays the foundation for the following changes in the future:
(*) Overhauling the rotation (again).
(*) Support for asynchronous I/O, where the fileserver rotation must be
done asynchronously also.
Signed-off-by: David Howells <dhowells@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
As a prelude to implementing asynchronous fileserver operations in the afs
filesystem, rename struct afs_fs_cursor to afs_operation.
This struct is going to form the core of the operation management and is
going to acquire more members in later.
Signed-off-by: David Howells <dhowells@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Remove the error argument from afs_protocol_error() as it's always
-EBADMSG.
Signed-off-by: David Howells <dhowells@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Set a flag in the call struct to indicate an unmarshalling error rather
than return and handle an error from the decoding of file statuses. This
flag is checked on a successful return from the delivery function.
Signed-off-by: David Howells <dhowells@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
afs_vol_interest objects represent the volume IDs currently being accessed
from a fileserver. These hold lists of afs_cb_interest objects that
repesent the superblocks using that volume ID on that server.
When a callback notification from the server telling of a modification by
another client arrives, the volume ID specified in the notification is
looked up in the server's afs_vol_interest list. Through the
afs_cb_interest list, the relevant superblocks can be iterated over and the
specific inode looked up and marked in each one.
Make the following efficiency improvements:
(1) Hold rcu_read_lock() over the entire processing rather than locking it
each time.
(2) Do all the callbacks for each vid together rather than individually.
Each volume then only needs to be looked up once.
(3) afs_vol_interest objects are now stored in an rb_tree rather than a
flat list to reduce the lookup step count.
(4) afs_vol_interest lookup is now done with RCU, but because it's in an
rb_tree which may rotate under us, a seqlock is used so that if it
changes during the walk, we repeat the walk with a lock held.
With this and the preceding patch which adds RCU-based lookups in the inode
cache, target volumes/vnodes can be taken without the need to take any
locks, except on the target itself.
Signed-off-by: David Howells <dhowells@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Show more information in /proc/net/afs/servers to make it easier to see
what's going on with the server probing.
Signed-off-by: David Howells <dhowells@redhat.com>
|