summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'nfsd-5.13-1' of ↵Linus Torvalds2021-05-058-191/+280
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull more nfsd updates from Chuck Lever: "Additional fixes and clean-ups for NFSD since tags/nfsd-5.13, including a fix to grant read delegations for files open for writing" * tag 'nfsd-5.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: SUNRPC: Fix null pointer dereference in svc_rqst_free() SUNRPC: fix ternary sign expansion bug in tracing nfsd: Fix fall-through warnings for Clang nfsd: grant read delegations to clients holding writes nfsd: reshuffle some code nfsd: track filehandle aliasing in nfs4_files nfsd: hash nfs4_files by inode number nfsd: ensure new clients break delegations nfsd: removed unused argument in nfsd_startup_generic() nfsd: remove unused function svcrdma: Pass a useful error code to the send_err tracepoint svcrdma: Rename goto labels in svc_rdma_sendto() svcrdma: Don't leak send_ctxt on Send errors
| * SUNRPC: Fix null pointer dereference in svc_rqst_free()Yunjian Wang2021-04-231-1/+2
| | | | | | | | | | | | | | | | | | | | | | When alloc_pages_node() returns null in svc_rqst_alloc(), the null rq_scratch_page pointer will be dereferenced when calling put_page() in svc_rqst_free(). Fix it by adding a null check. Addresses-Coverity: ("Dereference after null check") Fixes: 5191955d6fc6 ("SUNRPC: Prepare for xdr_stream-style decoding on the server-side") Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
| * SUNRPC: fix ternary sign expansion bug in tracingDan Carpenter2021-04-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This code is supposed to pass negative "err" values for tracing but it passes positive values instead. The problem is that the trace_svcsock_tcp_send() function takes a long but "err" is an int and "sent" is a u32. The negative is first type promoted to u32 so it becomes a high positive then it is promoted to long and it stays positive. Fix this by casting "err" directly to long. Fixes: 998024dee197 ("SUNRPC: Add more svcsock tracepoints") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
| * nfsd: Fix fall-through warnings for ClangGustavo A. R. Silva2021-04-202-0/+2
| | | | | | | | | | | | | | | | | | | | In preparation to enable -Wimplicit-fallthrough for Clang, fix multiple warnings by explicitly adding a couple of break statements instead of just letting the code fall through to the next case. Link: https://github.com/KSPP/linux/issues/115 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
| * nfsd: grant read delegations to clients holding writesJ. Bruce Fields2021-04-192-14/+71
| | | | | | | | | | | | | | | | | | | | | | | | | | It's OK to grant a read delegation to a client that holds a write, as long as it's the only client holding the write. We originally tried to do this in commit 94415b06eb8a ("nfsd4: a client's own opens needn't prevent delegations"), which had to be reverted in commit 6ee65a773096 ("Revert "nfsd4: a client's own opens needn't prevent delegations""). Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
| * nfsd: reshuffle some codeJ. Bruce Fields2021-04-191-117/+118
| | | | | | | | | | | | | | | | | | | | | | No change in behavior, I'm just moving some code around to avoid forward references in a following patch. (To do someday: figure out how to split up nfs4state.c. It's big and disorganized.) Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
| * nfsd: track filehandle aliasing in nfs4_filesJ. Bruce Fields2021-04-192-9/+30
| | | | | | | | | | | | | | | | | | | | | | | | It's unusual but possible for multiple filehandles to point to the same file. In that case, we may end up with multiple nfs4_files referencing the same inode. For delegation purposes it will turn out to be useful to flag those cases. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
| * nfsd: hash nfs4_files by inode numberJ. Bruce Fields2021-04-192-16/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The nfs4_file structure is per-filehandle, not per-inode, because the spec requires open and other state to be per filehandle. But it will turn out to be convenient for nfs4_files associated with the same inode to be hashed to the same bucket, so let's hash on the inode instead of the filehandle. Filehandle aliasing is rare, so that shouldn't have much performance impact. (If you have a ton of exported filesystems, though, and all of them have a root with inode number 2, could that get you an overlong hash chain? Perhaps this (and the v4 open file cache) should be hashed on the inode pointer instead.) Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
| * nfsd: ensure new clients break delegationsJ. Bruce Fields2021-04-161-5/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If nfsd already has an open file that it plans to use for IO from another, it may not need to do another vfs open, but it still may need to break any delegations in case the existing opens are for another client. Symptoms are that we may incorrectly fail to break a delegation on a write open from a different client, when the delegation-holding client already has a write open. Fixes: 28df3d1539de ("nfsd: clients don't need to break their own delegations") Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
| * nfsd: removed unused argument in nfsd_startup_generic()Vasily Averin2021-04-151-4/+4
| | | | | | | | | | | | | | | | Since commit 501cb1849f86 ("nfsd: rip out the raparms cache") nrservs is not used in nfsd_startup_generic() Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
| * nfsd: remove unused functionJiapeng Chong2021-04-151-9/+0
| | | | | | | | | | | | | | | | | | | | | | Fix the following clang warning: fs/nfsd/nfs4state.c:6276:1: warning: unused function 'end_offset' [-Wunused-function]. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
| * svcrdma: Pass a useful error code to the send_err tracepointChuck Lever2021-04-141-3/+9
| | | | | | | | | | | | | | Capture error codes in @ret, which is passed to the send_err tracepoint, so that they can be logged when something goes awry. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
| * svcrdma: Rename goto labels in svc_rdma_sendto()Chuck Lever2021-04-141-12/+12
| | | | | | | | | | | | | | Clean up: Make the goto labels consistent with other similar functions. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
| * svcrdma: Don't leak send_ctxt on Send errorsChuck Lever2021-04-141-4/+4
| | | | | | | | | | | | Address a rare send_ctxt leak in the svc_rdma_sendto() error paths. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* | Merge tag '5.13-rc-smb3-part2' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds2021-05-0514-33/+447
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull cifs updates from Steve French: "Ten CIFS/SMB3 changes - including two marked for stable - including some important multichannel fixes, as well as support for handle leases (deferred close) and shutdown support: - some important multichannel fixes - support for handle leases (deferred close) - shutdown support (which is also helpful since it enables multiple xfstests) - enable negotiating stronger encryption by default (GCM256) - improve wireshark debugging by allowing more options for root to dump decryption keys SambaXP and the SMB3 Plugfest test event are going on now so I am expecting more patches over the next few days due to extra testing (including more multichannel fixes)" * tag '5.13-rc-smb3-part2' of git://git.samba.org/sfrench/cifs-2.6: fs/cifs: Fix resource leak Cifs: Fix kernel oops caused by deferred close for files. cifs: fix regression when mounting shares with prefix paths cifs: use echo_interval even when connection not ready. cifs: detect dead connections only when echoes are enabled. smb3.1.1: allow dumping keys for multiuser mounts smb3.1.1: allow dumping GCM256 keys to improve debugging of encrypted shares cifs: add shutdown support cifs: Deferred close for files smb3.1.1: enable negotiating stronger encryption by default
| * | fs/cifs: Fix resource leakKhaled ROMDHANI2021-05-041-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The -EIO error return path is leaking memory allocated to page. Fix this by moving the allocation block after the check of cifs_forced_shutdown. Addresses-Coverity: ("Resource leak") Fixes: 087f757b0129 ("cifs: add shutdown support") Signed-off-by: Khaled ROMDHANI <khaledromdhani216@gmail.com> Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Steve French <stfrench@microsoft.com>
| * | Cifs: Fix kernel oops caused by deferred close for files.Rohith Surabattula2021-05-044-5/+33
| | | | | | | | | | | | | | | | | | | | | | | | Fix regression issue caused by deferred close for files. Signed-off-by: Rohith Surabattula <rohiths@microsoft.com> Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
| * | cifs: fix regression when mounting shares with prefix pathsPaulo Alcantara2021-05-043-13/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The commit 315db9a05b7a ("cifs: fix leak in cifs_smb3_do_mount() ctx") revealed an existing bug when mounting shares that contain a prefix path or DFS links. cifs_setup_volume_info() requires the @devname to contain the full path (UNC + prefix) to update the fs context with the new UNC and prepath values, however we were passing only the UNC path (old_ctx->UNC) in @device thus discarding any prefix paths. Instead of concatenating both old_ctx->{UNC,prepath} and pass it in @devname, just keep the dup'ed values of UNC and prepath in cifs_sb->ctx after calling smb3_fs_context_dup(), and fix smb3_parse_devname() to correctly parse and not leak the new UNC and prefix paths. Cc: <stable@vger.kernel.org> # v5.11+ Fixes: 315db9a05b7a ("cifs: fix leak in cifs_smb3_do_mount() ctx") Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Acked-by: David Disseldorp <ddiss@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>
| * | cifs: use echo_interval even when connection not ready.Shyam Prasad N2021-05-031-11/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the tcp connection is not ready to send requests, we keep retrying echo with an interval of zero. This seems unnecessary, and this fix changes the interval between echoes to what is specified as echo_interval. Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
| * | cifs: detect dead connections only when echoes are enabled.Shyam Prasad N2021-05-031-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We can detect server unresponsiveness only if echoes are enabled. Echoes can be disabled under two scenarios: 1. The connection is low on credits, so we've disabled echoes/oplocks. 2. The connection has not seen any request till now (other than negotiate/sess-setup), which is when we enable these two, based on the credits available. So this fix will check for dead connection, only when echo is enabled. Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> CC: <stable@vger.kernel.org> # v5.8+ Signed-off-by: Steve French <stfrench@microsoft.com>
| * | smb3.1.1: allow dumping keys for multiuser mountsSteve French2021-05-031-20/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | When mounted multiuser it is hard to dump keys for the other sessions which makes it hard to debug using network traces (e.g. using wireshark). Suggested-by: Shyam Prasad N <sprasad@microsoft.com> Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
| * | smb3.1.1: allow dumping GCM256 keys to improve debugging of encrypted sharesSteve French2021-05-032-0/+52
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously we were only able to dump CCM or GCM-128 keys (see "smbinfo keys" e.g.) to allow network debugging (e.g. wireshark) of mounts to SMB3.1.1 encrypted shares. But with the addition of GCM-256 support, we have to be able to dump 32 byte instead of 16 byte keys which requires adding an additional ioctl for that. Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
| * | cifs: add shutdown supportSteve French2021-05-039-2/+121
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Various filesystem support the shutdown ioctl which is used by various xfstests. The shutdown ioctl sets a flag on the superblock which prevents open, unlink, symlink, hardlink, rmdir, create etc. on the file system until unmount and remounted. The two flags supported in this patch are: FSOP_GOING_FLAGS_LOGFLUSH and FSOP_GOING_FLAGS_NOLOGFLUSH which require very little other than blocking new operations (since we do not cache writes to metadata on the client with cifs.ko). FSOP_GOING_FLAGS_DEFAULT is not supported yet, but could be added in the future but would need to call syncfs or equivalent to write out pending data on the mount. With this patch various xfstests now work including tests 043 through 046 for example. Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Aurelien Aptel <aaptel@suse.com>
| * | cifs: Deferred close for filesRohith Surabattula2021-05-036-3/+187
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When file is closed, SMB2 close request is not sent to server immediately and is deferred for acregmax defined interval. When file is reopened by same process for read or write, the file handle is reused if an oplock is held. When client receives a oplock/lease break, file is closed immediately if reference count is zero, else oplock is downgraded. Signed-off-by: Rohith Surabattula <rohiths@microsoft.com> Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
| * | smb3.1.1: enable negotiating stronger encryption by defaultSteve French2021-04-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that stronger encryption (gcm256) has been more broadly tested, and confirmed to work with multiple servers (Windows and Azure for example), enable it by default. Although gcm256 is the second choice we offer (after gcm128 which should be faster), this change allows mounts to server which are configured to require the strongest encryption to work (without changing a module load parameter). Signed-off-by: Steve French <stfrench@microsoft.com> Suggested-by: Ronnie Sahlberg <lsahlber@redhat.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
* | | Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds2021-05-0523-172/+1295
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull virtio updates from Michael Tsirkin: "A bunch of new drivers including vdpa support for block and virtio-vdpa. Beginning of vq kick (aka doorbell) mapping support. Misc fixes" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (40 commits) virtio_pci_modern: correct sparse tags for notify virtio_pci_modern: __force cast the notify mapping vDPA/ifcvf: get_config_size should return dev specific config size vDPA/ifcvf: enable Intel C5000X-PL virtio-block for vDPA vDPA/ifcvf: deduce VIRTIO device ID when probe vdpa_sim_blk: add support for vdpa management tool vdpa_sim_blk: handle VIRTIO_BLK_T_GET_ID vdpa_sim_blk: implement ramdisk behaviour vdpa: add vdpa simulator for block device vhost/vdpa: Remove the restriction that only supports virtio-net devices vhost/vdpa: use get_config_size callback in vhost_vdpa_config_validate() vdpa: add get_config_size callback in vdpa_config_ops vdpa_sim: cleanup kiovs in vdpasim_free() vringh: add vringh_kiov_length() helper vringh: implement vringh_kiov_advance() vringh: explain more about cleaning riov and wiov vringh: reset kiov 'consumed' field in __vringh_iov() vringh: add 'iotlb_lock' to synchronize iotlb accesses vdpa_sim: use iova module to allocate IOVA addresses vDPA/ifcvf: deduce VIRTIO device ID from pdev ids ...
| * | | virtio_pci_modern: correct sparse tags for notifyMichael S. Tsirkin2021-05-042-7/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When switching virtio_pci_modern to use a helper for mappings we lost an __iomem tag. Restore it. Reported-by: kernel test robot <lkp@intel.com> Fixes: 9e3bb9b79a71 ("virtio_pci_modern: introduce helper to map vq notify area") Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * | | virtio_pci_modern: __force cast the notify mappingMichael S. Tsirkin2021-05-041-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When switching virtio_pci_modern to use a helper for mappings we lost an __iomem tag. We should restore it. However, virtio_pci_modern is playing tricks by hiding an iomem pointer in a regular vq->priv pointer. Which is okay as long as it's all contained within a single file, but we need to __force cast the value otherwise we'll get sparse warnings. Reported-by: kernel test robot <lkp@intel.com> Fixes: 7dca6c0ea96b ("virtio-pci library: switch to use vp_modern_map_vq_notify()") Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * | | vDPA/ifcvf: get_config_size should return dev specific config sizeZhu Lingshan2021-05-031-1/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | get_config_size() should return the size based on the decected device type. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20210419063326.3748-4-lingshan.zhu@intel.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * | | vDPA/ifcvf: enable Intel C5000X-PL virtio-block for vDPAZhu Lingshan2021-05-032-2/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit enabled Intel FPGA SmartNIC C5000X-PL virtio-block for vDPA. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20210419063326.3748-3-lingshan.zhu@intel.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * | | vDPA/ifcvf: deduce VIRTIO device ID when probeZhu Lingshan2021-05-032-12/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit deduces VIRTIO device ID as device type when probe, then ifcvf_vdpa_get_device_id() can simply return the ID. ifcvf_vdpa_get_features() and ifcvf_vdpa_get_config_size() can work properly based on the device ID. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Link: https://lore.kernel.org/r/20210419063326.3748-2-lingshan.zhu@intel.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
| * | | vdpa_sim_blk: add support for vdpa management toolStefano Garzarella2021-05-031-13/+63
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Enable the user to create vDPA block simulator devices using the vdpa management tool: # Show vDPA supported devices $ vdpa mgmtdev show vdpasim_blk: supported_classes block # Create a vDPA block device named as 'blk0' from the management # device vdpasim: $ vdpa dev add mgmtdev vdpasim_blk name blk0 # Show the info of the 'blk0' device just created $ vdpa dev show blk0 -jp { "dev": { "blk0": { "type": "block", "mgmtdev": "vdpasim_blk", "vendor_id": 0, "max_vqs": 1, "max_vq_size": 256 } } } # Delete the vDPA device after its use $ vdpa dev del blk0 Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20210315163450.254396-15-sgarzare@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
| * | | vdpa_sim_blk: handle VIRTIO_BLK_T_GET_IDStefano Garzarella2021-05-031-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Handle VIRTIO_BLK_T_GET_ID request, always answering the "vdpa_blk_sim" string. Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20210315163450.254396-14-sgarzare@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * | | vdpa_sim_blk: implement ramdisk behaviourStefano Garzarella2021-05-031-18/+146
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The previous implementation wrote only the status of each request. This patch implements a more accurate block device simulator, providing a ramdisk-like behavior and adding input validation. Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20210315163450.254396-13-sgarzare@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * | | vdpa: add vdpa simulator for block deviceMax Gurtovoy2021-05-033-0/+153
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This will allow running vDPA for virtio block protocol. It's a preliminary implementation with a simple request handling: for each request, only the status (last byte) is set. It's always set to VIRTIO_BLK_S_OK. Also input validation is missing and will be added in the next commits. Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> [sgarzare: various cleanups/fixes] Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20210315163450.254396-12-sgarzare@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * | | vhost/vdpa: Remove the restriction that only supports virtio-net devicesXie Yongji2021-05-031-5/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since the config checks are done by the vDPA drivers, we can remove the virtio-net restriction and we should be able to support all kinds of virtio devices. <linux/virtio_net.h> is not needed anymore, but we need to include <linux/slab.h> to avoid compilation failures. Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20210315163450.254396-11-sgarzare@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
| * | | vhost/vdpa: use get_config_size callback in vhost_vdpa_config_validate()Stefano Garzarella2021-05-031-7/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Let's use the new 'get_config_size()' callback available instead of using the 'virtio_id' to get the size of the device config space. Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20210315163450.254396-10-sgarzare@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
| * | | vdpa: add get_config_size callback in vdpa_config_opsStefano Garzarella2021-05-035-0/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This new callback is used to get the size of the configuration space of vDPA devices. Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20210315163450.254396-9-sgarzare@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
| * | | vdpa_sim: cleanup kiovs in vdpasim_free()Stefano Garzarella2021-05-031-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | vringh_getdesc_iotlb() allocates memory to store the kvec, that is freed with vringh_kiov_cleanup(). vringh_getdesc_iotlb() is able to reuse a kvec previously allocated, so in order to avoid to allocate the kvec for each request, we are not calling vringh_kiov_cleanup() when we finished to handle a request, but we should call it when we free the entire device. Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20210315163450.254396-8-sgarzare@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * | | vringh: add vringh_kiov_length() helperStefano Garzarella2021-05-031-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This new helper returns the total number of bytes covered by a vringh_kiov. Suggested-by: Jason Wang <jasowang@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20210315163450.254396-7-sgarzare@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * | | vringh: implement vringh_kiov_advance()Stefano Garzarella2021-05-032-12/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In some cases, it may be useful to provide a way to skip a number of bytes in a vringh_kiov. Let's implement vringh_kiov_advance() for this purpose, reusing the code from vringh_iov_xfer(). We replace that code calling the new vringh_kiov_advance(). Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20210315163450.254396-6-sgarzare@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * | | vringh: explain more about cleaning riov and wiovStefano Garzarella2021-05-031-3/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | riov and wiov can be reused with subsequent calls of vringh_getdesc_*(). Let's add a paragraph in the documentation of these functions to better explain when riov and wiov need to be cleaned up. Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20210315163450.254396-5-sgarzare@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * | | vringh: reset kiov 'consumed' field in __vringh_iov()Stefano Garzarella2021-05-031-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | __vringh_iov() overwrites the contents of riov and wiov, in fact it resets the 'i' and 'used' fields, but also the 'consumed' field should be reset to avoid an inconsistent state. Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20210315163450.254396-4-sgarzare@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * | | vringh: add 'iotlb_lock' to synchronize iotlb accessesStefano Garzarella2021-05-033-3/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Usually iotlb accesses are synchronized with a spinlock. Let's request it as a new parameter in vringh_set_iotlb() and hold it when we navigate the iotlb in iotlb_translate() to avoid race conditions with any new additions/deletions of ranges from the ioltb. Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20210315163450.254396-3-sgarzare@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * | | vdpa_sim: use iova module to allocate IOVA addressesStefano Garzarella2021-05-033-42/+69
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The identical mapping used until now created issues when mapping different virtual pages with the same physical address. To solve this issue, we can use the iova module, to handle the IOVA allocation. For simplicity we use an IOVA allocator with byte granularity. We add two new functions, vdpasim_map_range() and vdpasim_unmap_range(), to handle the IOVA allocation and the registration into the IOMMU/IOTLB. These functions are used by dma_map_ops callbacks. Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20210315163450.254396-2-sgarzare@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * | | vDPA/ifcvf: deduce VIRTIO device ID from pdev idsZhu Lingshan2021-05-032-1/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit deduces the VIRTIO device ID of a probed device from its pdev device ids. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Link: https://lore.kernel.org/r/20210317094933.16417-8-lingshan.zhu@intel.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
| * | | vDPA/ifcvf: verify mandatory feature bits for vDPAZhu Lingshan2021-05-033-0/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | vDPA requres VIRTIO_F_ACCESS_PLATFORM as a must, this commit examines this when set features. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20210317094933.16417-7-lingshan.zhu@intel.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * | | vDPA/ifcvf: fetch device feature bits when probeZhu Lingshan2021-05-033-2/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit would read and store device feature bits when probe. rename ifcvf_get_features() to ifcvf_get_hw_features(), it reads and stores features of the probed device. new ifcvf_get_features() simply returns stored feature bits. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20210317094933.16417-6-lingshan.zhu@intel.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * | | vDPA/ifcvf: remove the version number stringZhu Lingshan2021-05-031-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit removes the version number string, using kernel version is enough. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20210317094933.16417-5-lingshan.zhu@intel.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * | | vDPA/ifcvf: rename original IFCVF dev ids to N3000 idsZhu Lingshan2021-05-032-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | IFCVF driver probes multiple types of devices now, to distinguish the original device driven by IFCVF from others, it is renamed as "N3000". Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20210317094933.16417-4-lingshan.zhu@intel.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>