path: root/fs/nfs/nfs42proc.c
Commit message    Author    Age    Files    Lines
* NFSv4: Prevent NULL-pointer dereference in nfs42_complete_copies()
    Author: Yanjun Zhang; Age: 2024-10-03; Files: 1; Lines: -1/+1

    On the node of an NFS client, some files saved in the mountpoint of the NFS server were copied to another location of the same NFS server. Unexpectedly, nfs42_complete_copies() crashed with a NULL-pointer dereference, with the following syslog:

    [232064.838881] NFSv4: state recovery failed for open file nfs/pvc-12b5200d-cd0f-46a3-b9f0-af8f4fe0ef64.qcow2, error = -116
    [232064.839360] NFSv4: state recovery failed for open file nfs/pvc-12b5200d-cd0f-46a3-b9f0-af8f4fe0ef64.qcow2, error = -116
    [232066.588183] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000058
    [232066.588586] Mem abort info:
    [232066.588701] ESR = 0x0000000096000007
    [232066.588862] EC = 0x25: DABT (current EL), IL = 32 bits
    [232066.589084] SET = 0, FnV = 0
    [232066.589216] EA = 0, S1PTW = 0
    [232066.589340] FSC = 0x07: level 3 translation fault
    [232066.589559] Data abort info:
    [232066.589683] ISV = 0, ISS = 0x00000007
    [232066.589842] CM = 0, WnR = 0
    [232066.589967] user pgtable: 64k pages, 48-bit VAs, pgdp=00002000956ff400
    [232066.590231] [0000000000000058] pgd=08001100ae100003, p4d=08001100ae100003, pud=08001100ae100003, pmd=08001100b3c00003, pte=0000000000000000
    [232066.590757] Internal error: Oops: 96000007 [#1] SMP
    [232066.590958] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm vhost_net vhost vhost_iotlb tap tun ipt_rpfilter xt_multiport ip_set_hash_ip ip_set_hash_net xfrm_interface xfrm6_tunnel tunnel4 tunnel6 esp4 ah4 wireguard libcurve25519_generic veth xt_addrtype xt_set nf_conntrack_netlink ip_set_hash_ipportnet ip_set_hash_ipportip ip_set_bitmap_port ip_set_hash_ipport dummy ip_set ip_vs_sh ip_vs_wrr ip_vs_rr ip_vs iptable_filter sch_ingress nfnetlink_cttimeout vport_gre ip_gre ip_tunnel gre vport_geneve geneve vport_vxlan vxlan ip6_udp_tunnel udp_tunnel openvswitch nf_conncount dm_round_robin dm_service_time dm_multipath xt_nat xt_MASQUERADE nft_chain_nat nf_nat xt_mark xt_conntrack xt_comment nft_compat nft_counter nf_tables nfnetlink ocfs2 ocfs2_nodemanager ocfs2_stackglue iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ipmi_ssif nbd overlay 8021q garp mrp bonding tls rfkill sunrpc ext4 mbcache jbd2
    [232066.591052] vfat fat cas_cache cas_disk ses enclosure scsi_transport_sas sg acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler ip_tables vfio_pci vfio_pci_core vfio_virqfd vfio_iommu_type1 vfio dm_mirror dm_region_hash dm_log dm_mod nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter bridge stp llc fuse xfs libcrc32c ast drm_vram_helper qla2xxx drm_kms_helper syscopyarea crct10dif_ce sysfillrect ghash_ce sysimgblt sha2_ce fb_sys_fops cec sha256_arm64 sha1_ce drm_ttm_helper ttm nvme_fc igb sbsa_gwdt nvme_fabrics drm nvme_core i2c_algo_bit i40e scsi_transport_fc megaraid_sas aes_neon_bs
    [232066.596953] CPU: 6 PID: 4124696 Comm: 10.253.166.125- Kdump: loaded Not tainted 5.15.131-9.cl9_ocfs2.aarch64 #1
    [232066.597356] Hardware name: Great Wall .\x93\x8e...RF6260 V5/GWMSSE2GL1T, BIOS T656FBE_V3.0.18 2024-01-06
    [232066.597721] pstate: 20400009 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
    [232066.598034] pc : nfs4_reclaim_open_state+0x220/0x800 [nfsv4]
    [232066.598327] lr : nfs4_reclaim_open_state+0x12c/0x800 [nfsv4]
    [232066.598595] sp : ffff8000f568fc70
    [232066.598731] x29: ffff8000f568fc70 x28: 0000000000001000 x27: ffff21003db33000
    [232066.599030] x26: ffff800005521ae0 x25: ffff0100f98fa3f0 x24: 0000000000000001
    [232066.599319] x23: ffff800009920008 x22: ffff21003db33040 x21: ffff21003db33050
    [232066.599628] x20: ffff410172fe9e40 x19: ffff410172fe9e00 x18: 0000000000000000
    [232066.599914] x17: 0000000000000000 x16: 0000000000000004 x15: 0000000000000000
    [232066.600195] x14: 0000000000000000 x13: ffff800008e685a8 x12: 00000000eac0c6e6
    [232066.600498] x11: 0000000000000000 x10: 0000000000000008 x9 : ffff8000054e5828
    [232066.600784] x8 : 00000000ffffffbf x7 : 0000000000000001 x6 : 000000000a9eb14a
    [232066.601062] x5 : 0000000000000000 x4 : ffff70ff8a14a800 x3 : 0000000000000058
    [232066.601348] x2 : 0000000000000001 x1 : 54dce46366daa6c6 x0 : 0000000000000000
    [232066.601636] Call trace:
    [232066.601749] nfs4_reclaim_open_state+0x220/0x800 [nfsv4]
    [232066.601998] nfs4_do_reclaim+0x1b8/0x28c [nfsv4]
    [232066.602218] nfs4_state_manager+0x928/0x10f0 [nfsv4]
    [232066.602455] nfs4_run_state_manager+0x78/0x1b0 [nfsv4]
    [232066.602690] kthread+0x110/0x114
    [232066.602830] ret_from_fork+0x10/0x20
    [232066.602985] Code: 1400000d f9403f20 f9402e61 91016003 (f9402c00)
    [232066.603284] SMP: stopping secondary CPUs
    [232066.606936] Starting crashdump kernel...
    [232066.607146] Bye!

    Analysing the vmcore, we know that the nfs4_copy_state entries listed by the destination nfs_server->ss_copies were added by the field copies in handle_async_copy(), and we found a waiting copy process with the stack as:

    PID: 3511963 TASK: ffff710028b47e00 CPU: 0 COMMAND: "cp"
    #0 [ffff8001116ef740] __switch_to at ffff8000081b92f4
    #1 [ffff8001116ef760] __schedule at ffff800008dd0650
    #2 [ffff8001116ef7c0] schedule at ffff800008dd0a00
    #3 [ffff8001116ef7e0] schedule_timeout at ffff800008dd6aa0
    #4 [ffff8001116ef860] __wait_for_common at ffff800008dd166c
    #5 [ffff8001116ef8e0] wait_for_completion_interruptible at ffff800008dd1898
    #6 [ffff8001116ef8f0] handle_async_copy at ffff8000055142f4 [nfsv4]
    #7 [ffff8001116ef970] _nfs42_proc_copy at ffff8000055147c8 [nfsv4]
    #8 [ffff8001116efa80] nfs42_proc_copy at ffff800005514cf0 [nfsv4]
    #9 [ffff8001116efc50] __nfs4_copy_file_range.constprop.0 at ffff8000054ed694 [nfsv4]

    The NULL-pointer dereference was caused by nfs42_complete_copies() traversing nfs_server->ss_copies by the field ss_copies of nfs4_copy_state. So the nfs4_copy_state address ffff0100f98fa3f0 was offset by 0x10 and the data accessed through this pointer was also incorrect.
    Generally, the ordered list nfs4_state_owner->so_states means that open(O_RDWR) or open(O_WRITE) states are reclaimed first by nfs4_reclaim_open_state(). When the destination state reclaim fails with NFS_STATE_RECOVERY_FAILED and the copies are not deleted from nfs_server->ss_copies, the source state may be passed to nfs42_complete_copies() earlier, which finally results in this crash.

    To solve this issue, we add a list_head nfs_server->ss_src_copies specifically for server-to-server copies.

    Fixes: 0e65a32c8a56 ("NFS: handle source server reboot")
    Signed-off-by: Yanjun Zhang <zhangyanjun@cestc.cn>
    Reviewed-by: Trond Myklebust <trond.myklebust@hammerspace.com>
    Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
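The 0x10 offset described in this commit can be reproduced in userspace. The following is only a sketch, not the kernel code: `struct copy_state`, its field layout, and `misdecode_offset()` are hypothetical stand-ins for a structure that sits on two lists through two different `list_head` members. Decoding a node through the wrong member yields a pointer shifted by the distance between the two members.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's doubly linked list node. */
struct list_head { struct list_head *next, *prev; };

/* Hypothetical model of a copy-state object linked on two lists. */
struct copy_state {
    struct list_head copies;      /* links destination-side copies */
    struct list_head src_copies;  /* links source-side copies */
    int flags;
};

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/*
 * Take a node that was linked through `src_copies` and decode it as if it
 * had been linked through `copies`. Returns how many bytes the decoded
 * pointer is away from the real object.
 */
static ptrdiff_t misdecode_offset(struct copy_state *real)
{
    struct list_head *node = &real->src_copies;
    struct copy_state *assumed = container_of(node, struct copy_state, copies);

    return (char *)assumed - (char *)real;
}
```

With 8-byte pointers, `sizeof(struct list_head)` is 16, so the decoded pointer lands 0x10 bytes past the real object, and every field read through it is wrong. Keeping one list per linking member, as the fix does, removes the ambiguity.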
* nfs42: client needs to strip file mode's suid/sgid bit after ALLOCATE op
    Author: Dai Ngo; Age: 2023-10-11; Files: 1; Lines: -1/+2

    The Linux NFS server strips the SUID and SGID from the file mode on ALLOCATE op.

    Modify _nfs42_proc_fallocate to add NFS_INO_REVAL_FORCED to nfs_set_cache_invalid's argument to force update of the file mode suid/sgid bit.

    Suggested-by: Trond Myklebust <trondmy@hammerspace.com>
    Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Tested-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* NFSv4.2: fix handling of COPY ERR_OFFLOAD_NO_REQ
    Author: Olga Kornievskaia; Age: 2023-08-30; Files: 1; Lines: -2/+3

    If the client sent a synchronous copy and the server replied with ERR_OFFLOAD_NO_REQ, indicating that it wants an asynchronous copy instead, the client should retry with an asynchronous copy.

    Fixes: 539f57b3e0fd ("NFS handle COPY ERR_OFFLOAD_NO_REQS")
    Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
    Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* NFSv4.2: fix error handling in nfs42_proc_getxattr
    Author: Fedor Pchelkin; Age: 2023-08-19; Files: 1; Lines: -3/+2

    There is a slight issue with the error handling code inside nfs42_proc_getxattr(). If the page allocation loop fails, we free the failing page array element, which is NULL, but __free_page() can't deal with NULL args.

    Found by Linux Verification Center (linuxtesting.org).

    Fixes: a1f26739ccdc ("NFSv4.2: improve page handling for GETXATTR")
    Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
    Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
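The unwinding pattern behind this fix can be sketched in userspace C. This is an illustration under assumptions, not the kernel function: `alloc_pages_array()`, `release_page()`, `budget_alloc()` and the counters are hypothetical, and `malloc()`/`free()` stand in for page allocation. The point is that on failure only the entries before the failing index are released, so the NULL slot is never passed to the release routine.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE 4096

typedef void *(*page_alloc_fn)(size_t);

static int null_release_count;  /* how often release_page() saw NULL */
static int alloc_budget;        /* successful allocations still allowed */

/* Test allocator: succeeds while the budget lasts, then returns NULL. */
static void *budget_alloc(size_t size)
{
    return alloc_budget-- > 0 ? malloc(size) : NULL;
}

/* Stand-in for a release routine that must never be given NULL. */
static void release_page(void *page)
{
    if (!page) {
        null_release_count++;
        return;
    }
    free(page);
}

/* Fill pages[0..np-1]; on failure free only what was allocated so far. */
static int alloc_pages_array(void **pages, size_t np, page_alloc_fn alloc)
{
    size_t i;

    for (i = 0; i < np; i++) {
        pages[i] = alloc(SKETCH_PAGE_SIZE);
        if (!pages[i]) {
            while (i > 0)               /* skips the failing NULL slot */
                release_page(pages[--i]);
            return -ENOMEM;
        }
    }
    return 0;
}
```

Setting `alloc_budget = 2` and requesting four pages makes the third allocation fail; the function then releases entries 1 and 0 and returns `-ENOMEM` without touching the NULL element.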
* NFSv4.2: SETXATTR should update ctime
    Author: Anna Schumaker; Age: 2023-06-19; Files: 1; Lines: -4/+21

    Otherwise, `stat` will report a stale value to users.

    Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
* nfs42: do not fail with EIO if ssc returns NFS4ERR_OFFLOAD_DENIED
    Author: Tigran Mkrtchyan; Age: 2023-02-15; Files: 1; Lines: -1/+2

    An NFSv4.2 server, even if it supports intra-SSC, might prefer that a classic copy be performed for a particular file. Because returning ENOTSUPP makes the client clear the server's SSC capability, the server might return NFS4ERR_OFFLOAD_DENIED instead (well, the spec talks about remote servers there).

    Update nfs42_proc_copy to handle NFS4ERR_OFFLOAD_DENIED as ENOTSUPP, but without clearing the NFS_CAP_COPY bit.

    Signed-off-by: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de>
    Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
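The distinction drawn in this commit (same error returned to the caller, different effect on the capability bit) can be sketched as follows. Everything here is an assumption for illustration: the enum values, `SK_CAP_COPY`, `struct sketch_server` and `map_copy_error()` are hypothetical, and userspace `EOPNOTSUPP` stands in for the kernel-internal `ENOTSUPP`.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical server replies; not the real NFS4ERR_* numeric values. */
enum sketch_reply { SK_OK, SK_NOTSUPP, SK_OFFLOAD_DENIED, SK_OTHER };

#define SK_CAP_COPY 0x1u  /* hypothetical "server can do COPY" bit */

struct sketch_server { unsigned int caps; };

/* Map a COPY reply to an errno-style result and update capabilities. */
static int map_copy_error(struct sketch_server *srv, enum sketch_reply reply)
{
    switch (reply) {
    case SK_OK:
        return 0;
    case SK_NOTSUPP:
        /* Server cannot offload at all: stop asking in the future. */
        srv->caps &= ~SK_CAP_COPY;
        return -EOPNOTSUPP;
    case SK_OFFLOAD_DENIED:
        /* Denied for this file only: fall back, keep the capability. */
        return -EOPNOTSUPP;
    default:
        return -EIO;
    }
}
```

Both branches make the caller fall back to a regular copy, but only the "not supported" reply disables offload for later requests.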
* NFSv4.2: Fixup CLONE dest file size for zero-length count
    Author: Benjamin Coddington; Age: 2022-10-27; Files: 1; Lines: -0/+3

    When holding a delegation, the NFS client optimizes away setting the attributes of a file from the GETATTR in the compound after CLONE, and for a zero-length CLONE we will end up setting the inode's size to zero in nfs42_copy_dest_done(). Handle this case by computing the resulting count from the server's reported size after CLONE's GETATTR.

    Suggested-by: Trond Myklebust <trond.myklebust@hammerspace.com>
    Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
    Fixes: 94d202d5ca39 ("NFSv42: Copy offload should update the file size when appropriate")
    Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
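A minimal sketch of the count computation described above, assuming a zero `count` means "clone to end of file" and that the size reported by the server after CLONE is available. The function name and parameters are hypothetical; the kernel works on its own attribute structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Return the byte count to use when updating the destination size.
 * For a zero-length request, derive it from the server-reported size,
 * but only if that size attribute is valid.
 */
static int64_t clone_effective_count(int64_t count, int64_t dst_offset,
                                     bool size_valid, int64_t server_size)
{
    if (count == 0 && size_valid && server_size >= dst_offset)
        return server_size - dst_offset;
    return count;
}
```

For example, a zero-length clone at offset 4096 into a file the server now reports as 12288 bytes long yields an effective count of 8192, so the size update no longer collapses to the offset alone.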
* Merge tag 'nfs-for-6.1-1' of git://git.linux-nfs.org/projects/anna/linux-nfs
    Author: Linus Torvalds; Age: 2022-10-13; Files: 1; Lines: -0/+4

    Pull NFS client updates from Anna Schumaker:
     "New Features:
       - Add NFSv4.2 xattr tracepoints
       - Replace xprtiod WQ in rpcrdma
       - Flexfiles cancels I/O on layout recall or revoke

      Bugfixes and Cleanups:
       - Directly use ida_alloc() / ida_free()
       - Don't open-code max_t()
       - Prefer using strscpy over strlcpy
       - Remove unused forward declarations
       - Always return layout states on flexfiles layout return
       - Have LISTXATTR treat NFS4ERR_NOXATTR as an empty reply instead of error
       - Allow more xprtrdma memory allocations to fail without triggering a reclaim
       - Various other xprtrdma clean ups
       - Fix rpc_killall_tasks() races"

    * tag 'nfs-for-6.1-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (27 commits)
      NFSv4/flexfiles: Cancel I/O if the layout is recalled or revoked
      SUNRPC: Add API to force the client to disconnect
      SUNRPC: Add a helper to allow pNFS drivers to selectively cancel RPC calls
      SUNRPC: Fix races with rpc_killall_tasks()
      xprtrdma: Fix uninitialized variable
      xprtrdma: Prevent memory allocations from driving a reclaim
      xprtrdma: Memory allocation should be allowed to fail during connect
      xprtrdma: MR-related memory allocation should be allowed to fail
      xprtrdma: Clean up synopsis of rpcrdma_regbuf_alloc()
      xprtrdma: Clean up synopsis of rpcrdma_req_create()
      svcrdma: Clean up RPCRDMA_DEF_GFP
      SUNRPC: Replace the use of the xprtiod WQ in rpcrdma
      NFSv4.2: Add a tracepoint for listxattr
      NFSv4.2: Add tracepoints for getxattr, setxattr, and removexattr
      NFSv4.2: Move TRACE_DEFINE_ENUM(NFS4_CONTENT_*) under CONFIG_NFS_V4_2
      NFSv4.2: Add special handling for LISTXATTR receiving NFS4ERR_NOXATTR
      nfs: remove nfs_wait_atomic_killable() and nfs_write_prepare() declaration
      NFSv4: remove nfs4_renewd_prepare_shutdown() declaration
      fs/nfs/pnfs_nfs.c: fix spelling typo and syntax error in comment
      NFSv4/pNFS: Always return layout stats on layout return for flexfiles
      ...
|\
| * NFSv4.2: Add a tracepoint for listxattr
      Author: Anna Schumaker; Age: 2022-10-05; Files: 1; Lines: -0/+1

      This can be defined as simply an NFS4_INODE_EVENT() since we don't have the name of a specific xattr to list. This roughly matches readdir, which also uses an NFS4_INODE_EVENT() tracepoint.

      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
| * NFSv4.2: Add tracepoints for getxattr, setxattr, and removexattr
      Author: Anna Schumaker; Age: 2022-10-05; Files: 1; Lines: -0/+3

      These functions take similar arguments, and can share a tracepoint class for common formatting.

      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* | Merge tag 'pull-file_inode' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
      Author: Linus Torvalds; Age: 2022-10-07; Files: 1; Lines: -1/+1

      Pull file_inode() updates from Al Viro:
       "whack-a-mole: cropped up open-coded file_inode() uses..."

      * tag 'pull-file_inode' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        orangefs: use ->f_mapping
        _nfs42_proc_copy(): use ->f_mapping instead of file_inode()->i_mapping
        dma_buf: no need to bother with file_inode()->i_mapping
        nfs_finish_open(): don't open-code file_inode()
        bprm_fill_uid(): don't open-code file_inode()
        sgx: use ->f_mapping...
        exfat_iterate(): don't open-code file_inode(file)
        ibmvmc: don't open-code file_inode()
|\ \
| |/
|/|
| * _nfs42_proc_copy(): use ->f_mapping instead of file_inode()->i_mapping
      Author: Al Viro; Age: 2022-09-01; Files: 1; Lines: -1/+1

      Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | NFSv4.2: Update mode bits after ALLOCATE and DEALLOCATE
      Author: Anna Schumaker; Age: 2022-09-08; Files: 1; Lines: -2/+7

      The fallocate call invalidates suid and sgid bits as part of normal operation. We need to mark the mode bits as invalid when using fallocate with an suid so these will be updated the next time the user looks at them.

      This fixes xfstests generic/683 and generic/684.

      Reported-by: Yue Cui <cuiyue-fnst@fujitsu.com>
      Fixes: 913eca1aea87 ("NFS: Fallocate should use the nfs4_fattr_bitmap")
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|/
* NFS: replace usage of found with dedicated list iterator variable
    Author: Jakob Koschel; Age: 2022-03-24; Files: 1; Lines: -7/+6

    To move the list iterator variable into the list_for_each_entry_*() macro in the future, use of the list iterator variable after the loop body should be avoided.

    To *never* use the list iterator variable after the loop, it was concluded to use a separate iterator variable instead of a found boolean [1].

    This removes the need for a found variable; simply checking whether the variable was set determines whether the break/goto was hit.

    Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=EHreAsk5SqXPwr9Y7k9sA6cWXJ6w@mail.gmail.com/
    Signed-off-by: Jakob Koschel <jakobkoschel@gmail.com>
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
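The idiom described in this commit can be sketched with a small circular list in userspace. This is a simplified model, not the kernel macros: `struct item`, `item_of()`, `list_init()`, `list_add_tail()` and `find_item()` are hypothetical helpers written for the example. The loop cursor stays local to the loop, and a separate pointer that is set only on a match carries the result out.

```c
#include <assert.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

struct item {
    int id;
    struct list_head link;
};

/* Recover the enclosing item from its embedded list node. */
#define item_of(ptr) \
    ((struct item *)((char *)(ptr) - offsetof(struct item, link)))

static void list_init(struct list_head *head)
{
    head->next = head->prev = head;
}

static void list_add_tail(struct list_head *node, struct list_head *head)
{
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

/* Return the matching item, or NULL when the loop ran to completion. */
static struct item *find_item(struct list_head *head, int id)
{
    struct item *found = NULL;  /* dedicated result, set only on a hit */
    struct list_head *pos;

    for (pos = head->next; pos != head; pos = pos->next) {
        struct item *iter = item_of(pos);  /* never used after the loop */

        if (iter->id == id) {
            found = iter;
            break;
        }
    }
    return found;
}
```

After a full traversal of a circular list the cursor points at the list head, which is not embedded in an item, so decoding it would give a bogus pointer. Testing `found` against NULL avoids touching the cursor at all.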
* NFSv4/pnfs: Ensure pNFS allocation modes are consistent with nfsiod
    Author: Trond Myklebust; Age: 2022-03-22; Files: 1; Lines: -1/+1

    Ensure that pNFS allocations that can be called from rpciod/nfsiod callback can fail in low memory mode, so that the threads don't block and loop forever.

    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
* NFSv4.2/copyoffload: Convert GFP_NOFS to GFP_KERNEL
    Author: Trond Myklebust; Age: 2022-02-26; Files: 1; Lines: -5/+5

    There doesn't seem to be any reason why the copy offload code can't use GFP_KERNEL. It can't get called by direct reclaim.

    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
* NFSv4/flexfiles: Convert GFP_NOFS to GFP_KERNEL
    Author: Trond Myklebust; Age: 2022-02-26; Files: 1; Lines: -1/+1

    Assume that the higher layers will have set memalloc_nofs_save/restore as appropriate.

    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
* NFSv4.2: fix reference count leaks in _nfs42_proc_copy_notify()
    Author: Xin Xiong; Age: 2022-02-26; Files: 1; Lines: -3/+6

    The reference counting issue happens in two error paths in the function _nfs42_proc_copy_notify(). In both error paths, the function simply returns the error code and forgets to balance the refcount of object `ctx`, bumped by get_nfs_open_context() earlier, which may cause refcount leaks.

    Fix it by balancing refcount of the `ctx` object before the function returns in both error paths.

    Signed-off-by: Xin Xiong <xiongx18@fudan.edu.cn>
    Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
    Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
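The balancing rule described in this commit is commonly written with a single exit label. The sketch below is a hypothetical userspace model, not the kernel function: `struct ctx`, `ctx_get()`, `ctx_put()` and `do_notify()` are invented for illustration, and `fail_step` simulates the two error paths.

```c
#include <assert.h>
#include <errno.h>

struct ctx { int refcount; };

static struct ctx *ctx_get(struct ctx *c)
{
    c->refcount++;
    return c;
}

static void ctx_put(struct ctx *c)
{
    c->refcount--;
}

/* fail_step: 0 = success, 1 or 2 = simulated failure at that step. */
static int do_notify(struct ctx *c, int fail_step)
{
    struct ctx *held = ctx_get(c);  /* reference taken up front */
    int status = 0;

    if (fail_step == 1) {
        status = -ENOMEM;
        goto out;                   /* not a bare return: must drop ref */
    }
    if (fail_step == 2) {
        status = -EIO;
        goto out;
    }
out:
    ctx_put(held);                  /* every path drops the reference */
    return status;
}
```

Whichever path is taken, the reference count after the call equals the count before it.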
* NFSv42: Fallocate and clone should also request 'blocks used'
    Author: Trond Myklebust; Age: 2022-01-06; Files: 1; Lines: -5/+8

    Both fallocate and clone can end up updating the blocks used attribute.

    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
    Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* NFSv42: Fix pagecache invalidation after COPY/CLONE
    Author: Benjamin Coddington; Age: 2021-11-17; Files: 1; Lines: -1/+3

    The mechanism in use to allow the client to see the results of COPY/CLONE is to drop those pages from the pagecache. This forces the client to read those pages once more from the server. However, truncate_pagecache_range() zeros out partial pages instead of dropping them. Let us instead use invalidate_inode_pages2_range() with full-page offsets to ensure the client properly sees the results of COPY/CLONE operations.

    Cc: <stable@vger.kernel.org> # v4.7+
    Fixes: 2e72448b07dc ("NFS: Add COPY nfs operation")
    Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
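"Full-page offsets" here means converting the byte range touched by the copy into the indices of every page it overlaps, so partially covered pages are dropped whole rather than zeroed. A sketch of that arithmetic, assuming 4 KiB pages and a non-zero length; `SKETCH_PAGE_SHIFT`, `struct page_range` and `pages_covering()` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12  /* assumes 4096-byte pages */

struct page_range { uint64_t first, last; };  /* inclusive page indices */

/* Page indices overlapped by the byte range [pos, pos + len), len > 0. */
static struct page_range pages_covering(uint64_t pos, uint64_t len)
{
    struct page_range r;
    uint64_t end = pos + len - 1;  /* last byte written */

    r.first = pos >> SKETCH_PAGE_SHIFT;
    r.last = end >> SKETCH_PAGE_SHIFT;
    return r;
}
```

A 5000-byte copy starting at byte 100 ends at byte 5099, so it overlaps pages 0 and 1 and both would be invalidated in full.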
* NFSv4.2 add tracepoint to OFFLOAD_CANCEL
    Author: Olga Kornievskaia; Age: 2021-11-05; Files: 1; Lines: -0/+1

    Add tracepoint to OFFLOAD_CANCEL operation.

    Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
* NFSv4.2 add tracepoint to COPY_NOTIFY
    Author: Olga Kornievskaia; Age: 2021-11-05; Files: 1; Lines: -0/+1

    Add a tracepoint to COPY_NOTIFY operation.

    Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
* NFSv4.2 add tracepoint to CLONE
    Author: Olga Kornievskaia; Age: 2021-11-05; Files: 1; Lines: -0/+1

    Add a tracepoint to the CLONE operation.

    Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
* NFSv4.2 add tracepoint to COPY
    Author: Olga Kornievskaia; Age: 2021-11-05; Files: 1; Lines: -0/+1

    Add a tracepoint to the COPY operation.

    Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
* NFSv4.2 add tracepoints to FALLOCATE and DEALLOCATE
    Author: Olga Kornievskaia; Age: 2021-11-05; Files: 1; Lines: -0/+4

    Add a tracepoint to the FALLOCATE/DEALLOCATE operations.

    Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
* NFSv4.2 add tracepoint to SEEK
    Author: Olga Kornievskaia; Age: 2021-11-05; Files: 1; Lines: -0/+1

    Add a tracepoint to the SEEK operation.

    Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
* NFSv42: Don't force attribute revalidation of the copy offload source
    Author: Trond Myklebust; Age: 2021-04-14; Files: 1; Lines: -6/+1

    When a copy offload is performed, we do not expect the source file to change other than perhaps to see the atime be updated.

    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
* NFSv42: Copy offload should update the file size when appropriate
    Author: Trond Myklebust; Age: 2021-04-14; Files: 1; Lines: -9/+32

    If the result of a copy offload or clone operation is to grow the destination file size, then we should update it. The reason is that when a client holds a delegation, it is authoritative for the file size.

    Fixes: 16abd2a0c124 ("NFSv4.2: fix client's attribute cache management for copy_file_range")
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
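The "grow only" rule stated above reduces to a comparison between the end of the copied range and the size the client currently holds. A minimal sketch with a hypothetical helper name; the kernel additionally deals with locking and cache-validity flags, which are omitted here.

```c
#include <assert.h>
#include <stdint.h>

/* Size the client should record after copying len bytes to offset pos. */
static int64_t dest_size_after_copy(int64_t cur_size, int64_t pos, int64_t len)
{
    int64_t new_end = pos + len;

    /* Only ever grow the file; a copy inside the file leaves size alone. */
    return new_end > cur_size ? new_end : cur_size;
}
```

Copying 4096 bytes to offset 8192 of a 10000-byte file extends it to 12288 bytes, while copying 100 bytes to offset 0 leaves the size at 10000.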
* NFSv4.2 fix handling of sr_eof in SEEK's reply
    Author: Olga Kornievskaia; Age: 2021-04-14; Files: 1; Lines: -1/+4

    Currently the client ignores the value of sr_eof in the SEEK operation's reply. According to the spec, if the server didn't find the requested extent and reached the end of the file, the server would return sr_eof=true. If the request was for DATA and no data was found (i.e. in the middle of a hole), then lseek expects ENXIO to be returned.

    Fixes: 1c6dcbe5ceff8 ("NFS: Implement SEEK")
    Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
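The decision described in this commit can be sketched as a small function. The names are hypothetical (`enum seek_what`, `seek_result()`), and the userspace `ENXIO` from `<errno.h>` stands in for whatever error code the kernel path actually returns at this point.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

enum seek_what { SK_SEEK_DATA, SK_SEEK_HOLE };

/*
 * Turn a SEEK reply into an lseek-style result: a negative errno on
 * failure, otherwise the offset reported by the server.
 */
static int64_t seek_result(enum seek_what what, bool sr_eof, int64_t sr_offset)
{
    /* Looking for data but the server hit end of file: nothing there. */
    if (what == SK_SEEK_DATA && sr_eof)
        return -ENXIO;
    return sr_offset;
}
```

A hole search that reaches end of file is still a valid answer, because the end of a file counts as an implicit hole, so only the data case is turned into an error.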
* NFSv4.2: Always flush out writes in nfs42_proc_fallocate()
    Author: Trond Myklebust; Age: 2021-04-12; Files: 1; Lines: -7/+9

    Whether we're allocating or deallocating space, we should flush out the pending writes in order to avoid races with attribute updates.

    Fixes: 1e564d3dbd68 ("NFSv4.2: Fix a race in nfs42_proc_deallocate()")
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
* NFS: Fix attribute bitmask in _nfs42_proc_fallocate()
    Author: Trond Myklebust; Age: 2021-04-12; Files: 1; Lines: -2/+8

    We can't use nfs4_fattr_bitmap as a bitmask, because it hasn't been filtered to represent the attributes supported by the server. Instead, let's revert to using server->cache_consistency_bitmask after adding in the missing SPACE_USED attribute.

    Fixes: 913eca1aea87 ("NFS: Fallocate should use the nfs4_fattr_bitmap")
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
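A sketch of the bitmask construction described above: start from the server-filtered consistency mask and add the one missing attribute. The bit position `SK_WORD1_SPACE_USED`, the three-word layout, and the extra check against a `supported` mask are assumptions made for this example, not values taken from the kernel headers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SK_BITMASK_WORDS 3
#define SK_WORD1_SPACE_USED (1u << 12)  /* hypothetical bit position */

/*
 * Build the attribute request mask for ALLOCATE/DEALLOCATE.
 * `consistency` is already limited to what the server supports;
 * SPACE_USED is added only if the server advertises it in `supported`.
 */
static void build_fallocate_bitmask(uint32_t out[SK_BITMASK_WORDS],
                                    const uint32_t consistency[SK_BITMASK_WORDS],
                                    const uint32_t supported[SK_BITMASK_WORDS])
{
    memcpy(out, consistency, SK_BITMASK_WORDS * sizeof(uint32_t));
    if (supported[1] & SK_WORD1_SPACE_USED)
        out[1] |= SK_WORD1_SPACE_USED;
}
```

Starting from a filtered mask means no unsupported attribute can slip into the request, which is the problem the commit attributes to using the unfiltered global bitmap.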
* NFS: Fix open coded versions of nfs_set_cache_invalid() in NFSv4
    Author: Trond Myklebust; Age: 2021-03-08; Files: 1; Lines: -5/+7

    nfs_set_cache_invalid() has code to handle delegations, and other optimisations, so let's use it when appropriate.

    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
    Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* NFSv4.2: fix error return on memory allocation failure
    Author: Colin Ian King; Age: 2020-12-16; Files: 1; Lines: -0/+1

    Currently when an alloc_page fails, the error return is not set in variable err and an uninitialized garbage value is returned. Fix this by setting err to -ENOMEM before taking the error return path.

    Addresses-Coverity: ("Uninitialized scalar variable")
    Fixes: a1f26739ccdc ("NFSv4.2: improve page handling for GETXATTR")
    Signed-off-by: Colin Ian King <colin.king@canonical.com>
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
* NFSv4.2: improve page handling for GETXATTR
    Author: Frank van der Linden; Age: 2020-12-14; Files: 1; Lines: -11/+36

    XDRBUF_SPARSE_PAGES can cause problems for the RDMA transport, and it's easy enough to allocate enough pages for the request up front, so do that.

    Also, since we've allocated the pages anyway, use the full page aligned length for the receive buffer. This will allow caching of valid replies that are too large for the caller, but that still fit in the allocated pages.

    Signed-off-by: Frank van der Linden <fllinden@amazon.com>
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
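The sizing described in this commit is a round-up to whole pages. A sketch assuming 4 KiB pages; `SKETCH_PAGE_SIZE`, `pages_needed()` and `recv_buffer_len()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumes 4096-byte pages */

/* Number of whole pages needed to hold buflen bytes (rounds up). */
static size_t pages_needed(size_t buflen)
{
    return (buflen + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}

/* Receive length when the full page-aligned allocation is used. */
static size_t recv_buffer_len(size_t buflen)
{
    return pages_needed(buflen) * SKETCH_PAGE_SIZE;
}
```

A caller asking for 5000 bytes gets two pages and a receive length of 8192, so a reply of up to 8192 bytes still fits in the allocation and could be cached even though it is larger than what the caller requested.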
* NFS: Fix rpcrdma_inline_fixup() crash with new LISTXATTRS operation
    Author: Chuck Lever; Age: 2020-12-02; Files: 1; Lines: -8/+13

    By switching to an XFS-backed export, I am able to reproduce the ibcomp worker crash on my client with xfstests generic/013.

    For the failing LISTXATTRS operation, xdr_inline_pages() is called with page_len=12 and buflen=128.

    - When ->send_request() is called, rpcrdma_marshal_req() does not set up a Reply chunk because buflen is smaller than the inline threshold. Thus rpcrdma_convert_iovs() does not get invoked at all and the transport's XDRBUF_SPARSE_PAGES logic is not invoked on the receive buffer.

    - During reply processing, rpcrdma_inline_fixup() tries to copy received data into rq_rcv_buf->pages because page_len is positive. But there are no receive pages because rpcrdma_marshal_req() never allocated them.

    The result is that the ibcomp worker faults and dies. Sometimes that causes a visible crash, and sometimes it results in a transport hang without other symptoms.

    RPC/RDMA's XDRBUF_SPARSE_PAGES support is not entirely correct, and should eventually be fixed or replaced. However, my preference is that upper-layer operations should explicitly allocate their receive buffers (using GFP_KERNEL) when possible, rather than relying on XDRBUF_SPARSE_PAGES.

    Reported-by: Olga kornievskaia <kolga@netapp.com>
    Suggested-by: Olga kornievskaia <kolga@netapp.com>
    Fixes: c10a75145feb ("NFSv4.2: add the extended attribute proc functions.")
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Reviewed-by: Olga kornievskaia <kolga@netapp.com>
    Reviewed-by: Frank van der Linden <fllinden@amazon.com>
    Tested-by: Olga kornievskaia <kolga@netapp.com>
    Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* NFSv4.2: fix client's attribute cache management for copy_file_range
    Author: Olga Kornievskaia; Age: 2020-09-16; Files: 1; Lines: -1/+9

    After the client is done with the COPY operation, it needs to invalidate its pagecache (as it did no reading or writing of the data locally) and it needs to invalidate its attributes just like it would have for a read on the source file and a write on the destination file.

    Once the Linux server started giving out read delegations to read+write opens, the destination file of the copy_file_range started having delegations and not doing syncup on close of the file, leading to xfstest failures for generic/430,431,432,433,565.

    v2: changing cache_validity needs to be protected by the i_lock.

    Reported-by: Murphy Zhou <jencce.kernel@gmail.com>
    Fixes: 2e72448b07dc ("NFS: Add COPY nfs operation")
    Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
* NFS: Add tracepoints for layouterror and layoutstats.
    Author: Trond Myklebust; Age: 2020-08-05; Files: 1; Lines: -2/+8

    Allow tracing of the NFSv4.2 layouterror and layoutstats operations.

    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
* NFSv4.2: add client side xattr caching.
    Author: Frank van der Linden; Age: 2020-07-13; Files: 1; Lines: -0/+12

    Implement client side caching for NFSv4.2 extended attributes. The cache is a per-inode hashtable, with name/value entries. There is one special entry for the listxattr cache.

    NFS inodes have a pointer to a cache structure. The cache structure is allocated on demand, freed when the cache is invalidated.

    Memory shrinkers keep the size in check. Large entries (> PAGE_SIZE) are collected by a separate shrinker, and freed more aggressively than others.

    Signed-off-by: Frank van der Linden <fllinden@amazon.com>
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
* NFSv4.2: add the extended attribute proc functions.
    Author: Frank van der Linden; Age: 2020-07-13; Files: 1; Lines: -0/+236

    Implement the extended attribute procedures for NFSv4.2 extended attribute support (RFC 8276).

    Signed-off-by: Frank van der Linden <fllinden@amazon.com>
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
* NFS: Fix memory leaks
    Author: Wenwen Wang; Age: 2020-02-04; Files: 1; Lines: -2/+2

    In _nfs42_proc_copy(), 'res->commit_res.verf' is allocated through kzalloc() if 'args->sync' is true. In the following code, if 'res->synchronous' is false, handle_async_copy() will be invoked. If an error occurs during the invocation, the following code will not be executed and the error will be returned. However, the allocated 'res->commit_res.verf' is not deallocated, leading to a memory leak. This is also true if the invocation of process_copy_commit() returns an error.

    To fix the above leaks, redirect the execution to the 'out' label if an error is encountered.

    Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu>
    Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* NFSv4.x recover from pre-mature loss of openstateid
    Author: Olga Kornievskaia; Age: 2020-01-15; Files: 1; Lines: -8/+28

    Ever since commit 0e0cb35b417f, it's possible to lose an open stateid while retrying a CLOSE due to ERR_OLD_STATEID. Once that happens, operations that require an openstateid fail with EAGAIN, which is propagated to the application; tests like generic/446 and generic/168 then fail with "Resource temporarily unavailable".

    Instead of returning this error, initiate state recovery when possible to recover the open stateid and then try calling nfs4_select_rw_stateid() again.

    Fixes: 0e0cb35b417f ("NFSv4: Handle NFS4ERR_OLD_STATEID in CLOSE/OPEN_DOWNGRADE")
    Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
    Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* NFSv4: Make _nfs42_proc_copy_notify() static
    Author: YueHaibing; Age: 2019-11-18; Files: 1; Lines: -3/+3

    Fix sparse warning:

    fs/nfs/nfs42proc.c:527:5: warning: symbol '_nfs42_proc_copy_notify' was not declared. Should it be static?

    Reported-by: Hulk Robot <hulkci@huawei.com>
    Signed-off-by: YueHaibing <yuehaibing@huawei.com>
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
* NFS: Fallocate should use the nfs4_fattr_bitmap
    Author: Anna Schumaker; Age: 2019-11-18; Files: 1; Lines: -1/+1

    Changing a sparse file could have an effect not only on the file size, but also on the number of blocks used by the file in the underlying filesystem. The server's cache_consistency_bitmap doesn't update the SPACE_USED attribute, so let's switch to the nfs4_fattr_bitmap to catch this update whenever we do an ALLOCATE or DEALLOCATE.

    This patch fixes xfstests generic/568, which tests that fallocating an unaligned range allocates all blocks touched by that range. Without this patch, `stat` reports 0 bytes used immediately after the fallocate. Adding a `sleep 5` to the test also catches the update, but it's better to do so when we know something has changed.

    Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
* NFS based on file size issue sync copy or fallback to generic copy offload
    Author: Olga Kornievskaia; Age: 2019-10-09; Files: 1; Lines: -2/+2

    For small file sizes, it makes sense to issue a synchronous copy (and save an RPC callback operation). Also, for the inter copy offload, the copy length must be larger than the cost of doing a mount between the destination and source server (14 RPCs are sent during a 4.x mount).

    Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
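The size-based choice described in this commit can be sketched as a small decision function. This is a hedged illustration only: the thresholds are passed in as parameters because the commit message does not state the exact cut-offs, and `enum copy_plan`, `plan_copy()`, `small_threshold` and `inter_min` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum copy_plan {
    COPY_SYNC_OFFLOAD,      /* one synchronous COPY, no callback */
    COPY_ASYNC_OFFLOAD,     /* asynchronous COPY completed by callback */
    COPY_GENERIC_FALLBACK   /* read/write through the client */
};

/*
 * small_threshold: at or below this, a same-server copy is done
 * synchronously. inter_min: at or below this, a server-to-server copy
 * is not worth the extra mount traffic. Both are caller-supplied
 * assumptions.
 */
static enum copy_plan plan_copy(uint64_t count, bool same_server,
                                uint64_t small_threshold, uint64_t inter_min)
{
    if (!same_server)
        return count <= inter_min ? COPY_GENERIC_FALLBACK
                                  : COPY_ASYNC_OFFLOAD;
    return count <= small_threshold ? COPY_SYNC_OFFLOAD
                                    : COPY_ASYNC_OFFLOAD;
}
```

Small same-server copies avoid the callback round trip, and small cross-server copies avoid paying for a mount that costs more than the copy itself.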
* NFS: handle source server reboot
    Author: Olga Kornievskaia; Age: 2019-10-09; Files: 1; Lines: -21/+47

    When the source server reboots after a server-to-server copy was issued, we need to retry the copy from COPY_NOTIFY. We need to detect that the source server rebooted and there is a copy waiting on a destination server and wake it up.

    Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
* NFS: also send OFFLOAD_CANCEL to source server
    Author: Olga Kornievskaia; Age: 2019-10-09; Files: 1; Lines: -4/+8

    If a copy is cancelled, also send OFFLOAD_CANCEL to the source server.

    Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
* NFS: COPY handle ERR_OFFLOAD_DENIED
    Author: Olga Kornievskaia; Age: 2019-10-09; Files: 1; Lines: -1/+2

    If the server sends an ERR_OFFLOAD_DENIED error, the client must fall back on doing the copy the normal way. Return ENOTSUPP to the VFS and fall back to a regular copy.

    Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
* NFS: for "inter" copy treat ESTALE as ENOTSUPP
    Author: Olga Kornievskaia; Age: 2019-10-09; Files: 1; Lines: -0/+5

    If the client sends an "inter" copy to the destination server but it only supports "intra" copy, it can return ESTALE (since it doesn't know anything about the file handle from the other server and does not recognize the special case of "inter" copy). Translate this error as ENOTSUPP and also send OFFLOAD_CANCEL to the source server so that it can clean up state.

    Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
* NFS: add ca_source_server<> to COPY
    Author: Olga Kornievskaia; Age: 2019-10-09; Files: 1; Lines: -9/+17

    Support only one source server address: the same address that the client and source server use.

    Signed-off-by: Andy Adamson <andros@netapp.com>
    Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
* NFS: add COPY_NOTIFY operation
    Author: Olga Kornievskaia; Age: 2019-10-09; Files: 1; Lines: -0/+91

    Try using the delegation stateid, then the open stateid.

    Only NL4_NETADDR is supported; there is no support for NL4_NAME and NL4_URL. Allow only one source server address to be returned for now.

    To distinguish between same server copy offload ("intra") and a copy between different servers ("inter"), do a check of server owner identity and also make sure the server is capable of doing a copy offload.

    Signed-off-by: Andy Adamson <andros@netapp.com>
    Signed-off-by: Olga Kornievskaia <kolga@netapp.com>