path: root/fs
Commit message (Author, Age, Files, Lines)
* Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client (Linus Torvalds, 2016-03-26, 11 files, -218/+419)
|\
| |   Pull Ceph updates from Sage Weil:
| |    "There is quite a bit here, including some overdue refactoring and
| |     cleanup on the mon_client and osd_client code from Ilya, scattered
| |     writeback support for CephFS and a pile of bug fixes from Zheng, and
| |     a few random cleanups and fixes from others"
| |
| |   [ I already decided not to pull this because of it having been rebased
| |     recently, but ended up changing my mind after all. Next time I'll
| |     really hold people to it. Oh well. - Linus ]
| |
| |   * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (34 commits)
| |     libceph: use KMEM_CACHE macro
| |     ceph: use kmem_cache_zalloc
| |     rbd: use KMEM_CACHE macro
| |     ceph: use lookup request to revalidate dentry
| |     ceph: kill ceph_get_dentry_parent_inode()
| |     ceph: fix security xattr deadlock
| |     ceph: don't request vxattrs from MDS
| |     ceph: fix mounting same fs multiple times
| |     ceph: remove unnecessary NULL check
| |     ceph: avoid updating directory inode's i_size accidentally
| |     ceph: fix race during filling readdir cache
| |     libceph: use sizeof_footer() more
| |     ceph: kill ceph_empty_snapc
| |     ceph: fix a wrong comparison
| |     ceph: replace CURRENT_TIME by current_fs_time()
| |     ceph: scattered page writeback
| |     libceph: add helper that duplicates last extent operation
| |     libceph: enable large, variable-sized OSD requests
| |     libceph: osdc->req_mempool should be backed by a slab pool
| |     libceph: make r_request msg_size calculation clearer
| |     ...
| * ceph: use kmem_cache_zalloc (Geliang Tang, 2016-03-25, 2 files, -2/+2)
| |   Use kmem_cache_zalloc() instead of kmem_cache_alloc() with the
| |   __GFP_ZERO flag.
| |
| |   Signed-off-by: Geliang Tang <geliangtang@163.com>
| |   Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
| * ceph: use lookup request to revalidate dentry (Yan, Zheng, 2016-03-25, 2 files, -0/+35)
| |   If a dentry has no lease, ceph_d_revalidate() previously returned 0.
| |   This causes the VFS to invalidate the dentry and create a new dentry
| |   for later lookup. Invalidating a dentry also detaches any mount points
| |   underneath it, so a mount point inside cephfs can disappear
| |   mysteriously (even if the mount point is not modified by other hosts).
| |
| |   The fix is to use a lookup request to revalidate a dentry without a
| |   lease. This partly solves the disappearing mount point issue (as long
| |   as the mount point is not modified by other hosts).
| |
| |   Signed-off-by: Yan, Zheng <zyan@redhat.com>
| * ceph: kill ceph_get_dentry_parent_inode() (Yan, Zheng, 2016-03-25, 2 files, -20/+5)
| |   Use the VFS helper dget_parent() instead.
| |
| |   Signed-off-by: Yan, Zheng <zyan@redhat.com>
| * ceph: fix security xattr deadlock (Yan, Zheng, 2016-03-25, 7 files, -10/+123)
| |   When security is enabled, the security module can call the
| |   filesystem's getxattr/setxattr callbacks during d_instantiate(). For
| |   cephfs, d_instantiate() is usually called by the MDS dispatch thread
| |   while handling an MDS reply. If the MDS reply does not include xattrs
| |   and the corresponding caps, getxattr/setxattr needs to send a new
| |   request to the MDS and wait for the reply. This puts the MDS dispatch
| |   thread to sleep, so nobody handles later MDS replies.
| |
| |   The fix is to make sure the lookup/atomic_open reply includes xattrs
| |   and the corresponding caps, so getxattr can be served from cached
| |   xattrs. This requires some modification to both the MDS and the
| |   request message. (The client tells the MDS what caps it wants; the MDS
| |   encodes the proper caps in the reply.)
| |
| |   The Smack security module may call setxattr during d_instantiate().
| |   Unlike getxattr, we can't force the MDS to issue CEPH_CAP_XATTR_EXCL
| |   to us, so just make setxattr return an error when called by the MDS
| |   dispatch thread.
| |
| |   Signed-off-by: Yan, Zheng <zyan@redhat.com>
| * ceph: don't request vxattrs from MDS (Yan, Zheng, 2016-03-25, 1 file, -2/+4)
| |   It's useless because the MDS reply does not carry any vxattr.
| |
| |   Signed-off-by: Yan, Zheng <zyan@redhat.com>
| * ceph: fix mounting same fs multiple times (Yan, Zheng, 2016-03-25, 1 file, -18/+15)
| |   Now __ceph_open_session() only accepts a closed client. An opened
| |   client will trigger BUG_ON().
| |
| |   Signed-off-by: Yan, Zheng <zyan@redhat.com>
| * ceph: remove unnecessary NULL check (Yan, Zheng, 2016-03-25, 1 file, -2/+2)
| |   If page->mapping is NULL, the releasepage() callback does not get
| |   called. Remove the unnecessary NULL check to keep static code analysis
| |   tools happy.
| |
| |   Signed-off-by: Yan, Zheng <zyan@redhat.com>
| * ceph: avoid updating directory inode's i_size accidentally (Yan, Zheng, 2016-03-25, 1 file, -0/+4)
| |   Directory inode's i_size is used by readdir cache.
| |
| |   Signed-off-by: Yan, Zheng <zyan@redhat.com>
| * ceph: fix race during filling readdir cache (Yan, Zheng, 2016-03-25, 1 file, -2/+7)
| |   The readdir cache uses the page cache to save dentry pointers. When
| |   adding dentry pointers to the middle of a page, we need to make sure
| |   the page already exists. Otherwise the beginning part of the page will
| |   contain invalid pointers.
| |
| |   Signed-off-by: Yan, Zheng <zyan@redhat.com>
| * ceph: kill ceph_empty_snapc (Ilya Dryomov, 2016-03-25, 4 files, -34/+6)
| |   ceph_empty_snapc->num_snaps == 0 at all times. Passing such a snapc to
| |   ceph_osdc_alloc_request() (possibly through ceph_osdc_new_request())
| |   is equivalent to passing NULL, as ceph_osdc_alloc_request() uses it
| |   only for sizing the request message.
| |
| |   Further, in all four cases the subsequent ceph_osdc_build_request() is
| |   passed NULL for snapc, meaning that 0 is encoded for seq and num_snaps
| |   and making ceph_empty_snapc entirely useless. The two cases where it
| |   actually mattered were removed in commits 860560904962 ("ceph: avoid
| |   sending unnessesary FLUSHSNAP message") and 23078637e054 ("ceph: fix
| |   queuing inode to mdsdir's snaprealm").
| |
| |   Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
| |   Reviewed-by: Yan, Zheng <zyan@redhat.com>
| * ceph: fix a wrong comparison (Anton Protopopov, 2016-03-25, 1 file, -1/+1)
| |   A negative value rc was compared to the positive value ENOENT in the
| |   finish_read() function.
| |
| |   Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
| |   Signed-off-by: Yan, Zheng <zyan@redhat.com>
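
The class of bug fixed by this commit is easy to reproduce outside the kernel. The following is a minimal userspace sketch, not the ceph code: it assumes the kernel convention that functions return 0 on success and a negative errno value on failure, and the names lookup_object(), is_missing_buggy() and is_missing_fixed() are hypothetical.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical helper following the kernel convention: 0 on success,
 * a negative errno value on failure. */
static int lookup_object(bool exists)
{
    return exists ? 0 : -ENOENT;
}

/* Buggy check: rc is negative on failure while ENOENT is positive,
 * so this can never match an error return. */
static bool is_missing_buggy(int rc)
{
    return rc == ENOENT;
}

/* Fixed check: compare against the negated errno constant. */
static bool is_missing_fixed(int rc)
{
    return rc == -ENOENT;
}
```

With these helpers, is_missing_buggy(lookup_object(false)) is false even though the object is missing, while is_missing_fixed() reports it correctly.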
| * ceph: replace CURRENT_TIME by current_fs_time() (Deepa Dinamani, 2016-03-25, 4 files, -6/+6)
| |   CURRENT_TIME macro is not appropriate for filesystems as it doesn't
| |   use the right granularity for filesystem timestamps. Use
| |   current_fs_time() instead.
| |
| |   Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
| |   Signed-off-by: Yan, Zheng <zyan@redhat.com>
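
To illustrate what "the right granularity" means here, the sketch below rounds a timestamp down to a given granularity in nanoseconds. It is a userspace illustration only and does not reproduce the kernel helper; the function name trunc_to_granularity() and the gran_ns parameter are assumptions made for this example.

```c
#include <time.h>

/* Round a timestamp down to a multiple of gran_ns nanoseconds.
 * gran_ns is assumed to lie in the range 1..1000000000. */
static struct timespec trunc_to_granularity(struct timespec t, long gran_ns)
{
    if (gran_ns >= 1000000000L)
        t.tv_nsec = 0;                      /* one-second granularity */
    else if (gran_ns > 1)
        t.tv_nsec -= t.tv_nsec % gran_ns;   /* drop the sub-granule part */
    return t;
}
```

A filesystem that stores timestamps in microseconds would pass gran_ns = 1000, so a value of 123456789 ns becomes 123456000 ns and matches what is later read back from storage.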
| * ceph: scattered page writeback (Yan, Zheng, 2016-03-25, 1 file, -109/+196)
| |   This patch makes ceph_writepages_start() try to use a single OSD
| |   request to write all dirty pages within a stripe unit. When a
| |   nonconsecutive dirty page is found, ceph_writepages_start() tries to
| |   add a new write operation to the existing OSD request. If it succeeds,
| |   it uses the new operation to write back the dirty page.
| |
| |   Signed-off-by: Yan, Zheng <zyan@redhat.com>
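
The grouping idea described in this commit can be sketched in plain C. This is an illustration under simplifying assumptions, not the kernel implementation: page indices are assumed to arrive in ascending order, stripe-unit boundaries and page locking are ignored, and struct extent and build_extents() are hypothetical names.

```c
#include <stddef.h>

struct extent {
    unsigned long start;   /* index of the first page in the run */
    unsigned long count;   /* number of consecutive pages */
};

/* Groups ascending page indices into runs of consecutive pages, one
 * write operation per run. Stops when a new run would be needed but
 * max_ops are already in use. Returns the number of pages consumed and
 * stores the number of operations in *nr_ops. */
static size_t build_extents(const unsigned long *pages, size_t nr_pages,
                            struct extent *ops, size_t max_ops,
                            size_t *nr_ops)
{
    size_t used = 0, i;

    for (i = 0; i < nr_pages; i++) {
        if (used && ops[used - 1].start + ops[used - 1].count == pages[i]) {
            ops[used - 1].count++;      /* page extends the current run */
            continue;
        }
        if (used == max_ops)
            break;                      /* no room for another operation */
        ops[used].start = pages[i];     /* nonconsecutive page: new op */
        ops[used].count = 1;
        used++;
    }
    *nr_ops = used;
    return i;
}
```

For dirty pages 3, 4, 5, 9, 10, 20 and room for two operations, the sketch produces the runs (3, 3 pages) and (9, 2 pages) and leaves page 20 for a later request.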
| * ceph: remove useless BUG_ON (Yan, Zheng, 2016-03-25, 1 file, -2/+0)
| |   ceph_osdc_start_request() never returns -EOLDSNAP.
| |
| |   Signed-off-by: Yan, Zheng <zyan@redhat.com>
| * ceph: don't enable rbytes mount option by default (Yan, Zheng, 2016-03-25, 2 files, -4/+3)
| |   When the rbytes mount option is enabled, the directory size is the
| |   recursive size. The recursive size is not updated instantly, which can
| |   cause the directory size to change between successive stat(1) calls.
| |
| |   Signed-off-by: Yan, Zheng <zyan@redhat.com>
| * ceph: encode ctime in cap message (Yan, Zheng, 2016-03-25, 1 file, -4/+7)
| |   Signed-off-by: Yan, Zheng <zyan@redhat.com>
| * libceph: revamp subs code, switch to SUBSCRIBE2 protocol (Ilya Dryomov, 2016-03-25, 2 files, -2/+3)
| |   It is currently hard-coded in the mon_client that mdsmap and monmap
| |   subs are continuous, while osdmap sub is always "onetime". To better
| |   handle full clusters/pools in the osd_client, we need to be able to
| |   issue continuous osdmap subs. Revamp subs code to allow us to specify
| |   for each sub whether it should be continuous or not.
| |
| |   Although not strictly required for the above, switch to SUBSCRIBE2
| |   protocol while at it, eliminating the ambiguity between a request for
| |   "every map since X" and a request for "just the latest" when we don't
| |   have a map yet (i.e. have epoch 0). SUBSCRIBE2 feature bit is now
| |   required - it's been supported since pre-argonaut (2010).
| |
| |   Move "got mdsmap" call to the end of ceph_mdsc_handle_map() - calling
| |   it before we validate the epoch and successfully install the new map
| |   can mess up mon_client sub state.
| |
| |   Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
* | Merge tag 'ofs-pull-tag-1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux (Linus Torvalds, 2016-03-26, 30 files, -0/+10742)
|\ \
| | |   Pull orangefs filesystem from Mike Marshall.
| | |
| | |   This finally merges the long-pending orangefs filesystem, which has
| | |   been much cleaned up with input from Al Viro over the last six months.
| | |
| | |   From the documentation file:
| | |    "OrangeFS is an LGPL userspace scale-out parallel storage system. It
| | |     is ideal for large storage problems faced by HPC, BigData, Streaming
| | |     Video, Genomics, Bioinformatics.
| | |
| | |     Orangefs, originally called PVFS, was first developed in 1993 by
| | |     Walt Ligon and Eric Blumer as a parallel file system for Parallel
| | |     Virtual Machine (PVM) as part of a NASA grant to study the I/O
| | |     patterns of parallel programs.
| | |
| | |     Orangefs features include:
| | |      - Distributes file data among multiple file servers
| | |      - Supports simultaneous access by multiple clients
| | |      - Stores file data and metadata on servers using local file system
| | |        and access methods
| | |      - Userspace implementation is easy to install and maintain
| | |      - Direct MPI support
| | |      - Stateless"
| | |
| | |   see Documentation/filesystems/orangefs.txt for more in-depth details.
| | |
| | |   * tag 'ofs-pull-tag-1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux: (174 commits)
| | |     orangefs: fix orangefs_superblock locking
| | |     orangefs: fix do_readv_writev() handling of error halfway through
| | |     orangefs: have ->kill_sb() evict the VFS side of things first
| | |     orangefs: sanitize ->llseek()
| | |     orangefs-bufmap.h: trim unused junk
| | |     orangefs: saner calling conventions for getting a slot
| | |     orangefs_copy_{to,from}_bufmap(): don't pass bufmap pointer
| | |     orangefs: get rid of readdir_handle_s
| | |     ornagefs: ensure that truncate has an up to date inode size
| | |     orangefs: move code which sets i_link to orangefs_inode_getattr
| | |     orangefs: remove needless wrapper around GFP_KERNEL
| | |     orangefs: remove wrapper around mutex_lock(&inode->i_mutex)
| | |     orangefs: refactor inode type or link_target change detection
| | |     orangefs: use new getattr for revalidate and remove old getattr
| | |     orangefs: use new getattr in inode getattr and permission
| | |     orangefs: use new orangefs_inode_getattr to get size in write and llseek
| | |     orangefs: use new orangefs_inode_getattr to create new inodes
| | |     orangefs: rename orangefs_inode_getattr to orangefs_inode_old_getattr
| | |     orangefs: remove inode->i_lock wrapper
| | |     orangefs: put register_chrdev immediately before register_filesystem
| | |     ...
| * | orangefs: fix orangefs_superblock locking (Al Viro, 2016-03-26, 3 files, -58/+47)
| | |   * switch orangefs_remount() to taking ORANGEFS_SB(sb) instead of sb
| | |   * remove from the list _before_ orangefs_unmount() - request_mutex
| | |     in the latter will make sure that nothing observed in the loop in
| | |     ORANGEFS_DEV_REMOUNT_ALL handling will get freed until the end
| | |     of loop
| | |   * on removal, keep the forward pointer and zero the back one. That
| | |     way we can drop and regain the spinlock in the loop body (again,
| | |     ORANGEFS_DEV_REMOUNT_ALL one) and still be able to get to the
| | |     rest of the list.
| | |
| | |   Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| | |   Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | orangefs: fix do_readv_writev() handling of error halfway through (Al Viro, 2016-03-26, 1 file, -1/+1)
| | |   Error should only be returned if nothing had been read/written.
| | |   Otherwise we need to report a short read/write instead.
| | |
| | |   Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| | |   Signed-off-by: Mike Marshall <hubcap@omnibond.com>
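
The rule stated in this commit (report an error only when nothing was transferred, otherwise report a short count) can be sketched in userspace C. This is not the orangefs code: the chunk_fn callback type, transfer_all() and the fail_after() example callback are hypothetical names used only for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical per-chunk transfer callback: returns the number of bytes
 * moved (> 0), 0 at end of data, or a negative errno value on failure. */
typedef ssize_t (*chunk_fn)(void *ctx, size_t offset, size_t len);

static ssize_t transfer_all(chunk_fn fn, void *ctx, size_t total, size_t chunk)
{
    size_t done = 0;

    while (done < total) {
        size_t len = total - done < chunk ? total - done : chunk;
        ssize_t ret = fn(ctx, done, len);

        if (ret < 0)
            /* Report the error only if nothing was transferred;
             * otherwise return the short count. */
            return done ? (ssize_t)done : ret;
        if (ret == 0)
            break;
        done += (size_t)ret;
    }
    return (ssize_t)done;
}

/* Example callback: succeeds until *ctx bytes have been moved, then
 * fails with -EIO. */
static ssize_t fail_after(void *ctx, size_t offset, size_t len)
{
    size_t limit = *(const size_t *)ctx;

    if (offset >= limit)
        return -EIO;
    return (ssize_t)(len < limit - offset ? len : limit - offset);
}
```

With a limit of 8 bytes and a 16-byte request in 4-byte chunks, transfer_all() returns 8 (a short transfer); with a limit of 0 it returns -EIO because no data was moved before the failure.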
| * | orangefs: have ->kill_sb() evict the VFS side of things first (Al Viro, 2016-03-26, 1 file, -3/+3)
| | |   Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| | |   Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | orangefs: sanitize ->llseek() (Al Viro, 2016-03-26, 2 files, -10/+3)
| | |   a) open files can't have NULL inodes
| | |   b) it's SEEK_END, not ORANGEFS_SEEK_END; no need to get cute.
| | |   c) make_bad_inode() on lseek()?
| | |
| | |   Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| | |   Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | orangefs-bufmap.h: trim unused junk (Al Viro, 2016-03-26, 1 file, -9/+0)
| | |   Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| | |   Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | orangefs: saner calling conventions for getting a slot (Al Viro, 2016-03-26, 4 files, -28/+16)
| | |   just have it return the slot number or -E... - the caller checks
| | |   the sign anyway
| | |
| | |   Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| | |   Signed-off-by: Mike Marshall <hubcap@omnibond.com>
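
The calling convention described in this commit (one return value that is either a slot number or a negative errno) can be sketched as follows. The pool layout, NSLOTS, slot_get() and slot_put() are hypothetical and much simpler than the orangefs bufmap code, which also has to wait for a free slot.

```c
#include <errno.h>

#define NSLOTS 4    /* hypothetical pool size */

struct slot_pool {
    unsigned char used[NSLOTS];
};

/* Returns the claimed slot index (>= 0), or -EBUSY if none is free.
 * The caller only has to test the sign of the result. */
static int slot_get(struct slot_pool *p)
{
    for (int i = 0; i < NSLOTS; i++) {
        if (!p->used[i]) {
            p->used[i] = 1;
            return i;
        }
    }
    return -EBUSY;
}

/* Releases a slot previously returned by slot_get(). */
static void slot_put(struct slot_pool *p, int slot)
{
    if (slot >= 0 && slot < NSLOTS)
        p->used[slot] = 0;
}
```

Compared with returning a status and writing the slot through an output pointer, this keeps the call site to a single check such as "if (slot < 0) return slot;".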
| * | orangefs_copy_{to,from}_bufmap(): don't pass bufmap pointer (Al Viro, 2016-03-26, 3 files, -23/+14)
| | |   it's always __orangefs_bufmap
| | |
| | |   Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| | |   Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | orangefs: get rid of readdir_handle_s (Al Viro, 2016-03-26, 1 file, -63/+30)
| | |   no point, really - we couldn't keep those across the calls of
| | |   getdents(); it would be too easy to DoS, having all slots exhausted.
| | |
| | |   Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| | |   Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | ornagefs: ensure that truncate has an up to date inode size (Martin Brandenburg, 2016-03-23, 1 file, -1/+12)
| | |   Signed-off-by: Martin Brandenburg <martin@omnibond.com>
| | |   Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | orangefs: move code which sets i_link to orangefs_inode_getattr (Martin Brandenburg, 2016-03-23, 2 files, -2/+1)
| | |   Everything else setting inode->i_ values is in there.
| | |
| | |   Signed-off-by: Martin Brandenburg <martin@omnibond.com>
| | |   Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | orangefs: remove needless wrapper around GFP_KERNEL (Martin Brandenburg, 2016-03-23, 2 files, -5/+1)
| | |   Signed-off-by: Martin Brandenburg <martin@omnibond.com>
| | |   Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | orangefs: remove wrapper around mutex_lock(&inode->i_mutex) (Martin Brandenburg, 2016-03-23, 1 file, -6/+2)
| | |   Signed-off-by: Martin Brandenburg <martin@omnibond.com>
| | |   Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | orangefs: refactor inode type or link_target change detection (Martin Brandenburg, 2016-03-23, 1 file, -41/+36)
| | |   Signed-off-by: Martin Brandenburg <martin@omnibond.com>
| | |   Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | orangefs: use new getattr for revalidate and remove old getattr (Martin Brandenburg, 2016-03-23, 3 files, -325/+49)
| | |   Signed-off-by: Martin Brandenburg <martin@omnibond.com>
| | |   Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | orangefs: use new getattr in inode getattr and permission (Martin Brandenburg, 2016-03-23, 1 file, -12/+2)
| | |   Signed-off-by: Martin Brandenburg <martin@omnibond.com>
| | |   Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | orangefs: use new orangefs_inode_getattr to get size in write and llseek (Martin Brandenburg, 2016-03-23, 1 file, -6/+8)
| | |   Signed-off-by: Martin Brandenburg <martin@omnibond.com>
| | |   Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | orangefs: use new orangefs_inode_getattr to create new inodes (Martin Brandenburg, 2016-03-23, 1 file, -4/+2)
| | |   Signed-off-by: Martin Brandenburg <martin@omnibond.com>
| | |   Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | orangefs: rename orangefs_inode_getattr to orangefs_inode_old_getattr (Martin Brandenburg, 2016-03-23, 5 files, -10/+133)
| | |   This is motivated by orangefs_inode_old_getattr's habit of writing
| | |   over live inodes.
| | |
| | |   Signed-off-by: Martin Brandenburg <martin@omnibond.com>
| | |   Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | orangefs: remove inode->i_lock wrapper (Martin Brandenburg, 2016-03-23, 2 files, -7/+4)
| | |   Signed-off-by: Martin Brandenburg <martin@omnibond.com>
| | |   Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | orangefs: put register_chrdev immediately before register_filesystem (Martin Brandenburg, 2016-03-17, 1 file, -13/+13)
| | |   Signed-off-by: Martin Brandenburg <martin@omnibond.com>
| | |   Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | orangefs: remove paranoia in orangefs_set_inode (Martin Brandenburg, 2016-03-17, 1 file, -10/+2)
| | |   Signed-off-by: Martin Brandenburg <martin@omnibond.com>
| | |   Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | orangefs: sanitize listxattr and return EIO on impossible values (Martin Brandenburg, 2016-03-17, 1 file, -0/+10)
| | |   Signed-off-by: Martin Brandenburg <martin@omnibond.com>
| | |   Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | orangefs: remove unused reference to xattr key length (Martin Brandenburg, 2016-03-17, 1 file, -5/+0)
| | |   Signed-off-by: Martin Brandenburg <martin@omnibond.com>
| | |   Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | Orangefs: adjust unwind on module init failure. (Mike Marshall, 2016-03-17, 1 file, -4/+3)
| | |   Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | Orangefs: fix sloppy cleanups of debugfs and sysfs init failures. (Mike Marshall, 2016-03-14, 3 files, -62/+76)
| | |   Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | Orangefs: follow_link -> get_link change (Mike Marshall, 2016-03-14, 2 files, -19/+4)
| | |   Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | Orangefs: Extra sanity insurance on buffer before using string functions on it. (Mike Marshall, 2016-03-14, 1 file, -0/+13)
| | |   Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | Orangefs: merge to v4.5 (Mike Marshall, 2016-03-14, 547 files, -7969/+15929)
| |\ \
| | | |   Merge tag 'v4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux into current
| | | |
| | | |   Linux 4.5
| * | | orangefs: make fs_mount_pending static (Martin Brandenburg, 2016-03-09, 2 files, -39/+38)
| | | |   Signed-off-by: Martin Brandenburg <martin@omnibond.com>
| | | |   Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | | orangefs: Avoid symlink upcall if target is too long. (Martin Brandenburg, 2016-03-09, 1 file, -0/+3)
| | | |   Previously the client-core detected this condition by sheer luck!
| | | |
| | | |   Since we used strncpy, no NUL byte would be included in the name. The
| | | |   client-core would call strlen, which would read past the end of its
| | | |   buffer, but return a number large enough that the client-core would
| | | |   return ENAMETOOLONG.
| | | |
| | | |   Signed-off-by: Martin Brandenburg <martin@omnibond.com>
| | | |   Signed-off-by: Mike Marshall <hubcap@omnibond.com>
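
The strncpy() behaviour this commit relies on is standard C: when the source is at least as long as the size argument, the destination is left without a terminating NUL. A minimal userspace sketch of checking the length before copying follows; NAME_BUF_LEN and copy_target() are hypothetical and do not come from the orangefs sources.

```c
#include <errno.h>
#include <string.h>

#define NAME_BUF_LEN 8   /* hypothetical buffer size, including the NUL */

/* Copies target into buf. Returns 0 on success, or -ENAMETOOLONG when
 * target plus its terminating NUL would not fit, so an unterminated
 * buffer is never handed to the receiver. */
static int copy_target(char buf[NAME_BUF_LEN], const char *target)
{
    if (strlen(target) >= NAME_BUF_LEN)
        return -ENAMETOOLONG;
    /* The length was checked above, so strncpy() copies the whole
     * string and pads the rest of the buffer with NUL bytes. */
    strncpy(buf, target, NAME_BUF_LEN);
    return 0;
}
```

Rejecting the oversized target on the sending side means the receiver no longer depends on an out-of-bounds strlen() happening to return a large number.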
| * | | Orangefs: improve the POSIXness of interrupted writes... (Mike Marshall, 2016-03-09, 1 file, -9/+45)
| | | |   Don't return EINTR on interrupted writes if some data has already
| | | |   been written.
| | | |
| | | |   Signed-off-by: Mike Marshall <hubcap@omnibond.com>