| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Update some fs code to make use of new helper functions introduced
in the previous patch. Should be no significant change in behaviour
(except CIFS now calls send_sig under i_lock, via inode_newsize_ok).
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: linux-nfs@vger.kernel.org
Cc: Trond.Myklebust@netapp.com
Cc: linux-cifs-client@lists.samba.org
Cc: sfrench@samba.org
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
|
|
|
|
|
|
| |
Introduce new truncate helpers truncate_pagecache and inode_newsize_ok.
vmtruncate is also consolidated from mm/memory.c and mm/nommu.c and
into mm/truncate.c.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
sys_mount() reads/copies a whole page for its "type" parameter. When
do_mount_root() passes a kernel address that points to an object which is
smaller than a whole page, copy_mount_options() will happily go past this
memory object, possibly dereferencing "wild" pointers that could be in any
state (hence the kmemcheck warning, which shows that parts of the next
page are not even allocated).
(The likelihood of something going wrong here is pretty low -- first of
all this only applies to kernel calls to sys_mount(), which are mostly
found in the boot code. Secondly, I guess if the page was not mapped,
exact_copy_from_user() _would_ in fact handle it correctly because of its
access_ok(), etc. checks.)
But it is much nicer to avoid the dubious reads altogether, by stopping as
soon as we find a NUL byte. Is there a good reason why we can't do
something like this, using the already existing strndup_from_user()?
[akpm@linux-foundation.org: make copy_mount_string() static]
[AV: fix compat mount breakage, which involves undoing akpm's change above]
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: al <al@dizzy.pdmi.ras.ru>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Most call sites of unload_nls() do:
if (nls)
unload_nls(nls);
Check the pointer inside unload_nls() like we do in kfree() and
simplify the call sites.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Steve French <sfrench@us.ibm.com>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Cc: Petr Vandrovec <vandrove@vc.cvut.cz>
Cc: Anton Altaparmakov <aia21@cantab.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently we held s_umount while a filesystem is frozen, despite that we
might return to userspace and unlock it from a different process. Instead
grab an active reference to keep the file system busy and add an explicit
check for frozen filesystems in remount and reject the remount instead
of blocking on s_umount.
Add a new get_active_super helper to super.c for use by freeze_bdev that
grabs an active reference to a superblock from a given block device.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Now that we have the freeze count there is not much reason for bd_mount_sem
anymore. The actual freeze/thaw operations are serialized using the
bd_fsfreeze_mutex, and the only other place we take bd_mount_sem is
get_sb_bdev which tries to prevent mounting a filesystem while the block
device is frozen. Instead of add a check for bd_fsfreeze_count and
return -EBUSY if a filesystem is frozen. While that is a change in user
visible behaviour a failing mount is much better for this case rather
than having the mount process stuck uninterruptible for a long time.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
the two places inside exofs that where taking the BKL were:
exofs_put_super() - .put_super
and
exofs_sync_fs() - which is .sync_fs and is also called from
.write_super.
Now exofs_sync_fs() is protected from itself by also taking
the sb_lock.
exofs_put_super() directly calls exofs_sync_fs() so there is no
danger between these two either.
In anyway there is absolutely nothing dangerous been done
inside exofs_sync_fs().
Unless there is some subtle race with the actual lifetime of
the super_block in regard to .put_super and some other parts
of the VFS. Which is highly unlikely.
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
|
|
|
|
|
| |
romfs_fill_super() assumes that romfs_iget() returns NULL when
it fails. romfs_iget() actually returns ERR_PTR(-ve) in that
case...
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add two helpers that allow access to the seq_file's own buffer, but
hide the internal details of seq_files.
This allows easier implementation of special purpose filling
functions. It also cleans up some existing functions which duplicated
the seq_file logic.
Make these inline functions in seq_file.h, as suggested by Al.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Acked-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As Johannes Weiner pointed out, one of the range checks in do_sendfile
is redundant and is already checked in rw_verify_area.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Robert Love <rlove@google.com>
Cc: Mandeep Singh Baines <msb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
sb->s_maxbytes is supposed to indicate the maximum size of a file that can
exist on the filesystem. It's declared as an unsigned long long.
Even if a filesystem has no inherent limit that prevents it from using
every bit in that unsigned long long, it's still problematic to set it to
anything larger than MAX_LFS_FILESIZE. There are places in the kernel
that cast s_maxbytes to a signed value. If it's set too large then this
cast makes it a negative number and generally breaks the comparison.
Change s_maxbytes to be loff_t instead. That should help eliminate the
temptation to set it too large by making it a signed value.
Also, add a warning for couple of releases to help catch filesystems that
set s_maxbytes too large. Eventually we can either convert this to a
BUG() or just remove it and in the hope that no one will get it wrong now
that it's a signed value.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Robert Love <rlove@google.com>
Cc: Mandeep Singh Baines <msb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If fiemap_check_ranges is passed a large enough value, then it's
possible that the value would be cast to a signed value for comparison
against s_maxbytes when we change it to loff_t. Make sure that doesn't
happen by explicitly casting s_maxbytes to an unsigned value for the
purposes of comparison.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Robert Love <rlove@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mandeep Singh Baines <msb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently all simple_attr.set handlers return 0 on success and negative
codes on error. Fix simple_attr_write() to return these error codes.
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
seq_path_root() is returning a return value of successful __d_path()
instead of returning a negative value when mangle_path() failed.
This is not a bug so far because nobody is using return value of
seq_path_root().
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Do a similar optimization as earlier for touch_atime. Getting the lock in
mnt_get_write is relatively costly, so try all avenues to avoid it first.
This patch is careful to still only update inode fields inside the lock
region.
This didn't show up in benchmarks, but it's easy enough to do.
[akpm@linux-foundation.org: fix typo in comment]
[hugh.dickins@tiscali.co.uk: fix inverted test of mnt_want_write_file()]
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Valerie Aurora <vaurora@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some benchmark testing shows touch_atime to be high up in profile logs for
IO intensive workloads. Most likely that's due to the lock in
mnt_want_write(). Unfortunately touch_atime first takes the lock, and
then does all the other tests that could avoid atime updates (like noatime
or relatime).
Do it the other way round -- first try to avoid the update and only then
if that didn't succeed take the lock. That works because none of the
atime avoidance tests rely on locking.
This also eliminates a goto.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Christoph Hellwig <hch@infradead.org>
Reviewed-by: Valerie Aurora <vaurora@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Hugetlbfs needs to do special things instead of truncate_inode_pages().
Currently, it copied generic_forget_inode() except for
truncate_inode_pages() call which is asking for trouble (the code there
isn't trivial). So create a separate function generic_detach_inode()
which does all the list magic done in generic_forget_inode() and call
it from hugetlbfs_forget_inode().
Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add device-id and inode number for better debugging. This was suggested
by Andreas in one of the threads
http://article.gmane.org/gmane.comp.file-systems.ext4/12062 .
"If anyone has a chance, fixing this error message to be not-useless would
be good... Including the device name and the inode number would help
track down the source of the problem."
Signed-off-by: Manish Katiyar <mkatiyar@gmail.com>
Cc: Andreas Dilger <adilger@sun.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Impact: have simple_read_from_buffer conform to standards
It was brought to my attention by Andrew Morton, Theodore Tso, and H.
Peter Anvin that a read from userspace should only return -EFAULT if
nothing was actually read.
Looking at the simple_read_from_buffer I noticed that this function does
not conform to that rule. This patch fixes that function.
[akpm@linux-foundation.org: simplification suggested by hpa]
[hpa@zytor.com: fix count==0 handling]
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* remove asm/atomic.h inclusion from linux/utsname.h --
not needed after kref conversion
* remove linux/utsname.h inclusion from files which do not need it
NOTE: it looks like fs/binfmt_elf.c do not need utsname.h, however
due to some personality stuff it _is_ needed -- cowardly leave ELF-related
headers and files alone.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| | |
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
NFS: Propagate 'fsc' mount option through automounts
sunrpc/rpc_pipe: fix kernel-doc notation
sunrpc: xdr_xcode_hyper helpers cannot presume 64-bit alignment
NFS: Add nfs_alloc_parsed_mount_data
NFS/RPC: fix problems with reestablish_timeout and related code.
NFS: Get rid of the NFS_MOUNT_VER3 and NFS_MOUNT_TCP flags
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Propagate the NFS 'fsc' mount option through NFS automounts of various types.
This is now required as commit:
commit c02d7adf8c5429727a98bad1d039bccad4c61c50
Author: Trond Myklebust <Trond.Myklebust@netapp.com>
Date: Mon Jun 22 15:09:14 2009 -0400
NFSv4: Replace nfs4_path_walk() with VFS path lookup in a private namespace
uses VFS-driven automounting to reach all submounts barring the root, thus
preventing fscaching from being enabled on any submount other than the root.
This patch gets around that by propagating the NFS_OPTION_FSCACHE flag across
automounts. If a uniquifier is supplied to a mount then this is propagated to
all automounts of that mount too.
Signed-off-by: David Howells <dhowells@redhat.com>
[Trond: Fixed up the definition of nfs_fscache_get_super_cookie for the
case of #undef CONFIG_NFS_FSCACHE]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Allocating nfs_parsed_mount_data and setting up the defaults is nearly
the same for both nfs and nfs4 mounts.
Both paths seem to use nfs_validate_transport_protocol(), so setting a
default value for nfs_server.protocol ought to be unnecessary.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
| |
| |
| |
| |
| |
| |
| | |
Keep it in the case of the legacy binary mount interface, but purge it from
the nfs_server structure.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This patch adds a persistent, read-only caching facility for
9p clients using the FS-Cache caching backend.
When the fscache facility is enabled, each inode is associated
with a corresponding vcookie which is an index into the FS-Cache
indexing tree. The FS-Cache indexing tree is indexed at 3 levels:
- session object associated with each mount.
- inode/vcookie
- actual data (pages)
A cache tag is chosen randomly for each session. These tags can
be read off /sys/fs/9p/caches and can be passed as a mount-time
parameter to re-attach to the specified caching session.
Signed-off-by: Abhishek Kulkarni <adkulkar@umail.iu.edu>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
When using the cache=loose flags, the inode's size was not being
updated correctly on a remote write. Thus subsequent reads of
the whole file resulted in a truncated read. Fix it.
Signed-off-by: Abhishek Kulkarni <adkulkar@umail.iu.edu>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
|
|/
|
|
|
|
|
|
|
|
|
| |
directly.
Change all occurrence of inode->i_size with i_size_read() or i_size_write()
as appropriate.
Signed-off-by: Abhishek Kulkarni <adkulkar@umail.iu.edu>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
* git://git.infradead.org/mtd-2.6: (58 commits)
mtd: jedec_probe: add PSD4256G6V id
mtd: OneNand support for Nomadik 8815 SoC (on NHK8815 board)
mtd: nand: driver for Nomadik 8815 SoC (on NHK8815 board)
m25p80: Add Spansion S25FL129P serial flashes
jffs2: Use SLAB_HWCACHE_ALIGN for jffs2_raw_{dirent,inode} slabs
mtd: sh_flctl: register sh_flctl using platform_driver_probe()
mtd: nand: txx9ndfmc: transfer 512 byte at a time if possible
mtd: nand: fix tmio_nand ecc correction
mtd: nand: add __nand_correct_data helper function
mtd: cfi_cmdset_0002: add 0xFF intolerance for M29W128G
mtd: inftl: fix fold chain block number
mtd: jedec: fix compilation problem with I28F640C3B definition
mtd: nand: fix ECC Correction bug for SMC ordering for NDFC driver
mtd: ofpart: Check availability of reg property instead of name property
driver/Makefile: Initialize "mtd" and "spi" before "net"
mtd: omap: adding DMA mode support in nand prefetch/post-write
mtd: omap: add support for nand prefetch-read and post-write
mtd: add nand support for w90p910 (v2)
mtd: maps: add mtd-ram support to physmap_of
mtd: pxa3xx_nand: add single-bit error corrections reporting
...
|
| |\
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6
Conflicts:
drivers/mtd/mtdcore.c
Merged in order that I can apply the Nomadik nand/onenand support patches.
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
We may end up doing DMA to/from these. Until the new MTD API fixes the
issues, this should stop things from falling over.
Original idea from Gilles Casse <list@gcasse.net>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Move the jffs2 garbage collecting thread to the new kthread API.
Signed-off-by: Gerard Lledo <gerard.lledo@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
|\ \ \
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2: (85 commits)
ocfs2: Use buffer IO if we are appending a file.
ocfs2: add spinlock protection when dealing with lockres->purge.
dlmglue.c: add missed mlog lines
ocfs2: __ocfs2_abort() should not enable panic for local mounts
ocfs2: Add ioctl for reflink.
ocfs2: Enable refcount tree support.
ocfs2: Implement ocfs2_reflink.
ocfs2: Add preserve to reflink.
ocfs2: Create reflinked file in orphan dir.
ocfs2: Use proper parameter for some inode operation.
ocfs2: Make transaction extend more efficient.
ocfs2: Don't merge in 1st refcount ops of reflink.
ocfs2: Modify removing xattr process for refcount.
ocfs2: Add reflink support for xattr.
ocfs2: Create an xattr indexed block if needed.
ocfs2: Call refcount tree remove process properly.
ocfs2: Attach xattr clusters to refcount tree.
ocfs2: Abstract ocfs2 xattr tree extend rec iteration process.
ocfs2: Abstract the creation of xattr block.
ocfs2: Remove inode from ocfs2_xattr_bucket_get_name_value.
...
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
In ocfs2_file_aio_write, we will prevent direct io if
we find that we are appending(changing i_size) and call
generic_file_aio_write_nolock. But actually O_DIRECT flag
is there and this function will call generic_file_direct_write
eventually which will update i_size and leave di->i_size
alone. The bug is
http://oss.oracle.com/bugzilla/show_bug.cgi?id=1173.
So this patch let ocfs2_direct_IO returns 0 directly if we
are appending so that buffered write will be called and
di->i_size get updated successfully. And this is also
what we want in ocfs2_file_aio_write.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
when we check/modify lockres->purge, we should with the protection of lockres->spinlock.
in dlm_purge_lockres(), the checking/modifying is not with the protectin.
this patch fixes it.
Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
This patch adds the missed mlog_exit() and mlog_exit_void() lines when routines
return.
Signed-off-by: Coly Li <coly.li@suse.de>
Acked-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
In a clustered setup, we have to panic the box on journal abort. This is
because we don't have the facility to go hard readonly. With hard ro, another
node would detect node failure and initiate recovery.
Having said that, we shouldn't force panic if the volume is mounted locally.
This patch defers the handling to the mount option, errors.
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The ioctl will take 3 parameters: old_path, new_path and
preserve and call vfs_reflink. It is useful when we backport
reflink features to old kernels.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
|
| | | |
| | | |
| | | |
| | | | |
Signed-off-by: Tao Ma <tao.ma@oracle.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Implement ocfs2_reflink.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
reflink has 2 options for the destination file:
1. snapshot: reflink will attempt to preserve ownership, permissions,
and all other security state in order to create a full snapshot.
2. new file: it will acquire the data extent sharing but will see the
file's security state and attributes initialized as a new file.
So add the option to ocfs2.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
reflink is a very complicated process, so it can't be integrated
into one transaction. So if the system panic in the operation, we
may leave a unfinished inode in the destication directory.
So we will try to create an inode in orphan_dir first, reflink it
to the src file and then move it to the destication file in the end.
In that way we won't be afraid of any corruption during the reflink.
This patch adds 2 functions for orphan_dir operation:
1. Create a new inode in orphand dir.
2. Move an inode to a target dir.
Note:
fsck.ocfs2 should work for us to remove the unfinished file in the
orphan_dir.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
In order to make the original function more suitable for reflink,
we modify the following inode operations. Both are tiny.
1. ocfs2_mknod_locked only use dentry for mlog, so move it to
the caller so that reflink can use it without dentry.
2. ocfs2_prepare_orphan_dir only want inode to get its ip_blkno.
So use ip_blkno instead.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
In ocfs2_extend_rotate_transaction, op_credits is the orignal
credits in the handle and we only want to extend the credits
for the rotation, but the old solution always double it. It
is harmless for some minor operations, but for actions like
reflink we may rotate tree many times and cause the credits
increase dramatically. So this patch try to only increase
the desired credits.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Actually the whole reflink will touch refcount tree 2 times:
1. It will add the clusters in the extent record to the tree if it
isn't refcounted before.
2. It will add 1 refcount to these clusters when it add these
extent records to the tree.
So actually we shouldn't do merge in the 1st operation since the 2nd
one will soon be called and we may have to split it again. Do a merge
first and split soon is a waste of time. So we only merge in the 2nd
round. This is done by adding a new internal __ocfs2_increase_refcount
and call it with "not-merge" for 1st refcount operation in reflink.
This also has a side-effect that we don't need to worry too much about
the metadata allocation in the 2nd round since it will only merge and
no split will happen for those records.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The old xattr value remove is quite simple, it just erase the
tree and free the clusters. But as we have added refcount support,
The process is a little complicated.
We have to lock the refcount tree at the beginning, what's more,
we may split the refcount tree in some cases, so meta/credits are
needed.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
|
| | | |
| | | |
| | | |
| | | | |
Signed-off-by: Tao Ma <tao.ma@oracle.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
With reflink, there is a need that we create a new xattr indexed
block from the very beginning. So add a new parameter for
ocfs2_create_xattr_block.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Now with xattr refcount support, we need to check whether
we have xattr refcounted before we remove the refcount tree.
Now the mechanism is:
1) Check whether i_clusters == 0, if no, exit.
2) check whether we have i_xattr_loc in dinode. if yes, exit.
2) Check whether we have inline xattr stored outside, if yes, exit.
4) Remove the tree.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
In ocfs2, when xattr's value is larger than OCFS2_XATTR_INLINE_SIZE,
it will be kept outside of the blocks we store xattr entry. And they
are stored in a b-tree also. So this patch try to attach all these
clusters to refcount tree also.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Currently we have ocfs2_iterate_xattr_buckets which can receive
a para and a callback to iterate a series of bucket. It is good.
But actually the 2 callers ocfs2_xattr_tree_list_index_block and
ocfs2_delete_xattr_index_block are almost the same. The only
difference is that the latter need to handle the extent record
also. So add a new function named ocfs2_iterate_xattr_index_block.
It can be given func callback which are used for exten record.
So now we only have one iteration function for the xattr index
block. Ane what's more, it is useful for our future reflink
operations.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
|