path: root/fs
Commit message (Author, Age, Files, Lines)
* Merge branch 'nfsd-next' of git://linux-nfs.org/~bfields/linux (Linus Torvalds, 2013-11-16, 5 files, -77/+115)
|\
| | Pull nfsd changes from Bruce Fields:
| |  "This includes miscellaneous bugfixes and cleanup and a performance fix
| |   for write-heavy NFSv4 workloads.
| |
| |   (The most significant nfsd-relevant change this time is actually in the
| |   delegation patches that went through Viro, fixing a long-standing bug
| |   that can cause NFSv4 clients to miss updates made by non-nfs users of
| |   the filesystem. Those enable some followup nfsd patches which I have
| |   queued locally, but those can wait till 3.14)"
| |
| | * 'nfsd-next' of git://linux-nfs.org/~bfields/linux: (24 commits)
| |   nfsd: export proper maximum file size to the client
| |   nfsd4: improve write performance with better sendspace reservations
| |   svcrpc: remove an unnecessary assignment
| |   sunrpc: comment typo fix
| |   Revert "nfsd: remove_stid can be incorporated into nfs4_put_delegation"
| |   nfsd4: fix discarded security labels on setattr
| |   NFSD: Add support for NFS v4.2 operation checking
| |   nfsd4: nfsd_shutdown_net needs state lock
| |   NFSD: Combine decode operations for v4 and v4.1
| |   nfsd: -EINVAL on invalid anonuid/gid instead of silent failure
| |   nfsd: return better errors to exportfs
| |   nfsd: fh_update should error out in unexpected cases
| |   nfsd4: need to destroy revoked delegations in destroy_client
| |   nfsd: no need to unhash_stid before free
| |   nfsd: remove_stid can be incorporated into nfs4_put_delegation
| |   nfsd: nfs4_open_delegation needs to remove_stid rather than unhash_stid
| |   nfsd: nfs4_free_stid
| |   nfsd: fix Kconfig syntax
| |   sunrpc: trim off EC bytes in GSSAPI v2 unwrap
| |   gss_krb5: document that we ignore sequence number
| |   ...
| * nfsd: export proper maximum file size to the client (Christoph Hellwig, 2013-11-14, 1 file, -1/+1)
| | I noticed that we export a way too high value for the maxfilesize
| | attribute when debugging a client issue. The issue didn't turn out to be
| | related to it, but I think we should export it, so that clients can limit
| | what write sizes they accept before hitting the server.
| |
| | Signed-off-by: Christoph Hellwig <hch@lst.de>
| | Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * nfsd4: improve write performance with better sendspace reservations (J. Bruce Fields, 2013-11-13, 1 file, -0/+31)
| | Currently the rpc code conservatively refuses to accept rpc's from a
| | client if the sum of its worst-case estimates of the replies it owes that
| | client exceeds the send buffer space.
| |
| | Unfortunately our estimate of the worst-case reply for an NFSv4 compound
| | is always the maximum read size. This can unnecessarily limit the number
| | of operations we handle concurrently, for example in the case most
| | operations are writes (which have small replies).
| |
| | We can do a little better if we check which ops the compound contains.
| |
| | This is still a rough estimate, we'll need to improve on it some day.
| |
| | Reported-by: Shyam Kaushik <shyamnfs1@gmail.com>
| | Tested-by: Shyam Kaushik <shyamnfs1@gmail.com>
| | Signed-off-by: J. Bruce Fields <bfields@redhat.com>
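The idea of checking which ops a compound contains can be sketched in userspace C. This is only an illustration under stated assumptions: the names `op_kind`, `estimate_reply_bytes`, and both size constants are hypothetical and are not taken from the nfsd source. It shows a worst-case reply estimate that falls back to the maximum read size only when a read-like operation is present.

```c
#include <stddef.h>

/* Hypothetical classification of operations for this sketch only. */
enum op_kind { OP_WRITE_LIKE, OP_READ_LIKE };

/* Hypothetical sizes; the real values depend on the server configuration. */
#define SMALL_REPLY_BYTES 2048u
#define MAX_READ_BYTES    (1024u * 1024u)

/*
 * Return a rough worst-case reply size for a compound: the maximum read
 * size if any operation can return bulk data, a small bound otherwise.
 */
static size_t estimate_reply_bytes(const enum op_kind *ops, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (ops[i] == OP_READ_LIKE)
			return MAX_READ_BYTES;
	}
	return SMALL_REPLY_BYTES;
}
```

With an estimate like this, a compound made only of writes reserves far less send buffer space, so more such requests can be accepted concurrently.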
| * Revert "nfsd: remove_stid can be incorporated into nfs4_put_delegation" (J. Bruce Fields, 2013-11-04, 1 file, -1/+3)
| | This reverts commit 7ebe40f20372688a627ad6c754bc0d1c05df58a9.
| |
| | We forgot the nfs4_put_delegation call in fs/nfsd/nfs4callback.c which
| | should not be unhashing the stateid. This led to warnings from the idr
| | code when we tried to remove IDs twice.
| |
| | Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * nfsd4: fix discarded security labels on setattr (J. Bruce Fields, 2013-11-01, 1 file, -0/+1)
| | Security labels in setattr calls are currently ignored because we forget
| | to set label->len.
| |
| | Cc: stable@vger.kernel.org
| | Reported-by: Jeff Layton <jlayton@redhat.com>
| | Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * NFSD: Add support for NFS v4.2 operation checking (Anna Schumaker, 2013-10-30, 1 file, -3/+5)
| | The server does allow NFS over v4.2, even if it doesn't add any new
| | operations yet.
| |
| | I also switch to using constants to represent the last operation for each
| | minor version since this makes the code cleaner and easier to understand
| | at a quick glance.
| |
| | Signed-off-by: Anna Schumaker <bjschuma@netapp.com>
| | Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * nfsd4: nfsd_shutdown_net needs state lock (J. Bruce Fields, 2013-10-30, 1 file, -1/+2)
| | A comment claims the caller should take it, but that's not being done.
| | Note we don't want it around the cancel_delayed_work_sync since that may
| | wait on work which holds the client lock.
| |
| | Reported-by: Benny Halevy <bhalevy@primarydata.com>
| | Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * NFSD: Combine decode operations for v4 and v4.1 (Anna Schumaker, 2013-10-30, 1 file, -58/+40)
| | We were using a different array of function pointers to represent each
| | minor version. This makes adding a new minor version tedious, since it
| | needs a step to copy, paste and modify a new version of the same
| | functions.
| |
| | This patch combines the v4 and v4.1 arrays into a single instance and
| | will check minor version support inside each decoder function.
| |
| | Signed-off-by: Anna Schumaker <bjschuma@netapp.com>
| | Signed-off-by: J. Bruce Fields <bfields@redhat.com>
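The single-table approach described above can be sketched as follows. This is a minimal userspace illustration, not the nfsd code: the status codes, `struct decode_ctx`, the operation names, and `decode_op` are all hypothetical. It shows one shared array of decoder function pointers, with the minor-version check living inside the decoder that needs it.

```c
#include <stddef.h>

/* Hypothetical status codes for this sketch (not the kernel's values). */
enum { ST_OK = 0, ST_NOTSUPP = 1, ST_ILLEGAL = 2 };

/* Hypothetical decode context: only the minor version matters here. */
struct decode_ctx {
	unsigned int minorversion;
};

typedef int (*decoder_fn)(struct decode_ctx *ctx);

/* An operation that exists in every minor version. */
static int decode_read(struct decode_ctx *ctx)
{
	(void)ctx;
	return ST_OK;
}

/* An operation introduced in v4.1: the decoder itself rejects v4.0. */
static int decode_sequence(struct decode_ctx *ctx)
{
	if (ctx->minorversion < 1)
		return ST_NOTSUPP;
	return ST_OK;
}

enum { OP_READ, OP_SEQUENCE, OP_COUNT };

/* One table shared by all minor versions, instead of one table each. */
static const decoder_fn decoders[OP_COUNT] = {
	[OP_READ]     = decode_read,
	[OP_SEQUENCE] = decode_sequence,
};

/* Dispatch an operation number through the shared table. */
static int decode_op(struct decode_ctx *ctx, unsigned int op)
{
	if (op >= OP_COUNT)
		return ST_ILLEGAL;
	return decoders[op](ctx);
}
```

Adding a new minor version then means adding decoders for its new operations, rather than copying the whole table.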
| * nfsd: -EINVAL on invalid anonuid/gid instead of silent failure (J. Bruce Fields, 2013-10-29, 1 file, -0/+9)
| | If we're going to refuse to accept these it would be polite of us to at
| | least say so....
| |
| | This introduces a slight complication since we need to grandfather in
| | exportfs's ill-advised use of -1 uid and gid on its test_export.
| |
| | If it turns out there are other users passing down -1 we may need to do
| | something else.
| |
| | Best might be to drop the checks entirely, but I'm not sure if other
| | parts of the kernel might assume that a task can't run as uid or gid -1.
| |
| | Cc: "Eric W. Biederman" <ebiederm@xmission.com>
| | Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * nfsd: return better errors to exportfs (J. Bruce Fields, 2013-10-29, 1 file, -4/+11)
| | Someone noticed exportfs happily accepted exports that would later be
| | rejected when mountd tried to give them to the kernel. Fix this.
| |
| | This is a regression from 4c1e1b34d5c800ad3ac9a7e2805b0bea70ad2278
| | "nfsd: Store ex_anon_uid and ex_anon_gid as kuids and kgids".
| |
| | Cc: "Eric W. Biederman" <ebiederm@xmission.com>
| | Cc: stable@vger.kernel.org
| | Reported-by: Yin.JianHong <jiyin@redhat.com>
| | Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * nfsd: fh_update should error out in unexpected cases (J. Bruce Fields, 2013-10-29, 1 file, -5/+3)
| | The reporter saw a NULL dereference when a filesystem's ->mknod returned
| | success but left the dentry negative, and then nfsd tried to dereference
| | d_inode (in this case because the CREATE was followed by a GETATTR in the
| | same nfsv4 compound).
| |
| | fh_update already checks for this and another broken case, but for some
| | reason it returns success and leaves nfsd trying to soldier on. If it
| | failed we'd avoid the crash. There's only so much we can do with a buggy
| | filesystem, but it's easy enough to bail out here, so let's do that.
| |
| | Reported-by: Antti Tönkyrä <daedalus@pingtimeout.net>
| | Tested-by: Antti Tönkyrä <daedalus@pingtimeout.net>
| | Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * nfsd4: need to destroy revoked delegations in destroy_client (Benny Halevy, 2013-10-29, 1 file, -0/+5)
| | [use list_splice_init]
| | Signed-off-by: Benny Halevy <bhalevy@primarydata.com>
| | [bfields: no need for recall_lock here]
| | Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * nfsd: no need to unhash_stid before free (Benny Halevy, 2013-10-29, 1 file, -5/+2)
| | idr_remove is about to be called before kmem_cache_free, so unhashing it
| | is redundant.
| |
| | Signed-off-by: Benny Halevy <bhalevy@primarydata.com>
| | Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * nfsd: remove_stid can be incorporated into nfs4_put_delegation (Benny Halevy, 2013-10-28, 1 file, -3/+1)
| | All calls to nfs4_put_delegation are preceded with remove_stid.
| |
| | Signed-off-by: Benny Halevy <bhalevy@primarydata.com>
| | Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * nfsd: nfs4_open_delegation needs to remove_stid rather than unhash_stid (Benny Halevy, 2013-10-28, 1 file, -1/+1)
| | In the out_free: path, the newly allocated stid must be removed rather
| | than unhashed so it can never be found.
| |
| | Signed-off-by: Benny Halevy <bhalevy@primarydata.com>
| | Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * nfsd: nfs4_free_stid (Benny Halevy, 2013-10-28, 1 file, -2/+7)
| | Make it symmetric to nfs4_alloc_stid.
| |
| | Signed-off-by: Benny Halevy <bhalevy@primarydata.com>
| | Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * nfsd: fix Kconfig syntax (Christoph Hellwig, 2013-10-26, 1 file, -1/+1)
| | The description text for CONFIG_NFSD_V4_SECURITY_LABEL has an unpaired
| | quote sign which breaks syntax highlighting for the nfsd Kconfig file.
| | Remove it.
| |
| | Signed-off-by: Christoph Hellwig <hch@lst.de>
| | Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * nfsd: switch to %p[dD] (Al Viro, 2013-10-02, 5 files, -37/+31)
| | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs (Linus Torvalds, 2013-11-16, 2 files, -11/+11)
|\ \
| | | Pull btrfs fixes from Chris Mason:
| | |  "This pull fixes the empty_zero_page bug that Heiko reported, and
| | |   includes one more cleanup from Al Viro"
| | |
| | | * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
| | |   btrfs: get rid of fdentry()
| | |   btrfs: fix empty_zero_page misusage
| * | btrfs: get rid of fdentry() (Al Viro, 2013-11-15, 2 files, -9/+4)
| | | 3 of 4 callers actually want file_inode()...
| | |
| | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| | | Signed-off-by: Chris Mason <chris.mason@fusionio.com>
| * | btrfs: fix empty_zero_page misusage (Chris Mason, 2013-11-15, 1 file, -2/+7)
| | | Heiko Carstens noticed that btrfs was using empty_zero_page incorrectly.
| | | He explained:
| | |
| | |   The definition of empty_zero_page is architecture specific. It is
| | |   (currently) either a character array, an unsigned long containing the
| | |   address of the empty_zero_page, or even worse only the address of the
| | |   struct page belonging to the empty_zero_page.
| | |
| | | This commit changes btrfs to use a for-loop instead. On x86 the resulting
| | | .ko is smaller, and we're no longer worrying about how each arch builds
| | | its zeros.
| | |
| | | Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com>
| | | Signed-off-by: Chris Mason <chris.mason@fusionio.com>
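A for-loop zero check of the kind mentioned above might look like the sketch below. The function name `buf_is_all_zero` and its signature are assumptions made for illustration; the commit text does not show the actual btrfs helper. The point is that the check no longer compares against a global zero page whose definition varies by architecture.

```c
#include <stddef.h>

/*
 * Return 1 if every byte of buf is zero, 0 otherwise. A plain loop avoids
 * any dependency on how a given architecture defines its zero page.
 */
static int buf_is_all_zero(const unsigned char *buf, size_t len)
{
	for (size_t i = 0; i < len; i++) {
		if (buf[i])
			return 0;
	}
	return 1;
}
```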
* | | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial (Linus Torvalds, 2013-11-16, 1 file, -1/+2)
|\ \ \
| | | | Pull trivial tree updates from Jiri Kosina:
| | | |  "Usual earth-shaking, news-breaking, rocket science pile from
| | | |   trivial.git"
| | | |
| | | | * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (23 commits)
| | | |   doc: usb: Fix typo in Documentation/usb/gadget_configs.txt
| | | |   doc: add missing files to timers/00-INDEX
| | | |   timekeeping: Fix some trivial typos in comments
| | | |   mm: Fix some trivial typos in comments
| | | |   irq: Fix some trivial typos in comments
| | | |   NUMA: fix typos in Kconfig help text
| | | |   mm: update 00-INDEX
| | | |   doc: Documentation/DMA-attributes.txt fix typo
| | | |   DRM: comment: `halve' -> `half'
| | | |   Docs: Kconfig: `devlopers' -> `developers'
| | | |   doc: typo on word accounting in kprobes.c in mutliple architectures
| | | |   treewide: fix "usefull" typo
| | | |   treewide: fix "distingush" typo
| | | |   mm/Kconfig: Grammar s/an/a/
| | | |   kexec: Typo s/the/then/
| | | |   Documentation/kvm: Update cpuid documentation for steal time and pv eoi
| | | |   treewide: Fix common typo in "identify"
| | | |   __page_to_pfn: Fix typo in comment
| | | |   Correct some typos for word frequency
| | | |   clk: fixed-factor: Fix a trivial typo
| | | |   ...
| * | | Docs: Kconfig: `devlopers' -> `developers' (Michael Witten, 2013-10-14, 1 file, -1/+2)
| | | | Signed-off-by: Michael Witten <mfwitten@gmail.com>
| | | | Signed-off-by: Jiri Kosina <jkosina@suse.cz>
* | | | Merge branch 'akpm' (patch-bomb from Andrew Morton) (Linus Torvalds, 2013-11-15, 10 files, -55/+56)
|\ \ \ \
| | | | | Merge patches from Andrew Morton:
| | | | |  - memstick fixes
| | | | |  - the rest of MM
| | | | |  - various misc bits that were awaiting merges from linux-next into
| | | | |    mainline: seq_file, printk, rtc, completions, w1, softirqs, llist,
| | | | |    kfifo, hfsplus
| | | | |
| | | | | * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (72 commits)
| | | | |   cmdline-parser: fix build
| | | | |   hfsplus: Fix undefined __divdi3 in hfsplus_init_header_node()
| | | | |   kfifo API type safety
| | | | |   kfifo: kfifo_copy_{to,from}_user: fix copied bytes calculation
| | | | |   sound/core/memalloc.c: use gen_pool_dma_alloc() to allocate iram buffer
| | | | |   llists-move-llist_reverse_order-from-raid5-to-llistc-fix
| | | | |   llists: move llist_reverse_order from raid5 to llist.c
| | | | |   kernel: fix generic_exec_single indentation
| | | | |   kernel-provide-a-__smp_call_function_single-stub-for-config_smp-fix
| | | | |   kernel: provide a __smp_call_function_single stub for !CONFIG_SMP
| | | | |   kernel: remove CONFIG_USE_GENERIC_SMP_HELPERS
| | | | |   revert "softirq: Add support for triggering softirq work on softirqs"
| | | | |   drivers/w1/masters/w1-gpio.c: use dev_get_platdata()
| | | | |   sched: remove INIT_COMPLETION
| | | | |   tree-wide: use reinit_completion instead of INIT_COMPLETION
| | | | |   sched: replace INIT_COMPLETION with reinit_completion
| | | | |   drivers/rtc/rtc-hid-sensor-time.c: enable HID input processing early
| | | | |   drivers/rtc/rtc-hid-sensor-time.c: use dev_get_platdata()
| | | | |   vsprintf: ignore %n again
| | | | |   seq_file: remove "%n" usage from seq_file users
| | | | |   ...
| * | | | hfsplus: Fix undefined __divdi3 in hfsplus_init_header_node() (Geert Uytterhoeven, 2013-11-15, 1 file, -3/+6)
| | | | | ERROR: "__divdi3" [fs/hfsplus/hfsplus.ko] undefined!
| | | | |
| | | | | Introduced by commit 099e9245e04d ("hfsplus: implement attributes file's
| | | | | header node initialization code").
| | | | |
| | | | | i_size_read() returns loff_t, which is long long, i.e. 64-bit. node_size
| | | | | is size_t, which is either 32-bit or 64-bit. Hence
| | | | | "i_size_read(attr_file) / node_size" is a 64-by-32 or 64-by-64 division,
| | | | | causing (some versions of) gcc to emit a call to __divdi3().
| | | | |
| | | | | Fortunately node_size is actually 16-bit, as the sole caller of
| | | | | hfsplus_init_header_node() passes a u16. Hence change its type from
| | | | | size_t to u16, and use do_div() to perform a 64-by-32 division.
| | | | |
| | | | | Not seen in m68k/allmodconfig in -next, so it really depends on the
| | | | | version of gcc.
| | | | |
| | | | | Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
| | | | | Cc: Vyacheslav Dubeyko <slava@dubeyko.com>
| | | | | Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | | | | Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
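The shape of the fix can be illustrated in userspace. The sketch below is an assumption-laden stand-in: `do_div_sketch` only imitates the calling convention of the kernel's `do_div()` (quotient stored in place, remainder returned), and `total_nodes` is a hypothetical name, not the hfsplus function. It shows a 64-bit size being divided by a 16-bit node size through an explicit 64-by-32 helper instead of a plain `/` on a signed 64-bit value.

```c
#include <stdint.h>

/*
 * Userspace stand-in for the kernel's do_div(): divide *n by a 32-bit
 * base in place and return the remainder. The real do_div() is a macro
 * with architecture-specific implementations.
 */
static uint32_t do_div_sketch(uint64_t *n, uint32_t base)
{
	uint32_t rem = (uint32_t)(*n % base);

	*n /= base;
	return rem;
}

/*
 * Number of whole nodes in a file of file_size bytes. node_size is 16-bit,
 * so the division is always 64-by-32. Returns 0 for a zero node size to
 * avoid dividing by zero (a choice made for this sketch).
 */
static uint32_t total_nodes(int64_t file_size, uint16_t node_size)
{
	uint64_t tmp = (uint64_t)file_size;

	if (node_size == 0)
		return 0;
	do_div_sketch(&tmp, node_size);
	return (uint32_t)tmp;
}
```

In userspace the compiler may still call a libgcc helper for the 64-bit division; the sketch only mirrors the structure of the change, since the kernel does not link against libgcc.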
| * | | | tree-wide: use reinit_completion instead of INIT_COMPLETION (Wolfram Sang, 2013-11-15, 3 files, -4/+4)
| | | | | Use this new function to make code more comprehensible, since we are
| | | | | reinitializing the completion, not initializing.
| | | | |
| | | | | [akpm@linux-foundation.org: linux-next resyncs]
| | | | | Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
| | | | | Acked-by: Linus Walleij <linus.walleij@linaro.org> (personally at LCE13)
| | | | | Cc: Ingo Molnar <mingo@kernel.org>
| | | | | Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | | | | Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | | seq_file: remove "%n" usage from seq_file users (Tetsuo Handa, 2013-11-15, 4 files, -40/+21)
| | | | | All seq_printf() users that use "%n" do so to calculate padding size;
| | | | | convert them to use the seq_setwidth() / seq_pad() pair.
| | | | |
| | | | | Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
| | | | | Signed-off-by: Kees Cook <keescook@chromium.org>
| | | | | Cc: Joe Perches <joe@perches.com>
| | | | | Cc: David Miller <davem@davemloft.net>
| | | | | Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | | | | Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | | seq_file: introduce seq_setwidth() and seq_pad() (Tetsuo Handa, 2013-11-15, 1 file, -0/+15)
| | | | | There are several users who want to know the number of bytes written by
| | | | | seq_*() for alignment purposes. Currently they use the %n format to find
| | | | | it, because seq_*() returns 0 on success.
| | | | |
| | | | | This patch introduces seq_setwidth() and seq_pad() to allow them to
| | | | | align without using the %n format.
| | | | |
| | | | | Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
| | | | | Signed-off-by: Kees Cook <keescook@chromium.org>
| | | | | Cc: Joe Perches <joe@perches.com>
| | | | | Cc: David Miller <davem@davemloft.net>
| | | | | Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | | | | Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
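The set-width-then-pad pattern can be sketched against a plain character buffer. Everything here is hypothetical: `struct out_buf` and the `buf_*` functions are userspace stand-ins written for this illustration, not the seq_file API. The sketch records a target column before a field is written, then fills with spaces up to that column, so no `%n` is needed to learn how many bytes were emitted.

```c
#include <stddef.h>

/* Hypothetical output buffer standing in for a seq_file. */
struct out_buf {
	char   data[128];
	size_t count;      /* bytes written so far */
	size_t pad_until;  /* position the current field is padded to */
};

/* Append one character, always leaving room for the terminating NUL. */
static void buf_putc(struct out_buf *b, char c)
{
	if (b->count + 1 < sizeof(b->data)) {
		b->data[b->count++] = c;
		b->data[b->count] = '\0';
	}
}

/* Append a NUL-terminated string. */
static void buf_puts(struct out_buf *b, const char *s)
{
	while (*s)
		buf_putc(b, *s++);
}

/* Remember where the field that starts here should end. */
static void buf_setwidth(struct out_buf *b, size_t width)
{
	b->pad_until = b->count + width;
}

/* Pad with spaces up to the remembered position, then append c if set. */
static void buf_pad(struct out_buf *b, char c)
{
	while (b->count < b->pad_until && b->count + 1 < sizeof(b->data))
		buf_putc(b, ' ');
	if (c)
		buf_putc(b, c);
}
```

A caller would do `buf_setwidth(&b, 5); buf_puts(&b, "ab"); buf_pad(&b, '\n');` on a zero-initialized buffer to get a five-column field followed by a newline.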
| * | | | mm, hugetlb: convert hugetlbfs to use split pmd lock (Kirill A. Shutemov, 2013-11-15, 1 file, -1/+1)
| | | | | Hugetlb supports multiple page sizes. We use split lock only for PMD
| | | | | level, but not for PUD.
| | | | |
| | | | | [akpm@linux-foundation.org: coding-style fixes]
| | | | | Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
| | | | | Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
| | | | | Tested-by: Alex Thorlton <athorlton@sgi.com>
| | | | | Cc: Ingo Molnar <mingo@redhat.com>
| | | | | Cc: "Eric W . Biederman" <ebiederm@xmission.com>
| | | | | Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
| | | | | Cc: Al Viro <viro@zeniv.linux.org.uk>
| | | | | Cc: Andi Kleen <ak@linux.intel.com>
| | | | | Cc: Andrea Arcangeli <aarcange@redhat.com>
| | | | | Cc: Dave Hansen <dave.hansen@intel.com>
| | | | | Cc: Dave Jones <davej@redhat.com>
| | | | | Cc: David Howells <dhowells@redhat.com>
| | | | | Cc: Frederic Weisbecker <fweisbec@gmail.com>
| | | | | Cc: Johannes Weiner <hannes@cmpxchg.org>
| | | | | Cc: Kees Cook <keescook@chromium.org>
| | | | | Cc: Mel Gorman <mgorman@suse.de>
| | | | | Cc: Michael Kerrisk <mtk.manpages@gmail.com>
| | | | | Cc: Oleg Nesterov <oleg@redhat.com>
| | | | | Cc: Peter Zijlstra <peterz@infradead.org>
| | | | | Cc: Rik van Riel <riel@redhat.com>
| | | | | Cc: Robin Holt <robinmholt@gmail.com>
| | | | | Cc: Sedat Dilek <sedat.dilek@gmail.com>
| | | | | Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
| | | | | Cc: Thomas Gleixner <tglx@linutronix.de>
| | | | | Cc: Hugh Dickins <hughd@google.com>
| | | | | Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | | | | Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | | mm, thp: change pmd_trans_huge_lock() to return taken lock (Kirill A. Shutemov, 2013-11-15, 1 file, -6/+7)
| | | | | With split ptlock it's important to know which lock
| | | | | pmd_trans_huge_lock() took. This patch adds one more parameter to the
| | | | | function to return the lock.
| | | | |
| | | | | In most places migration to new api is trivial. Exception is
| | | | | move_huge_pmd(): we need to take two locks if pmd tables are different.
| | | | |
| | | | | Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
| | | | | Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
| | | | | Tested-by: Alex Thorlton <athorlton@sgi.com>
| | | | | Cc: Ingo Molnar <mingo@redhat.com>
| | | | | Cc: "Eric W . Biederman" <ebiederm@xmission.com>
| | | | | Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
| | | | | Cc: Al Viro <viro@zeniv.linux.org.uk>
| | | | | Cc: Andi Kleen <ak@linux.intel.com>
| | | | | Cc: Andrea Arcangeli <aarcange@redhat.com>
| | | | | Cc: Dave Hansen <dave.hansen@intel.com>
| | | | | Cc: Dave Jones <davej@redhat.com>
| | | | | Cc: David Howells <dhowells@redhat.com>
| | | | | Cc: Frederic Weisbecker <fweisbec@gmail.com>
| | | | | Cc: Johannes Weiner <hannes@cmpxchg.org>
| | | | | Cc: Kees Cook <keescook@chromium.org>
| | | | | Cc: Mel Gorman <mgorman@suse.de>
| | | | | Cc: Michael Kerrisk <mtk.manpages@gmail.com>
| | | | | Cc: Oleg Nesterov <oleg@redhat.com>
| | | | | Cc: Peter Zijlstra <peterz@infradead.org>
| | | | | Cc: Rik van Riel <riel@redhat.com>
| | | | | Cc: Robin Holt <robinmholt@gmail.com>
| | | | | Cc: Sedat Dilek <sedat.dilek@gmail.com>
| | | | | Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
| | | | | Cc: Thomas Gleixner <tglx@linutronix.de>
| | | | | Cc: Hugh Dickins <hughd@google.com>
| | | | | Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | | | | Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | | mm: convert mm->nr_ptes to atomic_long_t (Kirill A. Shutemov, 2013-11-15, 1 file, -1/+2)
| | | | | With split page table lock for PMD level we can't hold
| | | | | mm->page_table_lock while updating nr_ptes.
| | | | |
| | | | | Let's convert it to atomic_long_t to avoid races.
| | | | |
| | | | | Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
| | | | | Tested-by: Alex Thorlton <athorlton@sgi.com>
| | | | | Cc: Ingo Molnar <mingo@redhat.com>
| | | | | Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
| | | | | Cc: "Eric W . Biederman" <ebiederm@xmission.com>
| | | | | Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
| | | | | Cc: Al Viro <viro@zeniv.linux.org.uk>
| | | | | Cc: Andi Kleen <ak@linux.intel.com>
| | | | | Cc: Andrea Arcangeli <aarcange@redhat.com>
| | | | | Cc: Dave Hansen <dave.hansen@intel.com>
| | | | | Cc: Dave Jones <davej@redhat.com>
| | | | | Cc: David Howells <dhowells@redhat.com>
| | | | | Cc: Frederic Weisbecker <fweisbec@gmail.com>
| | | | | Cc: Johannes Weiner <hannes@cmpxchg.org>
| | | | | Cc: Kees Cook <keescook@chromium.org>
| | | | | Cc: Mel Gorman <mgorman@suse.de>
| | | | | Cc: Michael Kerrisk <mtk.manpages@gmail.com>
| | | | | Cc: Oleg Nesterov <oleg@redhat.com>
| | | | | Cc: Peter Zijlstra <peterz@infradead.org>
| | | | | Cc: Rik van Riel <riel@redhat.com>
| | | | | Cc: Robin Holt <robinmholt@gmail.com>
| | | | | Cc: Sedat Dilek <sedat.dilek@gmail.com>
| | | | | Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
| | | | | Cc: Thomas Gleixner <tglx@linutronix.de>
| | | | | Cc: Hugh Dickins <hughd@google.com>
| | | | | Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | | | | Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs (Linus Torvalds, 2013-11-15, 48 files, -973/+2600)
|\ \ \ \ \
| |/ / / /
|/| | / /
| | |/ /
| |/| |
| | | | Pull btrfs update from Chris Mason:
| | | |  "This is our usual merge window set of bug fixes, performance
| | | |   improvements and cleanups.
| | | |
| | | |   Miao Xie has some really nice optimizations for writeback.
| | | |
| | | |   Josef also expanded our sanity checks quite a bit; these make up a
| | | |   big chunk of the new lines"
| | | |
| | | | * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (98 commits)
| | | |   Btrfs: rename btrfs_start_all_delalloc_inodes
| | | |   Btrfs: don't wait for the completion of all the ordered extents
| | | |   Btrfs: don't wait for all the async delalloc when shrinking delalloc
| | | |   Btrfs: fix the confusion between delalloc bytes and metadata bytes
| | | |   Btrfs: pick up the code for the item number calculation in flush_space()
| | | |   Btrfs: wait for the ordered extent only when we want
| | | |   Btrfs: remove unnecessary initialization and memory barrior in shrink_delalloc()
| | | |   Btrfs: avoid unnecessary scrub workers allocation
| | | |   Btrfs: check file extent type before anything else
| | | |   btrfs: Remove useless variable in write_ctree_super()
| | | |   btrfs: Fix checkpatch.pl warning of spacing issues
| | | |   btrfs: Replace kmalloc with kmalloc_array
| | | |   btrfs: Enclose macros with complex values within parenthesis
| | | |   btrfs: Use WARN_ON()'s return value in place of WARN_ON(1)
| | | |   btrfs: Remove redundant local zero structure
| | | |   btrfs: Pack struct btrfs_device
| | | |   btrfs: Replace multiple atomic_inc() with atomic_add()
| | | |   btrfs: Add helper function for free_root_pointers()
| | | |   Btrfs: fix a crash when running balance and defrag concurrently
| | | |   Btrfs: do not run snapshot-aware defragment on error
| | | |   ...
| * | | Btrfs: rename btrfs_start_all_delalloc_inodes (Miao Xie, 2013-11-12, 7 files, -9/+7)
| | | | Rename the function btrfs_start_all_delalloc_inodes() so that its name
| | | | matches btrfs_wait_ordered_roots(), since they are always used in the
| | | | same place.
| | | |
| | | | Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
| | | | Signed-off-by: Josef Bacik <jbacik@fusionio.com>
| | | | Signed-off-by: Chris Mason <chris.mason@fusionio.com>
| * | | Btrfs: don't wait for the completion of all the ordered extents (Miao Xie, 2013-11-12, 8 files, -18/+31)
| | | | It is very likely that there are lots of ordered extents in the
| | | | filesystem. If we wait for the completion of all of them when we want to
| | | | reclaim some space for the metadata space reservation, we would be
| | | | blocked for a long time, and performance would drop suddenly for a long
| | | | time.
| | | |
| | | | Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
| | | | Signed-off-by: Josef Bacik <jbacik@fusionio.com>
| | | | Signed-off-by: Chris Mason <chris.mason@fusionio.com>
| * | | Btrfs: don't wait for all the async delalloc when shrinking delalloc (Miao Xie, 2013-11-12, 1 file, -2/+12)
| | | | It was very likely that there were lots of async delalloc pages in the
| | | | filesystem. If we waited until all the pages were flushed, we would be
| | | | blocked for a long time, and performance would also drop.
| | | |
| | | | Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
| | | | Signed-off-by: Josef Bacik <jbacik@fusionio.com>
| | | | Signed-off-by: Chris Mason <chris.mason@fusionio.com>
| * | | Btrfs: fix the confusion between delalloc bytes and metadata bytes (Miao Xie, 2013-11-12, 1 file, -0/+6)
| | | | In shrink_delalloc(), what we need to reclaim is metadata space, so
| | | | flushing pages by to_reclaim is not reasonable: it is very likely that
| | | | the pages we flush are not enough. We then had to invoke the flush
| | | | function several times and, at worst, call flush_space several times,
| | | | which wasted time.
| | | |
| | | | We address this by converting the metadata space size we need to reserve
| | | | into delalloc bytes, so that we can flush a reasonable number of pages.
| | | |
| | | | (For now we use a fixed number for the conversion, which is not
| | | | flexible; maybe we can find a better way in the future.)
| | | |
| | | | Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
| | | | Signed-off-by: Josef Bacik <jbacik@fusionio.com>
| | | | Signed-off-by: Chris Mason <chris.mason@fusionio.com>
| * | | Btrfs: pick up the code for the item number calculation in flush_space() (Miao Xie, 2013-11-12, 1 file, -9/+16)
| | | | This patch factors out the code that calculates the number of items for
| | | | which we need to reserve space; we will use it in the next patch.
| | | |
| | | | Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
| | | | Signed-off-by: Josef Bacik <jbacik@fusionio.com>
| | | | Signed-off-by: Chris Mason <chris.mason@fusionio.com>
| * | | Btrfs: wait for the ordered extent only when we want (Miao Xie, 2013-11-12, 1 file, -1/+2)
| | | | Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
| | | | Signed-off-by: Josef Bacik <jbacik@fusionio.com>
| | | | Signed-off-by: Chris Mason <chris.mason@fusionio.com>
| * | | Btrfs: remove unnecessary initialization and memory barrior in shrink_delalloc() (Miao Xie, 2013-11-12, 1 file, -4/+3)
| | | | Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
| | | | Signed-off-by: Josef Bacik <jbacik@fusionio.com>
| | | | Signed-off-by: Chris Mason <chris.mason@fusionio.com>
| * | | Btrfs: avoid unnecessary scrub workers allocation (Wang Shilong, 2013-11-12, 1 file, -13/+10)
| | | | We only allocate scrub workers if we pass all the necessary checks, for
| | | | example, that there is no operation in progress.
| | | |
| | | | Besides, move mutex lock protection outside of
| | | | scrub_workers_get()/scrub_workers_put().
| | | |
| | | | Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
| | | | Signed-off-by: Josef Bacik <jbacik@fusionio.com>
| | | | Signed-off-by: Chris Mason <chris.mason@fusionio.com>
| * | | Btrfs: check file extent type before anything else (Josef Bacik, 2013-11-12, 1 file, -5/+5)
| | | | I hit this problem with my no holes patch and it made me realize what
| | | | the problem was for bz 60834. If the first item in the leaf is an inline
| | | | extent and we try to read anything starting from disk_bytenr onward we
| | | | will read off the end of the leaf. So we need to check to see what its
| | | | type is, and if it's not REG we can just break out. This should fix this
| | | | problem. Thanks,
| | | |
| | | | Signed-off-by: Josef Bacik <jbacik@fusionio.com>
| | | | Signed-off-by: Chris Mason <chris.mason@fusionio.com>
| * | | btrfs: Remove useless variable in write_ctree_super() (Rashika, 2013-11-12, 1 file, -4/+1)
| | | | The function write_ctree_super() in disk-io.c uses the variable ret to
| | | | return the result of write_all_supers(). Since this variable serves no
| | | | purpose, the patch removes it and returns the result of the called
| | | | function directly.
| | | |
| | | | Reviewed-by: Zach Brown <zab@redhat.com>
| | | | Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
| | | | Signed-off-by: Josef Bacik <jbacik@fusionio.com>
| | | | Signed-off-by: Chris Mason <chris.mason@fusionio.com>
| * | | btrfs: Fix checkpatch.pl warning of spacing issues (Dulshani Gunawardhana, 2013-11-12, 9 files, -19/+19)
| | | | Fix spacing issues detected via checkpatch.pl in accordance with the
| | | | kernel style guidelines.
| | | |
| | | | Signed-off-by: Dulshani Gunawardhana <dulshani.gunawardhana89@gmail.com>
| | | | Signed-off-by: Josef Bacik <jbacik@fusionio.com>
| | | | Signed-off-by: Chris Mason <chris.mason@fusionio.com>
| * | | btrfs: Replace kmalloc with kmalloc_array (Dulshani Gunawardhana, 2013-11-12, 4 files, -5/+5)
| | | | Replace kmalloc(size * nr, ) with kmalloc_array(nr, size), thus making
| | | | it easier to check that the calculation doesn't wrap or return a smaller
| | | | allocation.
| | | |
| | | | Signed-off-by: Dulshani Gunawardhana <dulshani.gunawardhana89@gmail.com>
| | | | Reviewed-by: Zach Brown <zab@redhat.com>
| | | | Signed-off-by: Josef Bacik <jbacik@fusionio.com>
| | | | Signed-off-by: Chris Mason <chris.mason@fusionio.com>
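The wrap-around concern can be shown with a userspace analogue. The helper below, `alloc_array`, is a hypothetical name for this sketch and uses `malloc()` rather than any kernel allocator. It refuses the allocation when `n * size` would overflow `size_t`, which is the kind of check an open-coded `size * nr` multiplication silently skips.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Allocate room for n elements of the given size. Returns NULL instead of
 * a too-small buffer when n * size would wrap around SIZE_MAX.
 */
static void *alloc_array(size_t n, size_t size)
{
	if (size != 0 && n > SIZE_MAX / size)
		return NULL;
	return malloc(n * size);
}
```

Without the check, a wrapped product yields a small allocation that the caller then indexes as if it held all `n` elements.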
| * | | btrfs: Enclose macros with complex values within parenthesis (Dulshani Gunawardhana, 2013-11-12, 1 file, -4/+4)
| | | | Enclose macros with complex values within parentheses in accordance
| | | | with checkpatch.pl.
| | | |
| | | | Signed-off-by: Dulshani Gunawardhana <dulshani.gunawardhana89@gmail.com>
| | | | Reviewed-by: Zach Brown <zab@redhat.com>
| | | | Signed-off-by: Josef Bacik <jbacik@fusionio.com>
| | | | Signed-off-by: Chris Mason <chris.mason@fusionio.com>
| * | | btrfs: Use WARN_ON()'s return value in place of WARN_ON(1) (Dulshani Gunawardhana, 2013-11-12, 14 files, -70/+28)
| | | | Use WARN_ON()'s return value in place of WARN_ON(1) for cleaner source
| | | | code that outputs more descriptive warnings. Also fix the styling
| | | | warning of redundant braces that came up as a result of this fix.
| | | |
| | | | Signed-off-by: Dulshani Gunawardhana <dulshani.gunawardhana89@gmail.com>
| | | | Reviewed-by: Zach Brown <zab@redhat.com>
| | | | Signed-off-by: Josef Bacik <jbacik@fusionio.com>
| | | | Signed-off-by: Chris Mason <chris.mason@fusionio.com>
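The before and after shapes of this cleanup can be sketched in userspace. `WARN_ON_SKETCH`, `warn_on_impl`, `warn_count`, and the two `check_*` functions are hypothetical names for illustration; in particular, printing the condition text is a choice made by this sketch and is not a claim about what the kernel's `WARN_ON()` prints. The point is that a warning macro that returns its condition lets the test and the warning share one expression.

```c
#include <stdio.h>

/* Counts warnings so the behaviour can be observed in a test. */
static int warn_count;

/* Report a warning when cond is true and return cond as 0 or 1. */
static int warn_on_impl(int cond, const char *expr, const char *file, int line)
{
	if (cond) {
		warn_count++;
		fprintf(stderr, "WARNING: %s at %s:%d\n", expr, file, line);
	}
	return cond;
}

/* Hypothetical stand-in for a WARN_ON() that yields its condition. */
#define WARN_ON_SKETCH(cond) \
	warn_on_impl(!!(cond), #cond, __FILE__, __LINE__)

/* Old shape: separate test, then an unconditional warning. */
static int check_old(int ret)
{
	if (ret) {
		WARN_ON_SKETCH(1);
		return -1;
	}
	return 0;
}

/* New shape: the warning macro is the test, so the braces go away. */
static int check_new(int ret)
{
	if (WARN_ON_SKETCH(ret))
		return -1;
	return 0;
}
```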
| * | | btrfs: Remove redundant local zero structure (Dulshani Gunawardhana, 2013-11-12, 1 file, -3/+2)
| | | | Remove redundant local zero structure, replacing it by the kernel's
| | | | global ZERO_PAGE.
| | | |
| | | | Signed-off-by: Dulshani Gunawardhana <dulshani.gunawardhana89@gmail.com>
| | | | Reviewed-by: Zach Brown <zab@redhat.com>
| | | | Signed-off-by: Josef Bacik <jbacik@fusionio.com>
| | | | Signed-off-by: Chris Mason <chris.mason@fusionio.com>
| * | | btrfs: Pack struct btrfs_device (Dulshani Gunawardhana, 2013-11-12, 1 file, -10/+10)
| | | | Pack the structure btrfs_device in volumes.h to eliminate holes detected
| | | | by pahole, thus reducing binary memory footprint.
| | | |
| | | | Signed-off-by: Dulshani Gunawardhana <dulshani.gunawardhana89@gmail.com>
| | | | Reviewed-by: Zach Brown <zab@redhat.com>
| | | | Signed-off-by: Josef Bacik <jbacik@fusionio.com>
| | | | Signed-off-by: Chris Mason <chris.mason@fusionio.com>
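How reordering members removes holes can be seen with two small structs. These are hypothetical layouts invented for this sketch and have nothing to do with the real fields of `struct btrfs_device`; the sizes in the comments assume a typical LP64 ABI such as x86-64 and may differ elsewhere.

```c
#include <stdint.h>

/* Members in an order that forces padding after each small field. */
struct dev_before {
	uint8_t  flag_a;  /* 1 byte, then 7 bytes of padding on LP64 */
	uint64_t bytes;   /* 8 bytes */
	uint8_t  flag_b;  /* 1 byte, then 3 bytes of padding */
	uint32_t id;      /* 4 bytes */
};                    /* typically 24 bytes */

/* Same members, largest first, so the small fields share one slot. */
struct dev_after {
	uint64_t bytes;   /* 8 bytes */
	uint32_t id;      /* 4 bytes */
	uint8_t  flag_a;  /* 1 byte */
	uint8_t  flag_b;  /* 1 byte, then 2 bytes of tail padding */
};                    /* typically 16 bytes */
```

A tool such as pahole reports the holes in the first layout; comparing `sizeof` of the two structs shows the saving per instance.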
| * | | btrfs: Replace multiple atomic_inc() with atomic_add() (Rashika, 2013-11-12, 1 file, -4/+4)
| | | | This patch replaces multiple atomic_inc() with atomic_add() in
| | | | delayed-inode.c to reduce source code and generate fewer instructions.
| | | |
| | | | Reviewed-by: Zach Brown <zab@redhat.com>
| | | | Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
| | | | Signed-off-by: Josef Bacik <jbacik@fusionio.com>
| | | | Signed-off-by: Chris Mason <chris.mason@fusionio.com>
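The same substitution can be sketched with C11 atomics in userspace. The function names are hypothetical, and `atomic_fetch_add()` from `<stdatomic.h>` stands in for the kernel's `atomic_inc()` and `atomic_add()`, which are a different API.

```c
#include <stdatomic.h>

/* Two separate atomic increments: two read-modify-write operations. */
static void bump_twice(atomic_int *v)
{
	atomic_fetch_add(v, 1);
	atomic_fetch_add(v, 1);
}

/* One atomic add of 2: a single read-modify-write operation. */
static void bump_by_two(atomic_int *v)
{
	atomic_fetch_add(v, 2);
}
```

Both leave the counter two higher; the single add also means other threads can never observe the intermediate value.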
| * | | btrfs: Add helper function for free_root_pointers() (Rashika, 2013-11-12, 1 file, -41/+19)
| | | | The function free_root_pointers() in disk-io.h contains redundant code.
| | | | Therefore, this patch adds a helper function free_root_extent_buffers()
| | | | to free_root_pointers() to eliminate redundancy.
| | | |
| | | | Reviewed-by: Zach Brown <zab@redhat.com>
| | | | Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
| | | | Signed-off-by: Josef Bacik <jbacik@fusionio.com>
| | | | Signed-off-by: Chris Mason <chris.mason@fusionio.com>