summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'powerpc-4.5-2' of ↵Linus Torvalds2016-01-3011-56/+35
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Wire up copy_file_range() syscall from Chandan Rajendra - Simplify module TOC handling from Alan Modra - Remove newly added extra definition of pmd_dirty from Stephen Rothwell - Allow user space to map rtas_rmo_buf from Vasant Hegde - Fix PE location code from Gavin Shan - Remove PPMU_HAS_SSLOT flag for Power8 from Madhavan Srinivasan - Fixup _HPAGE_CHG_MASK from Aneesh Kumar K.V * tag 'powerpc-4.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/mm: Fixup _HPAGE_CHG_MASK powerpc/perf: Remove PPMU_HAS_SSLOT flag for Power8 powerpc/eeh: Fix PE location code powerpc/mm: Allow user space to map rtas_rmo_buf powerpc: Remove newly added extra definition of pmd_dirty powerpc: Simplify module TOC handling powerpc: Wire up copy_file_range() syscall
| * powerpc/mm: Fixup _HPAGE_CHG_MASKAneesh Kumar K.V2016-01-281-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | This was wrongly updated by commit 7aa9a23c69ea ("powerpc, thp: remove infrastructure for handling splitting PMDs") during the last merge window. Fix it up. This could lead to incorrect behaviour in THP and/or mprotect(), at a minimum. Fixes: 7aa9a23c69ea ("powerpc, thp: remove infrastructure for handling splitting PMDs") Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * powerpc/perf: Remove PPMU_HAS_SSLOT flag for Power8Madhavan Srinivasan2016-01-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 7a7868326d77 ("powerpc/perf: Add an explict flag indicating presence of SLOT field") introduced the PPMU_HAS_SSLOT flag to remove the assumption that MMCRA[SLOT] was present when PPMU_ALT_SIPR was not set. That commit's changelog also mentions that Power8 does not support MMCRA[SLOT]. However when the Power8 PMU support was merged, it errnoeously included the PPMU_HAS_SSLOT flag. So remove PPMU_HAS_SSLOT from the Power8 flags. mpe: On systems where MMCRA[SLOT] exists, the field occupies bits 37:39 (IBM numbering). On Power8 bit 37 is reserved, and 38:39 overlap with the high bits of the Threshold Event Counter Mantissa. I am not aware of any published events which use the threshold counting mechanism, which would cause the mantissa bits to be set. So in practice this bug is unlikely to trigger. Fixes: e05b9b9e5c10 ("powerpc/perf: Power8 PMU support") Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * powerpc/eeh: Fix PE location codeGavin Shan2016-01-271-18/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In eeh_pe_loc_get(), the PE location code is retrieved from the "ibm,loc-code" property of the device node for the bridge of the PE's primary bus. It's not correct because the property indicates the parent PE's location code. This reads the correct PE location code from "ibm,io-base-loc-code" or "ibm,slot-location-code" property of PE parent bus's device node. Cc: stable@vger.kernel.org # v3.16+ Fixes: 357b2f3dd9b7 ("powerpc/eeh: Dump PE location code") Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Tested-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * powerpc/mm: Allow user space to map rtas_rmo_bufVasant Hegde2016-01-251-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With commit 90a545e9 (restrict /dev/mem to idle io memory ranges) mapping rtas_rmo_buf from user space is failing. Hence we are not able to make RTAS syscall. This patch calls page_is_rtas_user_buf before calling iomem_is_exclusive in devmem_is_allowed(). This will allow user space to map rtas_rmo_buf and we are able to make RTAS syscall. Reported-by: Bharata B Rao <bharata@linux.vnet.ibm.com> CC: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Acked-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * powerpc: Remove newly added extra definition of pmd_dirtyStephen Rothwell2016-01-211-1/+0
| | | | | | | | | | | | | | | | | | | | | | Commit d5d6a443b243 ("arch/powerpc/include/asm/pgtable-ppc64.h: add pmd_[dirty|mkclean] for THP") added a new identical definition of pmd_dirty(). Remove it again. Cc: Minchan Kim <minchan@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * powerpc: Simplify module TOC handlingAlan Modra2016-01-213-32/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PowerPC64 uses the symbol .TOC. much as other targets use _GLOBAL_OFFSET_TABLE_. It identifies the value of the GOT pointer (or in powerpc parlance, the TOC pointer). Global offset tables are generally local to an executable or shared library, or in the kernel, module. Thus it does not make sense for a module to resolve a relocation against .TOC. to the kernel's .TOC. value. A module has its own .TOC., and indeed the powerpc64 module relocation processing ignores the kernel value of .TOC. and instead calculates a module-local value. This patch removes code involved in exporting the kernel .TOC., tweaks modpost to ignore an undefined .TOC., and the module loader to twiddle the section symbol so that .TOC. isn't seen as undefined. Note that if the kernel was compiled with -msingle-pic-base then ELFv2 would not have function global entry code setting up r2. In that case the module call stubs would need to be modified to set up r2 using the kernel .TOC. value, requiring some of this code to be reinstated. mpe: Furthermore a change in binutils master (not yet released) causes the current way we handle the TOC to no longer work when building with MODVERSIONS=y and RELOCATABLE=n. The symptom is that modules can not be loaded due to there being no version found for TOC. Cc: stable@vger.kernel.org # 3.16+ Signed-off-by: Alan Modra <amodra@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * powerpc: Wire up copy_file_range() syscallChandan Rajendra2016-01-213-1/+3
| | | | | | | | | | | | | | | | | | Test runs on a ppc64 BE guest succeeded using modified fstests. Also tested on ppc64 LE using a home made test - mpe. Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | Merge branch 'for-linus' of ↵Linus Torvalds2016-01-3037-189/+192
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Martin Schwidefsky: "An optimization for irq-restore, the SSM instruction is quite a bit slower than an if-statement and a STOSM. The copy_file_range system all is added. Cleanup for PCI and CIO. And a couple of bug fixes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/cio: update measurement characteristics s390/cio: ensure consistent measurement state s390/cio: fix measurement characteristics memleak s390/zcrypt: Fix cryptographic device id in kernel messages s390/pci: remove iomap sanity checks s390/pci: set error state for unusable functions s390/pci: fix bar check s390/pci: resize iomap s390/pci: improve ZPCI_* macros s390/pci: provide ZPCI_ADDR macro s390/pci: adjust IOMAP_MAX_ENTRIES s390/numa: move numa_init_late() from device to arch_initcall s390: remove all usages of PSW_ADDR_INSN s390: remove all usages of PSW_ADDR_AMODE s390: wire up copy_file_range syscall s390: remove superfluous memblock_alloc() return value checks s390/numa: allocate memory with correct alignment s390/irqflags: optimize irq restore s390/mm: use TASK_MAX_SIZE where applicable
| * | s390/cio: update measurement characteristicsSebastian Ott2016-01-262-9/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Per channel path measurement characteristics are obtained during channel path registration. However if some properties of a channel path change we don't update the measurement characteristics. Make sure to update the characteristics when we change the properties of a channel path or receive a notification from FW about such a change. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | s390/cio: ensure consistent measurement stateSebastian Ott2016-01-262-12/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make sure that in all cases where we could not obtain measurement characteristics the associated fields are set to invalid values. Note: without this change the "shared" capability of a channel path for which we could not obtain the measurement characteristics was incorrectly displayed as 0 (not shared). We will now correctly report "unknown" in this case. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | s390/cio: fix measurement characteristics memleakSebastian Ott2016-01-263-18/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Measurement characteristics are allocated during channel path registration but not freed during deregistration. Fix this by embedding these characteristics inside struct channel_path. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | s390/zcrypt: Fix cryptographic device id in kernel messagesIngo Tuchscherer2016-01-263-20/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, on card response failures a combination of card domain and domain id is recorded in the kernel messages. According to the message description only the card id will be recorded. The domain id is not relevant, since the whole card including all domains is set offline. Signed-off-by: Ingo Tuchscherer <ingo.tuchscherer@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | s390/pci: remove iomap sanity checksSebastian Ott2016-01-261-9/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since each iomap_entry handles only one bar of one pci function (even when disjunct ranges of a bar are mapped) the sanity check in pci_iomap_range is not needed and can be removed. Also convert the remaining BUG_ONs to WARN_ONs. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | s390/pci: set error state for unusable functionsSebastian Ott2016-01-261-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We receive special notifications from firmware when an error was detected and a pci function became unusable. Set the error_state accordingly to give device drivers a hint that they don't need to try error recovery. Suggested-by: Alexander Schmidt <alexschm@de.ibm.com> Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | s390/pci: fix bar checkSebastian Ott2016-01-261-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | Fix the check which bar space we should map to allow available bars only. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | s390/pci: resize iomapSebastian Ott2016-01-261-12/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On s390 we need to maintain a mapping between iomem addresses and arch specific function identifiers. Currently the mapping table is created as such that we could span the whole iomem address space. Since we can only map each bar space from each possible function we have an upper bound for the number of mapping entries. This reduces the size of the iomap from 256K to less than 4K (using the defconfig). Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | s390/pci: improve ZPCI_* macrosSebastian Ott2016-01-262-7/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | Most of the constants defined in pci_io.h depend on each other and thus can be calculated. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | s390/pci: provide ZPCI_ADDR macroSebastian Ott2016-01-262-7/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | Provide and use a ZPCI_ADDR macro as the complement of ZPCI_IDX to get rid of some constants in the code. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | s390/pci: adjust IOMAP_MAX_ENTRIESSebastian Ott2016-01-261-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | ZPCI_IOMAP_MAX_ENTRIES is off by one. Let's adjust this for the sake of correctness. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | s390/numa: move numa_init_late() from device to arch_initcallMichael Holzheu2016-01-261-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 3e89e1c5ea ("hugetlb: make mm and fs code explicitly non-modular") moves hugetlb_init() from module_init to subsys_initcall. The hugetlb_init()->hugetlb_register_node() code accesses "node->dev.kobj" which is initialized in numa_init_late(). Since numa_init_late() is a device_initcall which is called *after* subsys_initcall the above mentioned patch breaks NUMA on s390. So fix this and move numa_init_late() to arch_initcall. Fixes: 3e89e1c5ea ("hugetlb: make mm and fs code explicitly non-modular") Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | s390: remove all usages of PSW_ADDR_INSNHeiko Carstens2016-01-1914-49/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Yet another leftover from the 31 bit era. The usual operation "y = x & PSW_ADDR_INSN" with the PSW_ADDR_INSN mask is a nop for CONFIG_64BIT. Therefore remove all usages and hope the code is a bit less confusing. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
| * | s390: remove all usages of PSW_ADDR_AMODEHeiko Carstens2016-01-1910-33/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | This is a leftover from the 31 bit area. For CONFIG_64BIT the usual operation "y = x | PSW_ADDR_AMODE" is a nop. Therefore remove all usages of PSW_ADDR_AMODE and make the code a bit less confusing. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
| * | s390: wire up copy_file_range syscallHeiko Carstens2016-01-193-1/+4
| | | | | | | | | | | | | | | Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | s390: remove superfluous memblock_alloc() return value checksHeiko Carstens2016-01-193-6/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | memblock_alloc() and memblock_alloc_base() will panic on their own if they can't find free memory. Therefore remove some pointless checks. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Acked-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | s390/numa: allocate memory with correct alignmentHeiko Carstens2016-01-191-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Allocating memory with a requested minimum alignment of 1 is wrong since pg_data_t contains a spinlock which requires an alignment of 4 bytes. Therefore fix this and ask for an alignment of 8 bytes like it is guarenteed for all kmalloc requests. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Acked-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | s390/irqflags: optimize irq restoreChristian Borntraeger2016-01-192-3/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The ssm instruction takes longer that stnsm/stosm as it is often used to modify DAT and PER. We know that irqsave/irqrestore only deals with external and I/O interrupts and we know that irqrestore can transition only from disabled->disabled or disabled->enabled, so we can use the faster stosm. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * | s390/mm: use TASK_MAX_SIZE where applicableDominik Dingel2016-01-192-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | To improve readability we can use TASK_MAX_SIZE when we just check for the upper limit. All places explicitly dealing with 3 vs 4 level pgtables were left unchanged. Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com> Reviewed-By: Sascha Silbe <silbe@linux.vnet.ibm.com>
* | | Merge branch 'for-linus-4.5' of ↵Linus Torvalds2016-01-3011-32/+113
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs fixes from Chris Mason: "Dave had a small collection of fixes to the new free space tree code, one of which was keeping our sysfs files more up to date with feature bits as different things get enabled (lzo, raid5/6, etc). I should have kept the sysfs stuff for rc3, since we always manage to trip over something. This time it was GFP_KERNEL from somewhere that is NOFS only. Instead of rebasing it out I've put a revert in, and we'll fix it properly for rc3. Otherwise, Filipe fixed a btrfs DIO race and Qu Wenruo fixed up a use-after-free in our tracepoints that Dave Jones reported" * 'for-linus-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: Revert "btrfs: synchronize incompat feature bits with sysfs files" btrfs: don't use GFP_HIGHMEM for free-space-tree bitmap kzalloc btrfs: sysfs: check initialization state before updating features Revert "btrfs: clear PF_NOFREEZE in cleaner_kthread()" btrfs: async-thread: Fix a use-after-free error for trace Btrfs: fix race between fsync and lockless direct IO writes btrfs: add free space tree to the cow-only list btrfs: add free space tree to lockdep classes btrfs: tweak free space tree bitmap allocation btrfs: tests: switch to GFP_KERNEL btrfs: synchronize incompat feature bits with sysfs files btrfs: sysfs: introduce helper for syncing bits with sysfs files btrfs: sysfs: add free-space-tree bit attribute btrfs: sysfs: fix typo in compat_ro attribute definition
| * | | Revert "btrfs: synchronize incompat feature bits with sysfs files"Chris Mason2016-01-294-17/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit 14e46e04958df740c6c6a94849f176159a333f13. This ends up doing sysfs operations from deep in balance (where we should be GFP_NOFS) and under heavy balance load, we're making races against sysfs internals. Revert it for now while we figure things out. Signed-off-by: Chris Mason <clm@fb.com>
| * | | btrfs: don't use GFP_HIGHMEM for free-space-tree bitmap kzallocChris Mason2016-01-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | This was copied incorrectly from the __vmalloc call. Signed-off-by: Chris Mason <clm@fb.com>
| * | | Merge branch 'dev/fst-followup' of ↵Chris Mason2016-01-276-18/+34
| |\ \ \ | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus-4.5
| | * | | btrfs: add free space tree to the cow-only listDavid Sterba2016-01-251-1/+2
| | | | | | | | | | | | | | | | | | | | Signed-off-by: David Sterba <dsterba@suse.com>
| | * | | btrfs: add free space tree to lockdep classesDavid Sterba2016-01-251-0/+1
| | | | | | | | | | | | | | | | | | | | Signed-off-by: David Sterba <dsterba@suse.com>
| | * | | btrfs: tweak free space tree bitmap allocationDavid Sterba2016-01-221-2/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The requested bitmap size varies, observed numbers were < 4K up to 16K. Using vmalloc unconditionally would be too heavy, we'll try contiguous allocations first and fall back to vmalloc if there's no contig memory. Signed-off-by: David Sterba <dsterba@suse.com>
| | * | | btrfs: tests: switch to GFP_KERNELDavid Sterba2016-01-223-15/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There's no reason to do GFP_NOFS in tests, it's not data-heavy and memory allocation failures would affect only developers or testers. Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | btrfs: sysfs: check initialization state before updating featuresDavid Sterba2016-01-271-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the mount phase is not finished, we can't update the sysfs files. Reported-by: Chris Mason <clm@fb.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
| * | | | Revert "btrfs: clear PF_NOFREEZE in cleaner_kthread()"David Sterba2016-01-261-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit 696249132158014d594896df3a81390616069c5c. The cleaner thread can block freezing when there's a snapshot cleaning in progress and the other threads get suspended first. From the logs provided by Martin we're waiting for reading extent pages: kernel: PM: Syncing filesystems ... done. kernel: Freezing user space processes ... (elapsed 0.015 seconds) done. kernel: Freezing remaining freezable tasks ... kernel: Freezing of tasks failed after 20.003 seconds (1 tasks refusing to freeze, wq_busy=0): kernel: btrfs-cleaner D ffff88033dd13bc0 0 152 2 0x00000000 kernel: ffff88032ebc2e00 ffff88032e750000 ffff88032e74fa50 7fffffffffffffff kernel: ffffffff814a58df 0000000000000002 ffffea000934d580 ffffffff814a5451 kernel: 7fffffffffffffff ffffffff814a6e8f 0000000000000000 0000000000000020 kernel: Call Trace: kernel: [<ffffffff814a58df>] ? bit_wait+0x2c/0x2c kernel: [<ffffffff814a5451>] ? schedule+0x6f/0x7c kernel: [<ffffffff814a6e8f>] ? schedule_timeout+0x2f/0xd8 kernel: [<ffffffff81076f94>] ? timekeeping_get_ns+0xa/0x2e kernel: [<ffffffff81077603>] ? ktime_get+0x36/0x44 kernel: [<ffffffff814a4f6c>] ? io_schedule_timeout+0x94/0xf2 kernel: [<ffffffff814a4f6c>] ? io_schedule_timeout+0x94/0xf2 kernel: [<ffffffff814a590b>] ? bit_wait_io+0x2c/0x30 kernel: [<ffffffff814a5694>] ? __wait_on_bit+0x41/0x73 kernel: [<ffffffff8109eba8>] ? wait_on_page_bit+0x6d/0x72 kernel: [<ffffffff8105d718>] ? autoremove_wake_function+0x2a/0x2a kernel: [<ffffffff811a02d7>] ? read_extent_buffer_pages+0x1bd/0x203 kernel: [<ffffffff8117d9e9>] ? free_root_pointers+0x4c/0x4c kernel: [<ffffffff8117e831>] ? btree_read_extent_buffer_pages.constprop.57+0x5a/0xe9 kernel: [<ffffffff8117f4f3>] ? read_tree_block+0x2d/0x45 kernel: [<ffffffff8116782a>] ? read_block_for_search.isra.34+0x22a/0x26b kernel: [<ffffffff811656c3>] ? btrfs_set_path_blocking+0x1e/0x4a kernel: [<ffffffff8116919b>] ? btrfs_search_slot+0x648/0x736 kernel: [<ffffffff81170559>] ? btrfs_lookup_extent_info+0xb7/0x2c7 kernel: [<ffffffff81170ee5>] ? walk_down_proc+0x9c/0x1ae kernel: [<ffffffff81171c9d>] ? walk_down_tree+0x40/0xa4 kernel: [<ffffffff8117375f>] ? btrfs_drop_snapshot+0x2da/0x664 kernel: [<ffffffff8104ff21>] ? finish_task_switch+0x126/0x167 kernel: [<ffffffff811850f8>] ? btrfs_clean_one_deleted_snapshot+0xa6/0xb0 kernel: [<ffffffff8117eaba>] ? cleaner_kthread+0x13e/0x17b kernel: [<ffffffff8117e97c>] ? btrfs_item_end+0x33/0x33 kernel: [<ffffffff8104d256>] ? kthread+0x95/0x9d kernel: [<ffffffff8104d1c1>] ? kthread_parkme+0x16/0x16 kernel: [<ffffffff814a7b5f>] ? ret_from_fork+0x3f/0x70 kernel: [<ffffffff8104d1c1>] ? kthread_parkme+0x16/0x16 As this affects a released kernel (4.4) we need a minimal fix for stable kernels. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=108361 Reported-by: Martin Ziegler <ziegler@uni-freiburg.de> CC: stable@vger.kernel.org # 4.4 CC: Jiri Kosina <jkosina@suse.cz> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
| * | | | btrfs: async-thread: Fix a use-after-free error for traceQu Wenruo2016-01-261-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Parameter of trace_btrfs_work_queued() can be freed in its workqueue. So no one use use that pointer after queue_work(). Fix the user-after-free bug by move the trace line before queue_work(). Reported-by: Dave Jones <davej@codemonkey.org.uk> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
| * | | | Btrfs: fix race between fsync and lockless direct IO writesFilipe Manana2016-01-262-11/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | An fsync, using the fast path, can race with a concurrent lockless direct IO write and end up logging a file extent item that points to an extent that wasn't written to yet. This is because the fast fsync path collects ordered extents into a local list and then collects all the new extent maps to log file extent items based on them, while the direct IO write path creates the new extent map before it creates the corresponding ordered extent (and submitting the respective bio(s)). So fix this by making the direct IO write path create ordered extents before the extent maps and make the fast fsync path collect any new ordered extents after it collects the extent maps. Note that making the fsync handler call inode_dio_wait() (after acquiring the inode's i_mutex) would not work and lead to a deadlock when doing AIO, as through AIO we end up in a path where the fsync handler is called (through dio_aio_complete_work() -> dio_complete() -> vfs_fsync_range()) before the inode's dio counter is decremented (inode_dio_wait() waits for this counter to have a value of zero). Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
| * | | | Merge branch 'fix/fst-sysfs' of ↵Chris Mason2016-01-266-1/+53
| |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus-4.5 Signed-off-by: Chris Mason <clm@fb.com>
| | * | | | btrfs: synchronize incompat feature bits with sysfs filesDavid Sterba2016-01-214-0/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The files under /sys/fs/UUID/features get out of sync with the actual incompat bits set for the filesystem if they change after mount (eg. the LZO compression). Synchronize the feature bits with the sysfs files representing them right after we set/clear them. Signed-off-by: David Sterba <dsterba@suse.com>
| | * | | | btrfs: sysfs: introduce helper for syncing bits with sysfs filesDavid Sterba2016-01-212-0/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The files under /sys/fs/UUID/features get out of sync with the actual incompat bits set for the filesystem if they change after mount. We're going to sync them and need a helper to do that. Signed-off-by: David Sterba <dsterba@suse.com>
| | * | | | btrfs: sysfs: add free-space-tree bit attributeDavid Sterba2016-01-211-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The incompat bit representing the newly added free space tree feature is missing. Right now it will be listed only among features supported by the module, not per-fs. Signed-off-by: David Sterba <dsterba@suse.com>
| | * | | | btrfs: sysfs: fix typo in compat_ro attribute definitionDavid Sterba2016-01-201-1/+1
| | |/ / / | | | | | | | | | | | | | | | Signed-off-by: David Sterba <dsterba@suse.com>
* | | | | Merge tag 'pm+acpi-4.5-rc2' of ↵Linus Torvalds2016-01-3011-45/+45
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management and ACPI fixes from Rafael Wysocki: "These are: cpuidle fixes (including one fix for a recent regression), cpufreq fixes (including fixes for two issues introduced during the 4.2 cycle), generic power domains framework fixes (two locking fixes and one cleanup), one locking fix in the ACPI-based PCI hotplug framework (ACPIPHP), removal of one ACPI backlight blacklist entry that isn't necessary any more and a PM Kconfig cleanup. Specifics: - Fix a recent cpuidle core regression that broke suspend-to-idle on all systems where cpuidle drivers don't provide ->enter_freeze callbacks for any states (Sudeep Holla). - Drop an unnecessary symbol definition from the cpuidle core code handling coupled CPU cores (Anders Roxell). - Fix a race condition related to governor initialization and removal in the cpufreq core (Viresh Kumar). - Clean up the cpufreq core to use list_is_last() for checking if the given policy object is the last element of a list instead of open coding that in a clumsy way (Gautham R Shenoy). - Fix compiler warnings in the pxa2xx and cpufreq-dt cpufreq drivers (Arnd Bergmann). - Fix two locking issues and clean up a comment in the generic power domains framework (Ulf Hansson, Marek Szyprowski, Moritz Fischer). - Fix the error code path of one function in the ACPI-based PCI hotplug framework (ACPIPHP) that forgets to release a lock acquired previously (Insu Yun). - Drop the ACPI backlight blacklist entry for Dell Inspiron 5737 that is not necessary any more (Hans de Goede). - Clean up the top-level PM Kconfig to stop requiring APM emulation to depend on PM which in fact isn't necessary (Arnd Bergmann)" * tag 'pm+acpi-4.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq: cpufreq-dt: avoid uninitialized variable warnings: cpufreq: pxa2xx: fix pxa_cpufreq_change_voltage prototype PM: APM_EMULATION does not depend on PM cpufreq: Use list_is_last() to check last entry of the policy list cpufreq: Fix NULL reference crash while accessing policy->governor_data cpuidle: coupled: remove unused define cpuidle_coupled_lock PM / Domains: Fix typo in comment PM / Domains: Fix potential deadlock while adding/removing subdomains ACPI / PCI / hotplug: unlock in error path in acpiphp_enable_slot() ACPI: Revert "ACPI / video: Add Dell Inspiron 5737 to the blacklist" cpuidle: fix fallback mechanism for suspend to idle in absence of enter_freeze PM / domains: fix lockdep issue for all subdomains
| | \ \ \ \
| | \ \ \ \
| | \ \ \ \
| | \ \ \ \
| | \ \ \ \
| | \ \ \ \
| *-----. \ \ \ \ Merge branches 'pm-cpuidle', 'pm-cpufreq', 'pm-domains' and 'pm-sleep'Rafael J. Wysocki2016-01-299-36/+42
| |\ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * pm-cpuidle: cpuidle: coupled: remove unused define cpuidle_coupled_lock cpuidle: fix fallback mechanism for suspend to idle in absence of enter_freeze * pm-cpufreq: cpufreq: cpufreq-dt: avoid uninitialized variable warnings: cpufreq: pxa2xx: fix pxa_cpufreq_change_voltage prototype cpufreq: Use list_is_last() to check last entry of the policy list cpufreq: Fix NULL reference crash while accessing policy->governor_data * pm-domains: PM / Domains: Fix typo in comment PM / Domains: Fix potential deadlock while adding/removing subdomains PM / domains: fix lockdep issue for all subdomains * pm-sleep: PM: APM_EMULATION does not depend on PM
| | | | | * | | | | PM: APM_EMULATION does not depend on PMArnd Bergmann2016-01-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The APM emulation code does multiple things, and some of them depend on PM_SLEEP, while the battery management does not. However, selecting the symbol like SHARPSL_PM does causes a Kconfig warning: warning: (SHARPSL_PM && PMAC_APM_EMU) selects APM_EMULATION which has unmet direct dependencies (PM && SYS_SUPPORTS_APM_EMULATION) From all I can tell, this is completely harmless, and we can simply allow APM_EMULATION to be enabled here, even if PM is not. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
| | | | * | | | | | PM / Domains: Fix typo in commentMoritz Fischer2016-01-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Acked-by: Kevin Hilman <khilman@baylibre.com> Signed-off-by: Moritz Fischer <moritz.fischer@ettus.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
| | | | * | | | | | PM / Domains: Fix potential deadlock while adding/removing subdomainsUlf Hansson2016-01-271-8/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We must preserve the same order of how we acquire and release the lock for genpd, as otherwise we may encounter deadlocks. The power on phase of a genpd starts by acquiring its lock. Then it walks the hierarchy of its parent domains to be able to power on these first, as per design of genpd. From a locking perspective this means the locks of the parents becomes acquired after the lock of the subdomain. Let's fix pm_genpd_add|remove_subdomain() to maintain the same order of acquiring/releasing the genpd lock as being applied in the power on/off sequence. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>