summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* ARM: Update mach-typesRussell King2011-05-141-64/+78
| | | | Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: 6883/1: ptrace: Migrate to regsets frameworkDave Martin2011-05-143-109/+246
| | | | | | | | | | | | | | | | | | This patch migrates the implementation of the ptrace interface for the core integer registers, legacy FPA registers and VFP registers to use the regsets framework. As an added bonus, all this stuff gets included in coredumps at no extra cost. Without this patch, coredumps contained no VFP state. Third-party extension register sets (iwmmx, crunch) are not migrated by this patch, and continue to use the old implementation; these should be migratable without much extra work. Signed-off-by: Dave Martin <dave.martin@linaro.org> Acked-by: Will Deacon <Will.Deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: 6882/1: ELF: Define new core note type for VFP registersDave Martin2011-05-141-0/+1
| | | | | | | | | | The VFP registers are not currently included in coredumps, and there's no existing note type where they can sensibly be included, so this patch defines a dedicated note type for them. Signed-off-by: Dave Martin <dave.martin@linaro.org> Acked-by: Will Deacon <Will.Deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: 6893/1: Allow for kernel command line concatenationVictor Boivie2011-05-122-7/+27
| | | | | | | | | | | | | | | This patch allows the provided CONFIG_CMDLINE to be concatenated with the one provided by the boot loader. This is useful to merge the static values defined in CONFIG_CMDLINE with the boot loader's (possibly) more dynamic values, such as startup reasons and more. Signed-off-by: Victor Boivie <victor.boivie@sonyericsson.com> Reviewed-by: Bjorn Andersson <bjorn.andersson@sonyericsson.com> Signed-off-by: Oskar Andero <oskar.andero@sonyericsson.com> Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org> Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: 6889/1: futex: add SMP futex support when !CPU_USE_DOMAINSWill Deacon2011-05-121-47/+90
| | | | | | | | | | | | | This patch uses the load/store exclusive instructions to add SMP futex support for ARM. Since the ARM architecture does not provide instructions for unprivileged exclusive memory accesses, we can only provide SMP futexes when CPU domain support is disabled. Cc: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: 6859/1: Add writethrough dcache support for ARM926EJS processorMark A. Greer2011-05-121-0/+16
| | | | | | | | | | | The ARM kernel supports writethrough data cache via the CONFIG_CPU_DCACHE_WRITETHROUGH option. However, that functionality wasn't implemented in the arch/arm/boot/compressed code. It is now necessary due to a new ARM926EJS processor that has an issue with writeback data cache. Signed-off-by: Mark A. Greer <mgreer@mvista.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: phys-to-virt: improve Kconfig help textsRussell King2011-05-121-4/+10
| | | | | | Improve the Kconfig help texts for the phys-to-virt patching feature. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: Highmem: drop experimental statusRussell King2011-05-121-2/+2
| | | | | | | Highmem on ARM has been around for a while now, without any major issues being raised. So, drop the experimental status of this feature. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: SMP: drop experimental statusRussell King2011-05-121-2/+1
| | | | | | | SMP on ARM has been around for a while now, without any major issues being raised. So, drop the experimental status of this feature. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* Merge branch 'for-linus' of ↵Linus Torvalds2011-05-125-11/+12
|\ | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: ceph: do not use i_wrbuffer_ref as refcount for Fb cap ceph: fix list_add in ceph_put_snap_realm ceph: print debug message before put mds session
| * ceph: do not use i_wrbuffer_ref as refcount for Fb capHenry C Chang2011-05-113-9/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | We increments i_wrbuffer_ref when taking the Fb cap. This breaks the dirty page accounting and causes looping in __ceph_do_pending_vmtruncate, and ceph client hangs. This bug can be reproduced occasionally by running blogbench. Add a new field i_wb_ref to inode and dedicate it to Fb reference counting. Signed-off-by: Henry C Chang <henry.cy.chang@gmail.com> Signed-off-by: Sage Weil <sage@newdream.net>
| * ceph: fix list_add in ceph_put_snap_realmHenry C Chang2011-05-111-1/+1
| | | | | | | | | | Signed-off-by: Henry C Chang <henry.cy.chang@gmail.com> Signed-off-by: Sage Weil <sage@newdream.net>
| * ceph: print debug message before put mds sessionHenry C Chang2011-05-111-1/+1
| | | | | | | | | | | | | | | | The mds session, s, could be freed during ceph_put_mds_session. Move dout before ceph_put_mds_session. Signed-off-by: Henry C Chang <henry.cy.chang@gmail.com> Signed-off-by: Sage Weil <sage@newdream.net>
* | Merge branch 'drm-fixes' of ↵Linus Torvalds2011-05-123-13/+14
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 * 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: drm/radeon/nouveau: fix build regression on alpha due to Xen changes. drm/radeon/kms: fix cayman acceleration drm/radeon: fix cayman struct accessors.
| * | drm/radeon/nouveau: fix build regression on alpha due to Xen changes.Dave Airlie2011-05-112-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The Xen changes were using DMA_ERROR_CODE which isn't defined on a few platforms, however we reverted the Xen patch that caused use to try and use this code path earlier in 2.6.39 cycle, so for now lets just force the code to never take this path and allow it to build again on alpha. The proper long term answer is probably to store if the dma_addr has been assigned to alongside the dma_addr in the higher level code, though I think Thomas wanted to rewrite most of this anyways properly. Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
| * | drm/radeon/kms: fix cayman accelerationAlex Deucher2011-05-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | The TCC disable setup was incorrect. This prevents the GPU from hanging when draw commands are issued. Signed-off-by: Alex Deucher <alexdeucher@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
| * | drm/radeon: fix cayman struct accessors.Dave Airlie2011-05-111-8/+8
| | | | | | | | | | | | | | | | | | | | | We are accessing totally the wrong struct in this case, and putting uninitialised values into the GPU, which it doesn't like unsurprisingly. Signed-off-by: Dave Airlie <airlied@redhat.com>
* | | Merge branch 'for-linus' of ↵Linus Torvalds2011-05-123-6/+7
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6: mfd: Fix for the TWL4030 PM sleep/wakeup sequence mfd: Fix asic3 build error mfd: Fixed gpio polarity of omap-usb gpio USB-phy reset
| * | | mfd: Fix for the TWL4030 PM sleep/wakeup sequenceLesly A M2011-05-111-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Only configure sleep script when the flag is TWL4030_SLEEP_SCRIPT. Adding the missing brackets for fixing the issue. Signed-off-by: Lesly A M <leslyam@ti.com> Cc: Nishanth Menon <nm@ti.com> Cc: David Derrick <dderrick@ti.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
| * | | mfd: Fix asic3 build errorAxel Lin2011-05-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix below compile error: CC drivers/mfd/asic3.o drivers/mfd/asic3.c: In function 'asic3_irq_demux': drivers/mfd/asic3.c:147: error: 'irq_data' undeclared (first use in this function) drivers/mfd/asic3.c:147: error: (Each undeclared identifier is reported only once drivers/mfd/asic3.c:147: error: for each function it appears in.) Signed-off-by: Axel Lin <axel.lin@gmail.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
| * | | mfd: Fixed gpio polarity of omap-usb gpio USB-phy resetJuergen Kilb2011-05-111-4/+4
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With commit 19403165 a main part of ehci-omap.c moved to drivers/mfd/omap-usb-host.c created by commit 17cdd29d. Due to this reorganisation the polarity used to reset the external USB phy changed and USB host doesn't recognize any devices. Signed-off-by: Juergen Kilb <J.Kilb@phytec.de> Acked-by: Felipe Balbi <balbi@ti.com> Tested-by: Steve Sakoman <steve@sakoman.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
* | | Merge branch 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6Linus Torvalds2011-05-1212-46/+40
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6: [S390] fix alloc_pgste check in init_new_context [S390] oprofile: fix min/max interval query checks [S390] replace diag10() with diag10_range() function [S390] disassembler: handle b280/spp instruction [S390] kernel: Initialize register 14 when starting new CPU [S390] dasd: prevent IO error during reserve/release loop [S390] sclp/memory hotplug: fix initial usecount of increments
| * | | [S390] fix alloc_pgste check in init_new_contextMartin Schwidefsky2011-05-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Processes started with kernel_execve from a kernel thread will have current->mm==NULL. Reading current->mm->context.alloc_pgste will read a more or less random bit from lowcore in this case. If the bit turns out to be set the whole process tree started this way will allocate page table extensions although they have no need for it. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | | [S390] oprofile: fix min/max interval query checksMartin Schwidefsky2011-05-103-18/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | oprofile_min_interval and oprofile_max_interval are unsigned, checking for negative values doesn't work. Change hwsampler_query_min_interval and hwsampler_query_max_interval to return an unsigned long and check for a zero value instead. Reported-by: Nicolas Kaiser <nikai@nikai.net> Acked-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | | [S390] replace diag10() with diag10_range() functionMichael Holzheu2011-05-103-24/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently the diag10() function can only release one page. For exploiters that have to call diag10 on a contiguous memory region this is suboptimal. This patch replaces the diag10() function with diag10_range() that is able to release multiple pages. In addition to that the new function now allows to release memory with addresses higher than 2047 MiB. This was due to a restriction of the diagnose implementation under z/VM prior to release 5.2. Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | | [S390] disassembler: handle b280/spp instructionChristian Borntraeger2011-05-101-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | arch/s390/kvm/sie64a.S uses the b280 instruction. Tell the builtin disassembler to handle that code. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | | [S390] kernel: Initialize register 14 when starting new CPUMichael Holzheu2011-05-102-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When starting a new CPU we currently jump to start_secondary() without setting register 14 (the return address) correctly. Therefore on the stack frame for start_secondary an invalid return address is stored. This leads to wrong stack back traces in kernel dumps. Example: #00 [1f33fe48] cpu_idle at 10614a #01 [1f33fe90] start_secondary at 54fa88 #02 [1f33feb8] (null) at 0 <--- invalid To fix this start_secondary() is called now with basr/brasl that sets register 14 correctly. The output of the stack backtrace looks then like the following: #00 [1f33fe48] cpu_idle at 10614a #01 [1f33fe90] start_secondary at 54fa88 #02 [1f33feb8] restart_base at 54f41e <--- correct Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | | [S390] dasd: prevent IO error during reserve/release loopStefan Haberland2011-05-101-1/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The termination of running CQR caused by reserve/release operations may lead to an IO error if reserve/release is done in a tight loop. Prevent this by increasing the retry counter after termination. Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | | [S390] sclp/memory hotplug: fix initial usecount of incrementsHeiko Carstens2011-05-101-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix initial usecount of attached and assigned storage increments so they can be set offline. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* | | | Revert "Bluetooth: fix shutdown on SCO sockets"Linus Torvalds2011-05-121-9/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit f21ca5fff6e548833fa5ee8867239a8378623150. Quoth Gustavo F. Padovan: "Commit f21ca5fff6e548833fa5ee8867239a8378623150 can cause a NULL dereference if we call shutdown in a bluetooth SCO socket and doesn't wait the shutdown completion to call close(). Please revert it. I may have a fix for it soon, but we don't have time anymore, so revert is the way to go. ;)" Requested-by: Gustavo F. Padovan <padovan@profusion.mobi> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | Merge branch 'pm-fixes' of ↵Linus Torvalds2011-05-122-3/+6
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6 * 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6: PM / Hibernate: Fix ioctl SNAPSHOT_S2RAM PM / Hibernate: Make snapshot_release() restore GFP mask PM: Fix warning in pm_restrict_gfp_mask() during SNAPSHOT_S2RAM ioctl
| * | | | PM / Hibernate: Fix ioctl SNAPSHOT_S2RAMRafael J. Wysocki2011-05-111-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The SNAPSHOT_S2RAM ioctl used for implementing the feature allowing one to suspend to RAM after creating a hibernation image is currently broken, because it doesn't clear the "ready" flag in the struct snapshot_data object handled by it. As a result, the SNAPSHOT_UNFREEZE doesn't work correctly after SNAPSHOT_S2RAM has returned and the user space hibernate task cannot thaw the other processes as appropriate. Make SNAPSHOT_S2RAM clear data->ready to fix this problem. Tested-by: Alexandre Felipe Muller de Souza <alexandrefm@mandriva.com.br> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: stable@kernel.org
| * | | | PM / Hibernate: Make snapshot_release() restore GFP maskRafael J. Wysocki2011-05-111-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the process using the hibernate user space interface closes /dev/snapshot after creating a hibernation image without thawing tasks, snapshot_release() should call pm_restore_gfp_mask() to restore the GFP mask used before the creation of the image. Make that happen. Tested-by: Alexandre Felipe Muller de Souza <alexandrefm@mandriva.com.br> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: stable@kernel.org
| * | | | PM: Fix warning in pm_restrict_gfp_mask() during SNAPSHOT_S2RAM ioctlRafael J. Wysocki2011-05-111-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A warning is printed by pm_restrict_gfp_mask() while the SNAPSHOT_S2RAM ioctl is being executed after creating a hibernation image, because pm_restrict_gfp_mask() has been called once already before the image creation and suspend_devices_and_enter() calls it once again. This happens after commit 452aa6999e6703ffbddd7f6ea124d3 (mm/pm: force GFP_NOIO during suspend/hibernation and resume). To avoid this issue, move pm_restrict_gfp_mask() and pm_restore_gfp_mask() from suspend_devices_and_enter() to its caller in kernel/power/suspend.c. Reported-by: Alexandre Felipe Muller de Souza <alexandrefm@mandriva.com.br> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: stable@kernel.org
* | | | | mm: tracing: add missing GFP flags to tracingMel Gorman2011-05-121-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | include/linux/gfp.h and include/trace/events/gfpflags.h are out of sync. When tracing is enabled, certain flags are not recognised and the text output is less useful as a result. Add the missing flags. Signed-off-by: Mel Gorman <mgorman@suse.de> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | tmpfs: fix spurious ENOSPC when racing with unswapHugh Dickins2011-05-121-10/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Testing the shmem_swaplist replacements for igrab() revealed another bug: writes to /dev/loop0 on a tmpfs file which fills its filesystem were sometimes failing with "Buffer I/O error"s. These came from ENOSPC failures of shmem_getpage(), when racing with swapoff: the same could happen when racing with another shmem_getpage(), pulling the page in from swap in between our find_lock_page() and our taking the info->lock (though not in the single-threaded loop case). This is unacceptable, and surprising that I've not noticed it before: it dates back many years, but (presumably) was made a lot easier to reproduce in 2.6.36, which sited a page preallocation in the race window. Fix it by rechecking the page cache before settling on an ENOSPC error. Signed-off-by: Hugh Dickins <hughd@google.com> Cc: Konstantin Khlebnikov <khlebnikov@openvz.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | tmpfs: fix race between umount and swapoffHugh Dickins2011-05-121-45/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The use of igrab() in swapoff's shmem_unuse_inode() is just as vulnerable to umount as that in shmem_writepage(). Fix this instance by extending the protection of shmem_swaplist_mutex right across shmem_unuse_inode(): while it's on the list, the inode cannot be evicted (and the filesystem cannot be unmounted) without shmem_evict_inode() taking that mutex to remove it from the list. But since shmem_writepage() might take that mutex, we should avoid making memory allocations or memcg charges while holding it: prepare them at the outer level in shmem_unuse(). When mem_cgroup_cache_charge() was originally placed, we didn't know until that point that the page from swap was actually a shmem page; but nowadays it's noted in the swap_map, so we're safe to charge upfront. For the radix_tree, do as is done in shmem_getpage(): preload upfront, but don't pin to the cpu; so we make a habit of refreshing the node pool, but might dip into GFP_NOWAIT reserves on occasion if subsequently preempted. With the allocation and charge moved out from shmem_unuse_inode(), we can also hold index map and info->lock over from finding the entry. Signed-off-by: Hugh Dickins <hughd@google.com> Cc: Konstantin Khlebnikov <khlebnikov@openvz.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | tmpfs: fix race between umount and writepageHugh Dickins2011-05-121-11/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Konstanin Khlebnikov reports that a dangerous race between umount and shmem_writepage can be reproduced by this script: for i in {1..300} ; do mkdir $i while true ; do mount -t tmpfs none $i dd if=/dev/zero of=$i/test bs=1M count=$(($RANDOM % 100)) umount $i done & done on a 6xCPU node with 8Gb RAM: kernel very unstable after this accident. =) Kernel log: VFS: Busy inodes after unmount of tmpfs. Self-destruct in 5 seconds. Have a nice day... WARNING: at lib/list_debug.c:53 __list_del_entry+0x8d/0x98() list_del corruption. prev->next should be ffff880222fdaac8, but was (null) Pid: 11222, comm: mount.tmpfs Not tainted 2.6.39-rc2+ #4 Call Trace: warn_slowpath_common+0x80/0x98 warn_slowpath_fmt+0x41/0x43 __list_del_entry+0x8d/0x98 evict+0x50/0x113 iput+0x138/0x141 ... BUG: unable to handle kernel paging request at ffffffffffffffff IP: shmem_free_blocks+0x18/0x4c Pid: 10422, comm: dd Tainted: G W 2.6.39-rc2+ #4 Call Trace: shmem_recalc_inode+0x61/0x66 shmem_writepage+0xba/0x1dc pageout+0x13c/0x24c shrink_page_list+0x28e/0x4be shrink_inactive_list+0x21f/0x382 ... shmem_writepage() calls igrab() on the inode for the page which came from page reclaim, to add it later into shmem_swaplist for swapoff operation. This igrab() can race with super-block deactivating process: shrink_inactive_list() deactivate_super() pageout() tmpfs_fs_type->kill_sb() shmem_writepage() kill_litter_super() generic_shutdown_super() evict_inodes() igrab() atomic_read(&inode->i_count) skip-inode iput() if (!list_empty(&sb->s_inodes)) printk("VFS: Busy inodes after... This igrap-iput pair was added in commit 1b1b32f2c6f6 "tmpfs: fix shmem_swaplist races" based on incorrect assumptions: igrab() protects the inode from concurrent eviction by deletion, but it does nothing to protect it from concurrent unmounting, which goes ahead despite the raised i_count. So this use of igrab() was wrong all along, but the race made much worse in 2.6.37 when commit 63997e98a3be "split invalidate_inodes()" replaced two attempts at invalidate_inodes() by a single evict_inodes(). Konstantin posted a plausible patch, raising sb->s_active too: I'm unsure whether it was correct or not; but burnt once by igrab(), I am sure that we don't want to rely more deeply upon externals here. Fix it by adding the inode to shmem_swaplist earlier, while the page lock on page in page cache still secures the inode against eviction, without artifically raising i_count. It was originally added later because shmem_unuse_inode() is liable to remove an inode from the list while it's unswapped; but we can guard against that by taking spinlock before dropping mutex. Reported-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Signed-off-by: Hugh Dickins <hughd@google.com> Tested-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | memcg: allocate memory cgroup structures in local nodesAndi Kleen2011-05-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit dde79e005a769 ("page_cgroup: reduce allocation overhead for page_cgroup array for CONFIG_SPARSEMEM") added a regression that the memory cgroup data structures all end up in node 0 because the first attempt at allocating them would not pass in a node hint. Since the initialization runs on CPU #0 it would all end up node 0. This is a problem on large memory systems, where node 0 would lose a lot of memory. Change the alloc_pages_exact() to alloc_pages_exact_nid(). This will still fall back to other nodes if not enough memory is available. [ RED-PEN: right now it would fall back first before trying vmalloc_node. Probably not the best strategy ... But I left it like that for now. ] Signed-off-by: Andi Kleen <ak@linux.intel.com> Reported-by: Doug Nelson Cc: David Rientjes <rientjes@google.com> Reviewed-by: Michal Hocko <mhocko@suse.cz> Cc: Dave Hansen <dave@linux.vnet.ibm.com> Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | mm: add alloc_pages_exact_nid()Andi Kleen2011-05-122-12/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a alloc_pages_exact_nid() that allocates on a specific node. The naming is quite broken, but fixing that would need a larger renaming action. [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: tweak comment] Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Dave Hansen <dave@linux.vnet.ibm.com> Cc: David Rientjes <rientjes@google.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | MAINTAINERS: fix sortingHarry Wei2011-05-121-40/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Take alphabetical orders for MAINTAINERS file. Signed-off-by: Harry Wei <harryxiyou@gmail.com> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | mm: use alloc_bootmem_node_nopanic() on really needed pathYinghai Lu2011-05-122-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Stefan found nobootmem does not work on his system that has only 8M of RAM. This causes an early panic: BIOS-provided physical RAM map: BIOS-88: 0000000000000000 - 000000000009f000 (usable) BIOS-88: 0000000000100000 - 0000000000840000 (usable) bootconsole [earlyser0] enabled Notice: NX (Execute Disable) protection missing in CPU or disabled in BIOS! DMI not present or invalid. last_pfn = 0x840 max_arch_pfn = 0x100000 init_memory_mapping: 0000000000000000-0000000000840000 8MB LOWMEM available. mapped low ram: 0 - 00840000 low ram: 0 - 00840000 Zone PFN ranges: DMA 0x00000001 -> 0x00001000 Normal empty Movable zone start PFN for each node early_node_map[2] active PFN ranges 0: 0x00000001 -> 0x0000009f 0: 0x00000100 -> 0x00000840 BUG: Int 6: CR2 (null) EDI c034663c ESI (null) EBP c0329f38 ESP c0329ef4 EBX c0346380 EDX 00000006 ECX ffffffff EAX fffffff4 err (null) EIP c0353191 CS c0320060 flg 00010082 Stack: (null) c030c533 000007cd (null) c030c533 00000001 (null) (null) 00000003 0000083f 00000018 00000002 00000002 c0329f6c c03534d6 (null) (null) 00000100 00000840 (null) c0329f64 00000001 00001000 (null) Pid: 0, comm: swapper Not tainted 2.6.36 #5 Call Trace: [<c02e3707>] ? 0xc02e3707 [<c035e6e5>] 0xc035e6e5 [<c0353191>] ? 0xc0353191 [<c03534d6>] 0xc03534d6 [<c034f1cd>] 0xc034f1cd [<c034a824>] 0xc034a824 [<c03513cb>] ? 0xc03513cb [<c0349432>] 0xc0349432 [<c0349066>] 0xc0349066 It turns out that we should ignore the low limit of 16M. Use alloc_bootmem_node_nopanic() in this case. [akpm@linux-foundation.org: less mess] Signed-off-by: Yinghai LU <yinghai@kernel.org> Reported-by: Stefan Hellermann <stefan@the2masters.de> Tested-by: Stefan Hellermann <stefan@the2masters.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: "H. Peter Anvin" <hpa@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: <stable@kernel.org> [2.6.34+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | mm: check PageUnevictable in lru_deactivate_fn()Minchan Kim2011-05-121-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The lru_deactivate_fn should not move page which in on unevictable lru into inactive list. Otherwise, we can meet BUG when we use isolate_lru_pages as __isolate_lru_page could return -EINVAL. Reported-by: Ying Han <yinghan@google.com> Tested-by: Ying Han <yinghan@google.com> Signed-off-by: Minchan Kim <minchan.kim@gmail.com> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Reviewed-by: Rik van Riel<riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | drivers/rtc/rtc-s3c.c: fixup wake support for rtcBen Dooks2011-05-121-3/+10
| |_|/ / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The driver is not balancing set_irq and disable_irq_wake() calls, so ensure that it keeps track of whether the wake is enabled. The fixes the following error on S3C6410 devices: WARNING: at kernel/irq/manage.c:382 set_irq_wake+0x84/0xec() Unbalanced IRQ 92 wake disable Signed-off-by: Ben Dooks <ben-linux@fluff.org> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds2011-05-1140-232/+422
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (27 commits) slcan: fix ldisc->open retval net/usb: mark LG VL600 LTE modem ethernet interface as WWAN xfrm: Don't allow esn with disabled anti replay detection xfrm: Assign the inner mode output function to the dst entry net: dev_close() should check IFF_UP vlan: fix GVRP at dismantle time netfilter: revert a2361c8735e07322023aedc36e4938b35af31eb0 netfilter: IPv6: fix DSCP mangle code netfilter: IPv6: initialize TOS field in REJECT target module IPVS: init and cleanup restructuring IPVS: Change of socket usage to enable name space exit. netfilter: ebtables: only call xt_compat_add_offset once per rule netfilter: fix ebtables compat support netfilter: ctnetlink: fix timestamp support for new conntracks pch_gbe: support ML7223 IOH PCH_GbE : Fixed the issue of checksum judgment PCH_GbE : Fixed the issue of collision detection NET: slip, fix ldisc->open retval be2net: Fixed bugs related to PVID. ehea: fix wrongly reported speed and port ...
| * \ \ \ Merge branch 'pablo/nf-2.6-updates' of git://1984.lsi.us.es/net-2.6David S. Miller2011-05-1114-161/+279
| |\ \ \ \
| | * | | | netfilter: revert a2361c8735e07322023aedc36e4938b35af31eb0Pablo Neira Ayuso2011-05-101-5/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch reverts a2361c8735e07322023aedc36e4938b35af31eb0: "[PATCH] netfilter: xt_conntrack: warn about use in raw table" Florian Wesphal says: "... when the packet was sent from the local machine the skb already has ->nfct attached, and -m conntrack seems to do the right thing." Acked-by: Jan Engelhardt <jengelh@medozas.de> Reported-by: Florian Wesphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * | | | netfilter: IPv6: fix DSCP mangle codeFernando Luis Vazquez Cao2011-05-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The mask indicates the bits one wants to zero out, so it needs to be inverted before applying to the original TOS field. Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * | | | netfilter: IPv6: initialize TOS field in REJECT target moduleFernando Luis Vazquez Cao2011-05-101-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The IPv6 header is not zeroed out in alloc_skb so we must initialize it properly unless we want to see IPv6 packets with random TOS fields floating around. The current implementation resets the flow label but this could be changed if deemed necessary. We stumbled upon this issue when trying to apply a mangle rule to the RST packet generated by the REJECT target module. Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * | | | IPVS: init and cleanup restructuringHans Schillstrom2011-05-108-80/+223
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | DESCRIPTION This patch tries to restore the initial init and cleanup sequences that was before namspace patch. Netns also requires action when net devices unregister which has never been implemented. I.e this patch also covers when a device moves into a network namespace, and has to be released. IMPLEMENTATION The number of calls to register_pernet_device have been reduced to one for the ip_vs.ko Schedulers still have their own calls. This patch adds a function __ip_vs_service_cleanup() and an enable flag for the netfilter hooks. The nf hooks will be enabled when the first service is loaded and never disabled again, except when a namespace exit starts. Signed-off-by: Hans Schillstrom <hans@schillstrom.com> Acked-by: Julian Anastasov <ja@ssi.bg> [horms@verge.net.au: minor edit to changelog] Signed-off-by: Simon Horman <horms@verge.net.au>