summaryrefslogtreecommitdiffstats
path: root/init/main.c (follow)
Commit message (Collapse)AuthorAgeFilesLines
* init/main.c: make setup_max_cpus static for !SMPH Hartley Sweeten2010-03-061-1/+1
| | | | | | | | | | | | | | | | The only in tree external users of the symbol setup_max_cpus are in arch/x86/. The files ./kernel/alternative.c, ./kernel/visws_quirks.c, and ./mm/kmemcheck/kmemcheck.c are all guarded by CONFIG_SMP being defined. For this case the symbol is an unsigned int and declared as an extern in include/linux/smp.h. When CONFIG_SMP is not defined the symbol setup_max_cpus is a constant value that is only used in init/main.c. Make the symbol static for this case. Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* init/main.c: improve usability in case of init binary failureAndreas Mohr2010-03-061-1/+2
| | | | | | | | | | | | - new Documentation/init.txt file describing various forms of failure trying to load the init binary after kernel bootup - extend the init/main.c init failure message to direct to Documentation/init.txt Signed-off-by: Andreas Mohr <andi@lisas.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm/pm: force GFP_NOIO during suspend/hibernation and resumeRafael J. Wysocki2010-03-061-1/+1
| | | | | | | | | | | | | | | | | | | | There are quite a few GFP_KERNEL memory allocations made during suspend/hibernation and resume that may cause the system to hang, because the I/O operations they depend on cannot be completed due to the underlying devices being suspended. Avoid this problem by clearing the __GFP_IO and __GFP_FS bits in gfp_allowed_mask before suspend/hibernation and restoring the original values of these bits in gfp_allowed_mask durig the subsequent resume. [akpm@linux-foundation.org: fix CONFIG_PM=n linkage] Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Reported-by: Maxim Levitsky <maximlevitsky@gmail.com> Cc: Sebastian Ott <sebott@linux.vnet.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'for-linus' of ↵Linus Torvalds2010-03-041-5/+6
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (52 commits) init: Open /dev/console from rootfs mqueue: fix typo "failues" -> "failures" mqueue: only set error codes if they are really necessary mqueue: simplify do_open() error handling mqueue: apply mathematics distributivity on mq_bytes calculation mqueue: remove unneeded info->messages initialization mqueue: fix mq_open() file descriptor leak on user-space processes fix race in d_splice_alias() set S_DEAD on unlink() and non-directory rename() victims vfs: add NOFOLLOW flag to umount(2) get rid of ->mnt_parent in tomoyo/realpath hppfs can use existing proc_mnt, no need for do_kern_mount() in there Mirror MS_KERNMOUNT in ->mnt_flags get rid of useless vfsmount_lock use in put_mnt_ns() Take vfsmount_lock to fs/internal.h get rid of insanity with namespace roots in tomoyo take check for new events in namespace (guts of mounts_poll()) to namespace.c Don't mess with generic_permission() under ->d_lock in hpfs sanitize const/signedness for udf nilfs: sanitize const/signedness in dealing with ->d_name.name ... Fix up fairly trivial (famous last words...) conflicts in drivers/infiniband/core/uverbs_main.c and security/tomoyo/realpath.c
| * init: Open /dev/console from rootfsEric W. Biederman2010-03-031-5/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To avoid potential problems with an empty /dev open /dev/console from rootfs instead of waiting to mount our root filesystem and mounting it there. This effectively guarantees that there will be a device node, and it won't be on a filesystem that we will ever unmount, so there are no issues with leaving /dev/console open and pinning the filesystem. This is actually more effective than automatically mounting devtmpfs on /dev because it removes removes the occasionally problematic assumption that /dev/console exists from the boot code. With this patch I was able to throw busybox on my /boot partition (which has no /dev directory) and boot into userspace without problems. The only possible negative consequence I can think of is that someone out there deliberately used did not use a character device that is major 5 minor 2 for /dev/console. Does anyone know of a situation in which that could make sense? Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | Merge branch 'x86-apic-for-linus' of ↵Linus Torvalds2010-03-031-1/+15
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (25 commits) x86: Fix out of order of gsi x86: apic: Fix mismerge, add arch_probe_nr_irqs() again x86, irq: Keep chip_data in create_irq_nr and destroy_irq xen: Remove unnecessary arch specific xen irq functions. smp: Use nr_cpus= to set nr_cpu_ids early x86, irq: Remove arch_probe_nr_irqs sparseirq: Use radix_tree instead of ptrs array sparseirq: Change irq_desc_ptrs to static init: Move radix_tree_init() early irq: Remove unnecessary bootmem code x86: Add iMac9,1 to pci_reboot_dmi_table x86: Convert i8259_lock to raw_spinlock x86: Convert nmi_lock to raw_spinlock x86: Convert ioapic_lock and vector_lock to raw_spinlock x86: Avoid race condition in pci_enable_msix() x86: Fix SCI on IOAPIC != 0 x86, ia32_aout: do not kill argument mapping x86, irq: Move __setup_vector_irq() before the first irq enable in cpu online path x86, irq: Update the vector domain for legacy irqs handled by io-apic x86, irq: Don't block IRQ0_VECTOR..IRQ15_VECTOR's on all cpu's ...
| * | smp: Use nr_cpus= to set nr_cpu_ids earlyYinghai Lu2010-02-181-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On x86, before prefill_possible_map(), nr_cpu_ids will be NR_CPUS aka CONFIG_NR_CPUS. Add nr_cpus= to set nr_cpu_ids. so we can simulate cpus <=8 are installed on normal config. -v2: accordging to Christoph, acpi_numa_init should use nr_cpu_ids in stead of NR_CPUS. -v3: add doc in kernel-parameters.txt according to Andrew. Signed-off-by: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <1265793639-15071-34-git-send-email-yinghai@kernel.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Tony Luck <tony.luck@intel.com>
| * | init: Move radix_tree_init() earlyYinghai Lu2010-02-181-1/+1
| |/ | | | | | | | | | | | | | | Prepare for using radix trees in early_irq_init(). Signed-off-by: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <1265793639-15071-30-git-send-email-yinghai@kernel.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
* / sched: Use lockdep-based checking on rcu_dereference()Paul E. McKenney2010-02-251-0/+2
|/ | | | | | | | | | | | | | | | | | | | Update the rcu_dereference() usages to take advantage of the new lockdep-based checking. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com LKML-Reference: <1266887105-1528-6-git-send-email-paulmck@linux.vnet.ibm.com> [ -v2: fix allmodconfig missing symbol export build failure on x86 ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
* ima: initialize ima before inodes can be allocatedEric Paris2010-02-071-1/+1
| | | | | | | | | | | | | | ima wants to create an inode information struct (iint) when inodes are allocated. This means that at least the part of ima which does this allocation (the allocation is filled with information later) should before any inodes are created. To accomplish this we split the ima initialization routine placing the kmem cache allocator inside a security_initcall() function. Since this makes use of radix trees we also need to make sure that is initialized before security_initcall(). Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* sched: Mark boot-cpu active before smp_init()Peter Zijlstra2009-12-161-6/+1
| | | | | | | | | | A UP machine has 1 active cpu, not having the boot-cpu in the active map when starting the scheduler confuses things. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <20091216170517.423469527@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* init/main.c: fix symbol shadows noiseH Hartley Sweeten2009-12-151-9/+9
| | | | | | | | | | | | | | The symbol 'call' is a static symbol used for initcall_debug. This same symbol name is used locally by a couple functions and produces the following sparse warnings: warning: symbol 'call' shadows an earlier one Fix this noise by renaming the local symbols. Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* param: don't complain about unused module parameters.Rusty Russell2009-12-031-8/+3
| | | | | | | | | | | | Jon confirms that recent modprobe will look in /proc/cmdline, so these cmdline options can still be used. See http://bugzilla.kernel.org/show_bug.cgi?id=14164 Reported-by: Adam Williamson <awilliam@redhat.com> Cc: stable@kernel.org Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'core-fixes-for-linus' of ↵Linus Torvalds2009-10-081-1/+0
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: futex: fix requeue_pi key imbalance futex: Fix typo in FUTEX_WAIT/WAKE_BITSET_PRIVATE definitions rcu: Place root rcu_node structure in separate lockdep class rcu: Make hot-unplugged CPU relinquish its own RCU callbacks rcu: Move rcu_barrier() to rcutree futex: Move exit_pi_state() call to release_mm() futex: Nullify robust lists after cleanup futex: Fix locking imbalance panic: Fix panic message visibility by calling bust_spinlocks(0) before dying rcu: Replace the rcu_barrier enum with pointer to call_rcu*() function rcu: Clean up code based on review feedback from Josh Triplett, part 4 rcu: Clean up code based on review feedback from Josh Triplett, part 3 rcu: Fix rcu_lock_map build failure on CONFIG_PROVE_LOCKING=y rcu: Clean up code to address Ingo's checkpatch feedback rcu: Clean up code based on review feedback from Josh Triplett, part 2 rcu: Clean up code based on review feedback from Josh Triplett
| * rcu: Clean up code based on review feedback from Josh Triplett, part 2Paul E. McKenney2009-09-231-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These issues identified during an old-fashioned face-to-face code review extending over many hours. o Add comments for tricky parts of code, and correct comments that have passed their sell-by date. o Get rid of the vestiges of rcu_init_sched(), which is no longer needed now that PREEMPT_RCU is gone. o Move the #include of rcutree_plugin.h to the end of rcutree.c, which means that, rather than having a random collection of forward declarations, the new set of forward declarations document the set of plugins. The new home for this #include also allows __rcu_init_preempt() to move into rcutree_plugin.h. o Fix rcu_preempt_check_callbacks() to be static. Suggested-by: Josh Triplett <josh@joshtriplett.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: akpm@linux-foundation.org Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com LKML-Reference: <12537246443924-git-send-email-> Signed-off-by: Ingo Molnar <mingo@elte.hu> Peter Zijlstra <peterz@infradead.org>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linusLinus Torvalds2009-09-241-5/+0
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: (39 commits) cpumask: Move deprecated functions to end of header. cpumask: remove unused deprecated functions, avoid accusations of insanity cpumask: use new-style cpumask ops in mm/quicklist. cpumask: use mm_cpumask() wrapper: x86 cpumask: use mm_cpumask() wrapper: um cpumask: use mm_cpumask() wrapper: mips cpumask: use mm_cpumask() wrapper: mn10300 cpumask: use mm_cpumask() wrapper: m32r cpumask: use mm_cpumask() wrapper: arm cpumask: Use accessors for cpu_*_mask: um cpumask: Use accessors for cpu_*_mask: powerpc cpumask: Use accessors for cpu_*_mask: mips cpumask: Use accessors for cpu_*_mask: m32r cpumask: remove arch_send_call_function_ipi cpumask: arch_send_call_function_ipi_mask: s390 cpumask: arch_send_call_function_ipi_mask: powerpc cpumask: arch_send_call_function_ipi_mask: mips cpumask: arch_send_call_function_ipi_mask: m32r cpumask: arch_send_call_function_ipi_mask: alpha cpumask: remove obsolete topology_core_siblings and topology_thread_siblings: ia64 ...
| * | cpumask: remove unused cpu_mask_allRusty Russell2009-09-241-5/+0
| | | | | | | | | | | | | | | | | | | | | It's only defined for NR_CPUS > BITS_PER_LONG; cpu_all_mask is always defined (and const). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* | | headers: utsname.h reduxAlexey Dobriyan2009-09-241-1/+0
|/ / | | | | | | | | | | | | | | | | | | | | | | | | * remove asm/atomic.h inclusion from linux/utsname.h -- not needed after kref conversion * remove linux/utsname.h inclusion from files which do not need it NOTE: it looks like fs/binfmt_elf.c do not need utsname.h, however due to some personality stuff it _is_ needed -- cowardly leave ELF-related headers and files alone. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Merge branch 'sfi-release' of ↵Linus Torvalds2009-09-231-0/+2
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-sfi-2.6 * 'sfi-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-sfi-2.6: SFI: remove unneeded includes sfi: Remove unused code SFI: Hook PCI MMCONFIG x86: add arch-specific SFI support SFI: add capability to parse ACPI tables SFI: add platform-independent core support SFI: create linux/sfi.h SFI: Simple Firmware Interface - MAINTAINERS, Kconfig
| * | Merge branch 'linus' into sfi-releaseLen Brown2009-09-191-26/+4
| |\| | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: arch/x86/kernel/setup.c drivers/acpi/power.c init/main.c Signed-off-by: Len Brown <len.brown@intel.com>
| * | SFI: add platform-independent core supportFeng Tang2009-08-291-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | drivers/sfi/sfi_core.c contains the generic SFI implementation. It has a private header, sfi_core.h, for its own use and the private use of future files in drivers/sfi/ Signed-off-by: Feng Tang <feng.tang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
* | | mm: replace various uses of num_physpages by totalram_pagesJan Beulich2009-09-221-2/+2
| |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Sizing of memory allocations shouldn't depend on the number of physical pages found in a system, as that generally includes (perhaps a huge amount of) non-RAM pages. The amount of what actually is usable as storage should instead be used as a basis here. Some of the calculations (i.e. those not intending to use high memory) should likely even use (totalram_pages - totalhigh_pages). Signed-off-by: Jan Beulich <jbeulich@novell.com> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Dave Airlie <airlied@linux.ie> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk> Cc: "David S. Miller" <davem@davemloft.net> Cc: Patrick McHardy <kaber@trash.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6Linus Torvalds2009-09-161-0/+2
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: Driver Core: devtmpfs - kernel-maintained tmpfs-based /dev debugfs: Modify default debugfs directory for debugging pktcdvd. debugfs: Modified default dir of debugfs for debugging UHCI. debugfs: Change debugfs directory of IWMC3200 debugfs: Change debuhgfs directory of trace-events-sample.h debugfs: Fix mount directory of debugfs by default in events.txt hpilo: add poll f_op hpilo: add interrupt handler hpilo: staging for interrupt handling driver core: platform_device_add_data(): use kmemdup() Driver core: Add support for compatibility classes uio: add generic driver for PCI 2.3 devices driver-core: move dma-coherent.c from kernel to driver/base mem_class: fix bug mem_class: use minor as index instead of searching the array driver model: constify attribute groups UIO: remove 'default n' from Kconfig Driver core: Add accessor for device platform data Driver core: move dev_get/set_drvdata to drivers/base/dd.c Driver core: add new device to bus's list before probing
| * | Driver Core: devtmpfs - kernel-maintained tmpfs-based /devKay Sievers2009-09-151-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Devtmpfs lets the kernel create a tmpfs instance called devtmpfs very early at kernel initialization, before any driver-core device is registered. Every device with a major/minor will provide a device node in devtmpfs. Devtmpfs can be changed and altered by userspace at any time, and in any way needed - just like today's udev-mounted tmpfs. Unmodified udev versions will run just fine on top of it, and will recognize an already existing kernel-created device node and use it. The default node permissions are root:root 0600. Proper permissions and user/group ownership, meaningful symlinks, all other policy still needs to be applied by userspace. If a node is created by devtmps, devtmpfs will remove the device node when the device goes away. If the device node was created by userspace, or the devtmpfs created node was replaced by userspace, it will no longer be removed by devtmpfs. If it is requested to auto-mount it, it makes init=/bin/sh work without any further userspace support. /dev will be fully populated and dynamic, and always reflect the current device state of the kernel. With the commonly used dynamic device numbers, it solves the problem where static devices nodes may point to the wrong devices. It is intended to make the initial bootup logic simpler and more robust, by de-coupling the creation of the inital environment, to reliably run userspace processes, from a complex userspace bootstrap logic to provide a working /dev. Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Jan Blunck <jblunck@suse.de> Tested-By: Harald Hoyer <harald@redhat.com> Tested-By: Scott James Remnant <scott@ubuntu.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* | | Merge branch 'for-linus' of ↵Linus Torvalds2009-09-151-24/+0
|\ \ \ | |/ / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (46 commits) powerpc64: convert to dynamic percpu allocator sparc64: use embedding percpu first chunk allocator percpu: kill lpage first chunk allocator x86,percpu: use embedding for 64bit NUMA and page for 32bit NUMA percpu: update embedding first chunk allocator to handle sparse units percpu: use group information to allocate vmap areas sparsely vmalloc: implement pcpu_get_vm_areas() vmalloc: separate out insert_vmalloc_vm() percpu: add chunk->base_addr percpu: add pcpu_unit_offsets[] percpu: introduce pcpu_alloc_info and pcpu_group_info percpu: move pcpu_lpage_build_unit_map() and pcpul_lpage_dump_cfg() upward percpu: add @align to pcpu_fc_alloc_fn_t percpu: make @dyn_size mandatory for pcpu_setup_first_chunk() percpu: drop @static_size from first chunk allocators percpu: generalize first chunk allocator selection percpu: build first chunk allocators selectively percpu: rename 4k first chunk allocator to page percpu: improve boot messages percpu: fix pcpu_reclaim() locking ... Fix trivial conflict as by Tejun Heo in kernel/sched.c
| * | Merge branch 'percpu-for-linus' into percpu-for-nextTejun Heo2009-08-141-1/+1
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: arch/sparc/kernel/smp_64.c arch/x86/kernel/cpu/perf_counter.c arch/x86/kernel/setup_percpu.c drivers/cpufreq/cpufreq_ondemand.c mm/percpu.c Conflicts in core and arch percpu codes are mostly from commit ed78e1e078dd44249f88b1dd8c76dafb39567161 which substituted many num_possible_cpus() with nr_cpu_ids. As for-next branch has moved all the first chunk allocators into mm/percpu.c, the changes are moved from arch code to mm/percpu.c. Signed-off-by: Tejun Heo <tj@kernel.org>
| * \ \ Merge branch 'master' into for-nextTejun Heo2009-07-041-6/+1
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull linus#master to merge PER_CPU_DEF_ATTRIBUTES and alpha build fix changes. As alpha in percpu tree uses 'weak' attribute instead of inline assembly, there's no need for __used attribute. Conflicts: arch/alpha/include/asm/percpu.h arch/mn10300/kernel/vmlinux.lds.S include/linux/percpu-defs.h
| * | | | percpu: use dynamic percpu allocator as the default percpu allocatorTejun Heo2009-06-241-24/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch makes most !CONFIG_HAVE_SETUP_PER_CPU_AREA archs use dynamic percpu allocator. The first chunk is allocated using embedding helper and 8k is reserved for modules. This ensures that the new allocator behaves almost identically to the original allocator as long as static percpu variables are concerned, so it shouldn't introduce much breakage. s390 and alpha use custom SHIFT_PERCPU_PTR() to work around addressing range limit the addressing model imposes. Unfortunately, this breaks if the address is specified using a variable, so for now, the two archs aren't converted. The following architectures are affected by this change. * sh * arm * cris * mips * sparc(32) * blackfin * avr32 * parisc (broken, under investigation) * m32r * powerpc(32) As this change makes the dynamic allocator the default one, CONFIG_HAVE_DYNAMIC_PER_CPU_AREA is replaced with its invert - CONFIG_HAVE_LEGACY_PER_CPU_AREA, which is added to yet-to-be converted archs. These archs implement their own setup_per_cpu_areas() and the conversion is not trivial. * powerpc(64) * sparc(64) * ia64 * alpha * s390 Boot and batch alloc/free tests on x86_32 with debug code (x86_32 doesn't use default first chunk initialization). Compile tested on sparc(32), powerpc(32), arm and alpha. Kyle McMartin reported that this change breaks parisc. The problem is still under investigation and he is okay with pushing this patch forward and fixing parisc later. [ Impact: use dynamic allocator for most archs w/o custom percpu setup ] Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Reviewed-by: Christoph Lameter <cl@linux.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Mikael Starvik <starvik@axis.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Bryan Wu <cooloney@kernel.org> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Grant Grundler <grundler@parisc-linux.org> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Ingo Molnar <mingo@elte.hu>
* | | | | Merge branch 'sched-core-for-linus' of ↵Linus Torvalds2009-09-111-1/+1
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (64 commits) sched: Fix sched::sched_stat_wait tracepoint field sched: Disable NEW_FAIR_SLEEPERS for now sched: Keep kthreads at default priority sched: Re-tune the scheduler latency defaults to decrease worst-case latencies sched: Turn off child_runs_first sched: Ensure that a child can't gain time over it's parent after fork() sched: enable SD_WAKE_IDLE sched: Deal with low-load in wake_affine() sched: Remove short cut from select_task_rq_fair() sched: Turn on SD_BALANCE_NEWIDLE sched: Clean up topology.h sched: Fix dynamic power-balancing crash sched: Remove reciprocal for cpu_power sched: Try to deal with low capacity, fix update_sd_power_savings_stats() sched: Try to deal with low capacity sched: Scale down cpu_power due to RT tasks sched: Implement dynamic cpu_power sched: Add smt_gain sched: Update the cpu_power sum during load-balance sched: Add SD_PREFER_SIBLING ...
| * | | | | init: Move sched_clock_init after late_time_initThomas Gleixner2009-08-271-1/+1
| | |_|_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some architectures initialize clocks and timers in late_time_init and x86 wants to do the same to avoid FIXMAP hackery for calibrating the TSC. That would result in undefined sched_clock readout and wreckaged printk timestamps again. We probably have those already on archs which do all their time/clock setup in late_time_init. There is no harm to move that after late_time_init except that a few more boot timestamps are stale. The scheduler is not active at that point so no real wreckage is expected. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> LKML-Reference: <new-submission> Cc: linux-arch@vger.kernel.org
* / | | | rcu: Move end of special early-boot RCU operation earlierPaul E. McKenney2009-09-041-1/+1
|/ / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Ingo was getting warnings from rcu_scheduler_starting() indicating that context switches had occurred before RCU ended its special early-boot handling of grace periods. This is a dangerous condition, as it indicates that RCU might have prematurely ended grace periods. This exploratory fix moves rcu_scheduler_starting() earlier in boot. Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | | Merge branch 'tracing-fixes-for-linus' of ↵Linus Torvalds2009-08-251-3/+4
|\ \ \ \ | |_|_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: tracing: Fix too large stack usage in do_one_initcall() tracing: handle broken names in ftrace filter ftrace: Unify effect of writing to trace_options and option/*
| * | | tracing: Fix too large stack usage in do_one_initcall()Ingo Molnar2009-08-211-3/+4
| | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | One of my testboxes triggered this nasty stack overflow crash during SCSI probing: [ 5.874004] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA [ 5.875004] device: 'sda': device_add [ 5.878004] BUG: unable to handle kernel NULL pointer dereference at 00000a0c [ 5.878004] IP: [<b1008321>] print_context_stack+0x81/0x110 [ 5.878004] *pde = 00000000 [ 5.878004] Thread overran stack, or stack corrupted [ 5.878004] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC [ 5.878004] last sysfs file: [ 5.878004] [ 5.878004] Pid: 1, comm: swapper Not tainted (2.6.31-rc6-tip-01272-g9919e28-dirty #5685) [ 5.878004] EIP: 0060:[<b1008321>] EFLAGS: 00010083 CPU: 0 [ 5.878004] EIP is at print_context_stack+0x81/0x110 [ 5.878004] EAX: cf8a3000 EBX: cf8a3fe4 ECX: 00000049 EDX: 00000000 [ 5.878004] ESI: b1cfce84 EDI: 00000000 EBP: cf8a3018 ESP: cf8a2ff4 [ 5.878004] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 [ 5.878004] Process swapper (pid: 1, ti=cf8a2000 task=cf8a8000 task.ti=cf8a3000) [ 5.878004] Stack: [ 5.878004] b1004867 fffff000 cf8a3ffc [ 5.878004] Call Trace: [ 5.878004] [<b1004867>] ? kernel_thread_helper+0x7/0x10 [ 5.878004] BUG: unable to handle kernel NULL pointer dereference at 00000a0c [ 5.878004] IP: [<b1008321>] print_context_stack+0x81/0x110 [ 5.878004] *pde = 00000000 [ 5.878004] Thread overran stack, or stack corrupted [ 5.878004] Oops: 0000 [#2] PREEMPT SMP DEBUG_PAGEALLOC The oops did not reveal any more details about the real stack that we have and the system got into an infinite loop of recursive pagefaults. So i booted with CONFIG_STACK_TRACER=y and the 'stacktrace' boot parameter. The box did not crash (timings/conditions probably changed a tiny bit to trigger the catastrophic crash), but the /debug/tracing/stack_trace file was rather revealing: Depth Size Location (72 entries) ----- ---- -------- 0) 3704 52 __change_page_attr+0xb8/0x290 1) 3652 24 __change_page_attr_set_clr+0x43/0x90 2) 3628 60 kernel_map_pages+0x108/0x120 3) 3568 40 prep_new_page+0x7d/0x130 4) 3528 84 get_page_from_freelist+0x106/0x420 5) 3444 116 __alloc_pages_nodemask+0xd7/0x550 6) 3328 36 allocate_slab+0xb1/0x100 7) 3292 36 new_slab+0x1c/0x160 8) 3256 36 __slab_alloc+0x133/0x2b0 9) 3220 4 kmem_cache_alloc+0x1bb/0x1d0 10) 3216 108 create_object+0x28/0x250 11) 3108 40 kmemleak_alloc+0x81/0xc0 12) 3068 24 kmem_cache_alloc+0x162/0x1d0 13) 3044 52 scsi_pool_alloc_command+0x29/0x70 14) 2992 20 scsi_host_alloc_command+0x22/0x70 15) 2972 24 __scsi_get_command+0x1b/0x90 16) 2948 28 scsi_get_command+0x35/0x90 17) 2920 24 scsi_setup_blk_pc_cmnd+0xd4/0x100 18) 2896 128 sd_prep_fn+0x332/0xa70 19) 2768 36 blk_peek_request+0xe7/0x1d0 20) 2732 56 scsi_request_fn+0x54/0x520 21) 2676 12 __generic_unplug_device+0x2b/0x40 22) 2664 24 blk_execute_rq_nowait+0x59/0x80 23) 2640 172 blk_execute_rq+0x6b/0xb0 24) 2468 32 scsi_execute+0xe0/0x140 25) 2436 64 scsi_execute_req+0x152/0x160 26) 2372 60 scsi_vpd_inquiry+0x6c/0x90 27) 2312 44 scsi_get_vpd_page+0x112/0x160 28) 2268 52 sd_revalidate_disk+0x1df/0x320 29) 2216 92 rescan_partitions+0x98/0x330 30) 2124 52 __blkdev_get+0x309/0x350 31) 2072 8 blkdev_get+0xf/0x20 32) 2064 44 register_disk+0xff/0x120 33) 2020 36 add_disk+0x6e/0xb0 34) 1984 44 sd_probe_async+0xfb/0x1d0 35) 1940 44 __async_schedule+0xf4/0x1b0 36) 1896 8 async_schedule+0x12/0x20 37) 1888 60 sd_probe+0x305/0x360 38) 1828 44 really_probe+0x63/0x170 39) 1784 36 driver_probe_device+0x5d/0x60 40) 1748 16 __device_attach+0x49/0x50 41) 1732 32 bus_for_each_drv+0x5b/0x80 42) 1700 24 device_attach+0x6b/0x70 43) 1676 16 bus_attach_device+0x47/0x60 44) 1660 76 device_add+0x33d/0x400 45) 1584 52 scsi_sysfs_add_sdev+0x6a/0x2c0 46) 1532 108 scsi_add_lun+0x44b/0x460 47) 1424 116 scsi_probe_and_add_lun+0x182/0x4e0 48) 1308 36 __scsi_add_device+0xd9/0xe0 49) 1272 44 ata_scsi_scan_host+0x10b/0x190 50) 1228 24 async_port_probe+0x96/0xd0 51) 1204 44 __async_schedule+0xf4/0x1b0 52) 1160 8 async_schedule+0x12/0x20 53) 1152 48 ata_host_register+0x171/0x1d0 54) 1104 60 ata_pci_sff_activate_host+0xf3/0x230 55) 1044 44 ata_pci_sff_init_one+0xea/0x100 56) 1000 48 amd_init_one+0xb2/0x190 57) 952 8 local_pci_probe+0x13/0x20 58) 944 32 pci_device_probe+0x68/0x90 59) 912 44 really_probe+0x63/0x170 60) 868 36 driver_probe_device+0x5d/0x60 61) 832 20 __driver_attach+0x89/0xa0 62) 812 32 bus_for_each_dev+0x5b/0x80 63) 780 12 driver_attach+0x1e/0x20 64) 768 72 bus_add_driver+0x14b/0x2d0 65) 696 36 driver_register+0x6e/0x150 66) 660 20 __pci_register_driver+0x53/0xc0 67) 640 8 amd_init+0x14/0x16 68) 632 572 do_one_initcall+0x2b/0x1d0 69) 60 12 do_basic_setup+0x56/0x6a 70) 48 20 kernel_init+0x84/0xce 71) 28 28 kernel_thread_helper+0x7/0x10 There's a lot of fat functions on that stack trace, but the largest of all is do_one_initcall(). This is due to the boot trace entry variables being on the stack. Fixing this is relatively easy, initcalls are fundamentally serialized, so we can move the local variables to file scope. Note that this large stack footprint was present for a couple of months already - what pushed my system over the edge was the addition of kmemleak to the call-chain: 6) 3328 36 allocate_slab+0xb1/0x100 7) 3292 36 new_slab+0x1c/0x160 8) 3256 36 __slab_alloc+0x133/0x2b0 9) 3220 4 kmem_cache_alloc+0x1bb/0x1d0 10) 3216 108 create_object+0x28/0x250 11) 3108 40 kmemleak_alloc+0x81/0xc0 12) 3068 24 kmem_cache_alloc+0x162/0x1d0 13) 3044 52 scsi_pool_alloc_command+0x29/0x70 This pushes the total to ~3800 bytes, only a tiny bit more was needed to corrupt the on-kernel-stack thread_info. The fix reduces the stack footprint from 572 bytes to 28 bytes. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <srostedt@redhat.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: <stable@kernel.org> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* / | init: set nr_cpu_ids before setup_per_cpu_areas()Tejun Heo2009-08-141-1/+1
|/ / | | | | | | | | | | | | | | nr_cpu_ids is dependent only on cpu_possible_map and setup_per_cpu_areas() already depends on cpu_possible_map and will use nr_cpu_ids. Initialize nr_cpu_ids before setting up percpu areas. Signed-off-by: Tejun Heo <tj@kernel.org>
| |
| \
*-. \ Merge branches 'acerhdf', 'acpi-pci-bind', 'bjorn-pci-root', ↵Len Brown2009-06-241-5/+1
|\ \ \ | | | | | | | | | | | | 'bugzilla-12904', 'bugzilla-13121', 'bugzilla-13396', 'bugzilla-13533', 'bugzilla-13612', 'c3_lock', 'hid-cleanups', 'misc-2.6.31', 'pdc-leak-fix', 'pnpacpi', 'power_nocheck', 'thinkpad_acpi', 'video' and 'wmi' into release
| | * | ACPI: move declaration acpi_early_init() to acpi.hLen Brown2009-06-131-5/+1
| |/ / | | | | | | | | | Signed-off-by: Len Brown <len.brown@intel.com>
* | / mm/init: cpu_hotplug_init() must be initialized before SLABLinus Torvalds2009-06-231-1/+0
| |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SLAB uses get/put_online_cpus() which use a mutex which is itself only initialized when cpu_hotplug_init() is called. Currently we hang suring boot in SLAB due to doing that too late. Reported by James Bottomley and Sachin Sant (and possibly others). Debugged by Benjamin Herrenschmidt. This just removes the dynamic initialization of the data structures, and replaces it with a static one, avoiding this dependency entirely, and removing one unnecessary special initcall. Tested-by: Sachin Sant <sachinp@in.ibm.com> Tested-by: James Bottomley <James.Bottomley@HansenPartnership.com> Tested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | mm: Extend gfp masking to the page allocatorBenjamin Herrenschmidt2009-06-181-0/+4
| | | | | | | | | | | | | | | | | | | | The page allocator also needs the masking of gfp flags during boot, so this moves it out of slab/slub and uses it with the page allocator as well. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | kernel: constructor supportPeter Oberparleiter2009-06-181-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Call constructors (gcc-generated initcall-like functions) during kernel start and module load. Constructors are e.g. used for gcov data initialization. Disable constructor support for usermode Linux to prevent conflicts with host glibc. Signed-off-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: WANG Cong <xiyou.wangcong@gmail.com> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Jeff Dike <jdike@addtoit.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Huang Ying <ying.huang@intel.com> Cc: Li Wei <W.Li@Sun.COM> Cc: Michael Ellerman <michaele@au1.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Heiko Carstens <heicars2@linux.vnet.ibm.com> Cc: Martin Schwidefsky <mschwid2@linux.vnet.ibm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | mm: Move pgtable_cache_init() earlierBenjamin Herrenschmidt2009-06-171-1/+1
| | | | | | | | | | | | | | | | | | | | Some architectures need to initialize SLAB caches to be able to allocate page tables. They do that from pgtable_cache_init() so the later should be called earlier now, best is before vmalloc_init(). Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Merge branch 'akpm'Linus Torvalds2009-06-171-1/+5
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * akpm: (182 commits) fbdev: bf54x-lq043fb: use kzalloc over kmalloc/memset fbdev: *bfin*: fix __dev{init,exit} markings fbdev: *bfin*: drop unnecessary calls to memset fbdev: bfin-t350mcqb-fb: drop unused local variables fbdev: blackfin has __raw I/O accessors, so use them in fb.h fbdev: s1d13xxxfb: add accelerated bitblt functions tcx: use standard fields for framebuffer physical address and length fbdev: add support for handoff from firmware to hw framebuffers intelfb: fix a bug when changing video timing fbdev: use framebuffer_release() for freeing fb_info structures radeon: P2G2CLK_ALWAYS_ONb tested twice, should 2nd be P2G2CLK_DAC_ALWAYS_ONb? s3c-fb: CPUFREQ frequency scaling support s3c-fb: fix resource releasing on error during probing carminefb: fix possible access beyond end of carmine_modedb[] acornfb: remove fb_mmap function mb862xxfb: use CONFIG_OF instead of CONFIG_PPC_OF mb862xxfb: restrict compliation of platform driver to PPC Samsung SoC Framebuffer driver: add Alpha Channel support atmel-lcdc: fix pixclock upper bound detection offb: use framebuffer_alloc() to allocate fb_info struct ... Manually fix up conflicts due to kmemcheck in mm/slab.c
| * | cpuset,mm: update tasks' mems_allowed in timeMiao Xie2009-06-171-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix allocating page cache/slab object on the unallowed node when memory spread is set by updating tasks' mems_allowed after its cpuset's mems is changed. In order to update tasks' mems_allowed in time, we must modify the code of memory policy. Because the memory policy is applied in the process's context originally. After applying this patch, one task directly manipulates anothers mems_allowed, and we use alloc_lock in the task_struct to protect mems_allowed and memory policy of the task. But in the fast path, we didn't use lock to protect them, because adding a lock may lead to performance regression. But if we don't add a lock,the task might see no nodes when changing cpuset's mems_allowed to some non-overlapping set. In order to avoid it, we set all new allowed nodes, then clear newly disallowed ones. [lee.schermerhorn@hp.com: The rework of mpol_new() to extract the adjusting of the node mask to apply cpuset and mpol flags "context" breaks set_mempolicy() and mbind() with MPOL_PREFERRED and a NULL nodemask--i.e., explicit local allocation. Fix this by adding the check for MPOL_PREFERRED and empty node mask to mpol_new_mpolicy(). Remove the now unneeded 'nodes = NULL' from mpol_new(). Note that mpol_new_mempolicy() is always called with a non-NULL 'nodes' parameter now that it has been removed from mpol_new(). Therefore, we don't need to test nodes for NULL before testing it for 'empty'. However, just to be extra paranoid, add a VM_BUG_ON() to verify this assumption.] [lee.schermerhorn@hp.com: I don't think the function name 'mpol_new_mempolicy' is descriptive enough to differentiate it from mpol_new(). This function applies cpuset set context, usually constraining nodes to those allowed by the cpuset. However, when the 'RELATIVE_NODES flag is set, it also translates the nodes. So I settled on 'mpol_set_nodemask()', because the comment block for mpol_new() mentions that we need to call this function to "set nodes". Some additional minor line length, whitespace and typo cleanup.] Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Paul Menage <menage@google.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Yasunori Goto <y-goto@jp.fujitsu.com> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | Merge commit 'linus/master' into HEADVegard Nossum2009-06-151-0/+6
|\| | | | | | | | | | | | | | | | | | | | Conflicts: MAINTAINERS Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
| * | slab,slub: don't enable interrupts during early bootPekka Enberg2009-06-121-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As explained by Benjamin Herrenschmidt: Oh and btw, your patch alone doesn't fix powerpc, because it's missing a whole bunch of GFP_KERNEL's in the arch code... You would have to grep the entire kernel for things that check slab_is_available() and even then you'll be missing some. For example, slab_is_available() didn't always exist, and so in the early days on powerpc, we used a mem_init_done global that is set form mem_init() (not perfect but works in practice). And we still have code using that to do the test. Therefore, mask out __GFP_WAIT, __GFP_IO, and __GFP_FS in the slab allocators in early boot code to avoid enabling interrupts. Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
| * | memcg: fix page_cgroup fatal error in FLATMEMKAMEZAWA Hiroyuki2009-06-121-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now, SLAB is configured in very early stage and it can be used in init routine now. But replacing alloc_bootmem() in FLAT/DISCONTIGMEM's page_cgroup() initialization breaks the allocation, now. (Works well in SPARSEMEM case...it supports MEMORY_HOTPLUG and size of page_cgroup is in reasonable size (< 1 << MAX_ORDER.) This patch revive FLATMEM+memory cgroup by using alloc_bootmem. In future, We stop to support FLATMEM (if no users) or rewrite codes for flatmem completely.But this will adds more messy codes and overheads. Reported-by: Li Zefan <lizf@cn.fujitsu.com> Tested-by: Li Zefan <lizf@cn.fujitsu.com> Tested-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
* | | kmemcheck: add the kmemcheck coreVegard Nossum2009-06-131-0/+1
|/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | General description: kmemcheck is a patch to the linux kernel that detects use of uninitialized memory. It does this by trapping every read and write to memory that was allocated dynamically (e.g. using kmalloc()). If a memory address is read that has not previously been written to, a message is printed to the kernel log. Thanks to Andi Kleen for the set_memory_4k() solution. Andrew Morton suggested documenting the shadow member of struct page. Signed-off-by: Vegard Nossum <vegardno@ifi.uio.no> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> [export kmemcheck_mark_initialized] [build fix for setup_max_cpus] Signed-off-by: Ingo Molnar <mingo@elte.hu> [rebased for mainline inclusion] Signed-off-by: Vegard Nossum <vegardno@ifi.uio.no>
* | Merge branch 'for-linus' of git://linux-arm.org/linux-2.6Linus Torvalds2009-06-111-1/+3
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-linus' of git://linux-arm.org/linux-2.6: kmemleak: Add the corresponding MAINTAINERS entry kmemleak: Simple testing module for kmemleak kmemleak: Enable the building of the memory leak detector kmemleak: Remove some of the kmemleak false positives kmemleak: Add modules support kmemleak: Add kmemleak_alloc callback from alloc_large_system_hash kmemleak: Add the vmalloc memory allocation/freeing hooks kmemleak: Add the slub memory allocation/freeing hooks kmemleak: Add the slob memory allocation/freeing hooks kmemleak: Add the slab memory allocation/freeing hooks kmemleak: Add documentation on the memory leak detector kmemleak: Add the base support Manual conflict resolution (with the slab/earlyboot changes) in: drivers/char/vt.c init/main.c mm/slab.c
| * | kmemleak: Add the base supportCatalin Marinas2009-06-111-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds the base support for the kernel memory leak detector. It traces the memory allocation/freeing in a way similar to the Boehm's conservative garbage collector, the difference being that the unreferenced objects are not freed but only shown in /sys/kernel/debug/kmemleak. Enabling this feature introduces an overhead to memory allocations. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Ingo Molnar <mingo@elte.hu> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
* | | init: introduce mm_init()Pekka Enberg2009-06-111-6/+11
| | | | | | | | | | | | | | | | | | | | | | | | As suggested by Christoph Lameter, introduce mm_init() now that we initialize all the kernel memory allocations together. Cc: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
* | | vmalloc: use kzalloc() instead of alloc_bootmem()Pekka Enberg2009-06-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We can call vmalloc_init() after kmem_cache_init() and use kzalloc() instead of the bootmem allocator when initializing vmalloc data structures. Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Nick Piggin <npiggin@suse.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>