linux - linux

	Commit message	Author	Age	Files	Lines
*	tile: implement gettimeofday() via vDSO	Chris Metcalf	2013-08-13	22	-62/+732
\| \| \| \| \| \| \| \| \| \| \| \| \|	This change creates the framework for vDSO calls, makes the existing rt_sigreturn() mechanism use it, and adds a fast gettimeofday(). Now that we need to expose the vDSO address to userspace, we add AT_SYSINFO_EHDR to the set of aux entries provided to userspace. (You can disable any extra vDSO support by booting with vdso=0, but the rt_sigreturn vDSO page will still be provided.) Note that glibc has supported the tile vDSO since release 2.17. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
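As an illustration (not part of the commit), userspace can locate the vDSO through the aux entry mentioned above. This is a minimal sketch, assuming a glibc that provides getauxval() (2.16 or later); the helper names vdso_base() and looks_like_elf() are hypothetical.

```c
#include <elf.h>       /* ELFMAG, SELFMAG */
#include <stddef.h>
#include <string.h>
#include <sys/auxv.h>  /* getauxval(), AT_SYSINFO_EHDR */

/* Return the address of the vDSO ELF header, or NULL if the kernel did
 * not supply AT_SYSINFO_EHDR (getauxval() returns 0 in that case). */
static const void *vdso_base(void)
{
    return (const void *)getauxval(AT_SYSINFO_EHDR);
}

/* Hypothetical helper: does this mapping start with the ELF magic? */
static int looks_like_elf(const void *base)
{
    return base != NULL && memcmp(base, ELFMAG, SELFMAG) == 0;
}
```

A program would call vdso_base() once at startup and, if the result passes looks_like_elf(), walk the ELF dynamic section to find exported symbols. In practice glibc does this lookup internally, so applications rarely need to.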
*	tile: support simulator notification for ET_DYN objects	Chris Metcalf	2013-08-13	1	-14/+48
\| \| \| \| \| \| \| \| \| \| \| \|	The tile code notifies the simulator of new ET_EXEC objects starting to execute so that tracing code can properly annotate the objects. However, we didn't support ET_DYN executables like ld.so, so we didn't properly load symbols, etc. This change enables that support; we use a variant of the SIM_CONTROL_DLOPEN simulator notification that newer simulators will recognize and use to set the base address for the next SIM_CONTROL_OS_EXEC notification. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
*	tile: improve illegal translation interrupt handling	Chris Metcalf	2013-08-13	2	-11/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	First, don't re-enable interrupts blindly in the Linux trap handler. We already handle page faults this way; synchronous interrupts like ILL_TRANS will fire even when interrupts are disabled, and we don't want to re-enable interrupts in that case. For ILL_TRANS, we now pass the ILL_VA_PC reason into the trap handler so we can report it properly; this is the address that caused the illegal translation trap. We print the address as part of the pr_alert() message now if it's coming from the kernel. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
*	tile: make register dumps more readable	Chris Metcalf	2013-08-13	1	-10/+10
\| \| \| \| \| \| \| \|	It's much easier to read register dumps if you read vertically rather than horizontally, since the register numbers line up and lead the eye down more than to the right. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
*	tile: improve big-endian support	Chris Metcalf	2013-08-13	3	-31/+23
\| \| \| \| \| \| \| \| \| \| \| \|	First, fix a bug in asm/unaligned.h; we need to just use the asm-generic unaligned.h so we properly choose endian-correct flavors. Second, keep the hv/hypervisor.h ABI fully "native" in the sense that we don't have __BIG_ENDIAN__ ifdefs there. Instead, we use macros in the head_NN.S assembly code to properly extract two 32-bit structure members from a 64-bit register holding the structure. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
*	tile: support CONFIG_PREEMPT	Chris Metcalf	2013-08-13	14	-45/+98
\| \| \| \| \| \| \| \| \| \|	This change adds support for CONFIG_PREEMPT (full kernel preemption). In addition to the core support, this change includes a number of places where we fix up uses of smp_processor_id() and per-cpu variables. I also eliminate the PAGE_HOME_HERE and PAGE_HOME_UNKNOWN values for page homing, as it turns out they weren't being used. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
*	tile: remove calls to arch_flush_lazy_mmu_mode()	Chris Metcalf	2013-08-13	2	-5/+2
\| \| \| \| \| \| \|	Since it's a no-op on tile anyway, there's no reason to be calling it in tile-specific code. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
*	tile: fix some issues in hugepage support	Chris Metcalf	2013-08-13	1	-35/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	First, in huge_pte_offset(), we were erroneously checking pgd_present(), which is always true, rather than pud_present(), which is the thing that tells us if there is a top-level (L0) PTE. Fixing this means we properly look up huge page entries only when the Present bit is actually set in the PTE. Second, use the standard pte_alloc_map() instead of the hand-rolled pte_alloc_hugetlb() routine that basically was written to avoid worrying about CONFIG_HIGHPTE. However, we no longer plan to support HIGHPTE, so a separate routine was just unnecessary code duplication. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
*	tile: add some tile drivers to MAINTAINERS	Chris Metcalf	2013-08-13	1	-2/+6
\| \| \| \| \| \|	Also, alphabetize the existing entries for tile. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
*	tile: avoid recursive backtrace faults	Chris Metcalf	2013-08-13	2	-2/+30
\| \| \| \| \| \| \| \| \| \| \| \| \|	This change adds support for avoiding recursive backtracer crashes; we haven't seen this in practice other than when things are seriously corrupt, but it may help avoid losing the root cause of a crash. Also, don't abort kernel backtracers for invalid userspace PCs. If we do, we lose the ability to backtrace through a userspace call to a bad address above PAGE_OFFSET, even though it can be perfectly reasonable to continue the backtrace in such a case. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
*	tile: fast-path unaligned memory access for tilegx	Chris Metcalf	2013-08-13	15	-69/+1996
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This change enables unaligned userspace memory access via a kernel fast path on tilegx. The kernel tracks user PC/instruction pairs per-thread using a direct-mapped cache in userspace. The cache maps those PC/instruction pairs to JIT'ed instruction sequences that load or store using byte-wide load/store instructions and then synthesize 2-, 4- or 8-byte load or store results. Once an instruction has been seen to generate an unaligned access, subsequent hits on that instruction typically require overhead of only around 50 cycles if the cache and TLB are hot. We support the prctl() PR_GET_UNALIGN / PR_SET_UNALIGN calls to enable or disable unaligned fixups on a per-process basis. To do this we pull some of the tilepro unaligned support out of the single_step.c file; tilepro uses instruction disassembly for both single-step and unaligned access support. Since tilegx actually has hardware singlestep support, though, it's cleaner to keep the tilegx unaligned access code in a separate file. While we're at it, properly rename the tilepro-specific types, etc., to have tilepro suffixes instead of generic tile suffixes. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
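The idea of synthesizing a wide load from byte-wide loads can be sketched in portable C. This is only an illustration of the technique, assuming little-endian byte order; it is not the JIT'ed tilegx sequence, and load_u32_bytewise() is a hypothetical name.

```c
#include <stdint.h>

/* Build a 32-bit little-endian value from four single-byte loads, so the
 * pointer may have any alignment. */
static uint32_t load_u32_bytewise(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

Because every access is one byte wide, no alignment trap can occur; the cost is four loads plus shifts and ORs instead of one load.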
*	tile: remove unnecessary backslashes in asm-offsets.c	Chris Metcalf	2013-08-12	1	-14/+14
\| \| \| \| \| \| \|	Pointed out by checkpatch. A few of the DEFINE() lines were properly written without backslash continuation; fix the rest. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
*	tile: fix tilegx vmalloc_sync_all BUG_ON	Chris Metcalf	2013-08-12	1	-1/+2
\| \| \| \| \| \| \|	As specified, the test wasn't correct, and in any case it should be a BUILD_BUG_ON. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
*	tile: fix comment bug in sys_cmpxchg description	Chris Metcalf	2013-08-12	1	-1/+1
\| \| \| \|	Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
*	tile: support "memmap" boot parameter	Chris Metcalf	2013-08-12	1	-4/+76
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This change adds support for the "memmap" boot parameter similar to what x86 provides. The tile version supports "memmap=1G$5G", for example, as a way to reserve a 1 GB range starting at PA 5GB. The memory is reserved via bootmem during startup, and we create a suitable "struct resource" marked as "Reserved" so you can see the range reported by /proc/iomem. Up to 64 such regions can currently be reserved on the boot command line. We do not support the x86 options "memmap=nn@ss" (force some memory to be available at the given address) since it's pointless to try to have Linux use memory the Tilera hypervisor hasn't given it. We do not support "memmap=nn#ss" to add an ACPI range for later processing, since we don't support ACPI. We do not support "memmap=exactmap" since we don't support reading the e820 information from the BIOS like x86 does. I did add support for "memmap=nn" (and the synonym "mem=nn") which cap the highest PA value at "nn"; these are both just a synonym for the existing tile boot option "maxmem". Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
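To make the "memmap=1G$5G" syntax concrete, here is a userspace sketch of parsing a "size$start" argument. It is loosely modeled on the kernel's memparse() suffix handling but is not the code from this commit; parse_size() and parse_memmap_reserve() are hypothetical names.

```c
#include <stdint.h>
#include <stdlib.h>

/* Parse a number with an optional K, M, or G suffix. *end is set to the
 * first unconsumed character. */
static uint64_t parse_size(const char *s, char **end)
{
    uint64_t v = strtoull(s, end, 0);

    switch (**end) {
    case 'G': case 'g': v <<= 10; /* fall through */
    case 'M': case 'm': v <<= 10; /* fall through */
    case 'K': case 'k': v <<= 10; (*end)++; break;
    default: break;
    }
    return v;
}

/* Parse "size$start". Returns 0 on success, -1 on malformed input. */
static int parse_memmap_reserve(const char *arg, uint64_t *size,
                                uint64_t *start)
{
    char *p;
    const char *q;

    *size = parse_size(arg, &p);
    if (p == arg || *p != '$')
        return -1;              /* no size, or missing '$' separator */
    q = p + 1;
    *start = parse_size(q, &p);
    if (p == q || *p != '\0')
        return -1;              /* no start address, or trailing junk */
    return 0;
}
```

With this sketch, "1G$5G" yields a size of 1 << 30 and a start of 5 << 30, while a bare "1G" is rejected because it lacks the '$' separator.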
*	tile: various console improvements	Chris Metcalf	2013-08-12	7	-48/+186
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This change improves and cleans up the tile console. - We enable HVC_IRQ support on tilegx, with the addition of a new Tilera hypervisor API for tilegx to allow a console IPI. If IPI support is not available we fall back to the previous polling mode. - We simplify the earlyprintk code to use CON_BOOT and eliminate some of the other supporting earlyprintk code. - A new tile_console_write() primitive is used to send output to the console and is factored out of the hvc_tile driver. This lets us support a "sim_console" boot argument to allow using simulator hooks to send output to the "console" as a slightly faster alternative to emulating the hardware more directly. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	tile PCI RC: remove stale include of linux/numa.h	Chris Metcalf	2013-08-06	1	-1/+0
\| \| \| \|	Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
*	tile PCI RC: reduce driver's vmalloc space usage	Chris Metcalf	2013-08-06	1	-2/+8
\| \| \| \| \| \| \| \| \| \|	We can take advantage of the fact that bit 29 is hard-wired to zero in register TRIO_TILE_PIO_REGION_SETUP_CFG_ADDR. This is handy since at the moment we only allocate one 4GB region for vmalloc, and with this change we can allocate four or more TRIO MACs without using up all the vmalloc space. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
*	tile PCI RC: support PCIe TRIO 0 MAC 0 on Gx72 system	Chris Metcalf	2013-08-06	2	-3/+33
\| \| \| \| \| \| \|	On Tilera Gx72 systems, the logic for figuring out whether a given port is root complex is slightly different. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
*	tile PCI DMA: fix bug in non-page-aligned accessors	Chris Metcalf	2013-08-06	1	-2/+2
\| \| \| \| \| \| \| \|	The code incorrectly masked with PAGE_OFFSET instead of PAGE_SIZE-1. This only matters when trying to do a non page-aligned DMA; it was noticed during code inspection. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
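The difference between the two masks can be shown with a small sketch. The page size below is an assumption chosen for illustration, and offset_in_page_ex() is a hypothetical name.

```c
#include <stdint.h>

#define EX_PAGE_SIZE 65536ULL   /* hypothetical page size for illustration */

/* Offset of an address within its page: mask with PAGE_SIZE - 1.
 * Masking with PAGE_OFFSET (the base of the kernel's linear mapping)
 * would select unrelated high bits instead. */
static uint64_t offset_in_page_ex(uint64_t addr)
{
    return addr & (EX_PAGE_SIZE - 1);
}
```

For a page-aligned address the result is zero, which is why the bug only showed up for DMA that was not page-aligned.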
*	tile PCI RC: add dma_get_required_mask()	Chris Metcalf	2013-08-06	2	-0/+22
\| \| \| \| \| \| \| \| \| \| \| \| \|	The standard kernel function dma_get_required_mask() uses the highest DRAM address to determine if 32-bit or 64-bit DMA addressing is needed. This only works on architectures that have direct mapping between the PA and the PCI address space, i.e. those that don't have I/O TLBs or have I/O TLBs but choose to use direct mapping. Neither of these is true for tilegx. Whether to use 64-bit DMA should depend on the PCI device's capability only, not on the amount of DRAM installed, so we now advertise a 64-bit DMA mask unconditionally. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
*	tile PCI RC: use proper accessor function	Chris Metcalf	2013-08-06	1	-13/+11
\| \| \| \| \| \| \| \|	Using the low-level hv_dev_pread() API makes assumptions about the layout of datastructures in the Tilera hypervisor API; it's better to use the gxio_XXX accessor and the pcie_trio_ports_property struct. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
*	tile PCI RC: bomb comments and whitespace format	Chris Metcalf	2013-08-06	1	-124/+56
\| \| \| \| \| \| \|	This change is purely stylistic but improves the readability of the tile PCI RC driver. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
*	tile PCI RC: include pci/pcie/Kconfig	Chris Metcalf	2013-08-06	1	-0/+2
\| \| \| \|	Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
*	tile PCI RC: eliminate pci_controller.mem_resources field	Chris Metcalf	2013-08-06	2	-62/+12
\| \| \| \| \| \| \|	The .mem_resources[] field in the pci_controller struct is now obsoleted by the .mem_space and .io_space fields. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
*	tile PCI RC: restructure TRIO initialization	Chris Metcalf	2013-08-06	2	-92/+118
\| \| \| \| \| \| \| \| \| \|	The TRIO shim initialization is shared with other kernel drivers such as the endpoint and StreamIO drivers, so reorganize the initialization flow to ensure that the root complex driver properly initializes TRIO state regardless of what kind of TRIO driver will end up using the shim. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
*	tile PCI DMA: handle a NULL dev argument properly	Chris Metcalf	2013-08-06	1	-2/+3
\| \| \| \|	Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
*	tile PCI RC: support I/O space access	Chris Metcalf	2013-08-06	4	-18/+257
\| \| \| \| \| \| \| \| \| \|	To enable this functionality, configure CONFIG_TILE_PCI_IO. Without this flag, the kernel still assigns I/O address ranges to the devices, but no TRIO resource and mapping support is provided. We assign disjoint I/O address ranges to separate PCIe domains. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
*	tile PCI RC: gentler warning for missing plug-in PCI	Chris Metcalf	2013-08-06	2	-4/+11
\| \| \| \| \| \| \|	Besides using pr_info() to print the linkdown status for a plug-in slot, add extra indication that this is expected if the slot is empty. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
*	tile PCI RC: support more MSI-X interrupt vectors	Chris Metcalf	2013-08-06	4	-20/+106
\| \| \| \| \| \| \| \| \|	To support PCIe devices with a higher number of MSI-X interrupt vectors, e.g. 16 for the LSI RAID card, enhance the Gx RC stack to provide more MSI-X vectors by using the TRIO Scatter Queues, which provide 8 more vectors in addition to ~10 from the Map Mem regions. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
*	tile: support LSI MEGARAID SAS HBA hybrid dma_ops	Chris Metcalf	2013-08-06	2	-12/+38
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The LSI MEGARAID SAS HBA suffers from the problem where it can do 64-bit DMA to streaming buffers but not to consistent buffers. In other words, 64-bit DMA is used for disk data transfers and 32-bit DMA must be used for control message transfers. According to LSI, the firmware is not fully functional yet. This change implements a kind of hybrid dma_ops to support this. Note that on most other platforms, the 64-bit DMA addressing space is the same as the 32-bit DMA space and they overlap the physical memory space. No special arrangement is needed to support this kind of mixed DMA capability. On TILE-Gx, the 64-bit DMA space is completely separate from the 32-bit DMA space. Due to the use of the IOMMU, the 64-bit DMA space doesn't overlap the physical memory space. On the other hand, the 32-bit DMA space overlaps the physical memory space under 4GB. The separate address spaces make it necessary to have separate dma_ops. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
*	tile PCI RC: handle case that PCI link is already up	Chris Metcalf	2013-08-05	1	-13/+28
\| \| \| \| \| \| \| \|	If we are rebooting (e.g. via kexec) then the PCI RC link may already be up. In that case, we don't want to do the software fixup to force the link up, since that can degrade it to Gen1. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
*	tile PCI RC: tweak the pcie_rc_delay support	Chris Metcalf	2013-08-05	1	-19/+16
\| \| \| \| \| \| \|	Allow longer delays if requested, and print the info messages as we are performing the delay, not when parsing the arguments. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
*	tile PCI RC: support pci=off boot arg for tilepro	Chris Metcalf	2013-08-05	1	-0/+17
\| \| \| \|	Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
*	tile PCI RC: tilepro conflict with PCI and RAM addresses	Chris Metcalf	2013-08-05	1	-4/+5
\| \| \| \| \| \| \|	Fix a bug in the tilepro PCI resource allocation code that could make the bootmem allocator unhappy if 4GB is installed on mshim 0. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
*	tile PCI RC: cleanups for tilepro PCI RC	Chris Metcalf	2013-08-05	2	-14/+3
\| \| \| \| \| \| \| \| \|	- remove unneeded <linux/bootmem.h> include in pci.c - eliminate unused pci_controller.first_busno field - prefer msleep to mdelay - remove stale comment about pci_scan_bus_parented() Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
*	tile: optimize strnlen using SIMD instructions	Ken Steele	2013-08-02	4	-1/+98
\| \| \| \| \| \| \|	Using strlen as a model, add length checking to create strnlen. Signed-off-by: Ken Steele <ken@tilera.com> Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
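For reference, the semantics that the optimized routine has to preserve can be written as a plain scalar loop. This sketch is not the SIMD implementation from the commit; strnlen_ref() is a hypothetical name.

```c
#include <stddef.h>

/* Scalar reference: length of s, but never look past maxlen bytes. */
static size_t strnlen_ref(const char *s, size_t maxlen)
{
    size_t n = 0;

    while (n < maxlen && s[n] != '\0')
        n++;
    return n;
}
```

The extra "n < maxlen" test is the length checking that distinguishes strnlen from strlen; a word-at-a-time version has to apply the same bound to each chunk it loads.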
*	tile: optimize and clean up string functions	Chris Metcalf	2013-08-01	8	-84/+212
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This change cleans up the string code in a number of ways: - For memcpy(), fix bug in prefetch and increase distance to 3 lines; optimize for unaligned data; do all loads before wh64 to make memcpy safe for forward-overlapping calls; etc. Performance is improved. - Use new copy_byte() function on tilegx to spread a single byte value out into a full word using the shufflebytes instruction. - Clean up header include ordering to be more canonical, and remove spurious #undefs of function names. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
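Spreading one byte across a full word, as copy_byte() does, has a well-known portable equivalent using a multiply. The sketch below shows that equivalent; it does not use the tilegx shufflebytes instruction, and copy_byte_ex() is a hypothetical name.

```c
#include <stdint.h>

/* Replicate one byte into all eight byte lanes of a 64-bit word.
 * Multiplying by 0x0101...01 adds the byte at every 8-bit offset. */
static uint64_t copy_byte_ex(uint8_t b)
{
    return (uint64_t)b * 0x0101010101010101ULL;
}
```

A memset()-style routine can then store the resulting word repeatedly instead of storing one byte at a time.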
*	tile: convert uses of "inv" to "finv"	Chris Metcalf	2013-07-31	9	-103/+32
\| \| \| \| \| \| \| \| \| \|	The "inv" (invalidate) instruction is generally less safe than "finv" (flush and invalidate), as it will drop dirty data from the cache. It turns out we have almost no need for "inv" (other than for the older 32-bit architecture in some limited cases), so convert to "finv" where possible and delete the extra "inv" infrastructure. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
*	tile: various minor cleanups to hardwall subsystem	Chris Metcalf	2013-07-31	2	-15/+18
\| \| \| \| \| \| \| \| \| \|	First, clean up active hardwalls in exit_thread(). This is a better place than in arch_release_thread_info(). Second, mask out any non-online cpus from the cpumask after validating any required semantics of the cpu set. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
*	edac: Remove redundant platform_set_drvdata()	Sachin Kamat	2013-07-17	1	-1/+0
\| \| \| \| \| \| \| \| \|	Commit 0998d06310 (device-core: Ensure drvdata = NULL when no driver is bound) removes the need to set driver data field to NULL. Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org> Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
*	arch: tile: include: asm: add cmpxchg64() definition	Chen Gang	2013-07-17	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We need to add cmpxchg64(), or compilation fails. Define it as cmpxchg() for 64-bit builds only, since cmpxchg() can support 8 bytes there. The related error (with allmodconfig): drivers/block/blockconsole.c: In function ‘bcon_advance_console_bytes’: drivers/block/blockconsole.c:164:2: error: implicit declaration of function ‘cmpxchg64’ [-Werror=implicit-function-declaration] Signed-off-by: Chen Gang <gang.chen@asianux.com> Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
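For readers unfamiliar with the primitive, here is a userspace analogue of a 64-bit compare-and-exchange. It is a sketch using the GCC/Clang __atomic builtins, not the tile kernel definition; cmpxchg64_ex() is a hypothetical name.

```c
#include <stdint.h>

/* Atomically: if *ptr == old, store new_val. Returns the value that was
 * in *ptr before the operation, following the kernel's cmpxchg()
 * convention. Uses a GCC/Clang builtin (an assumption of this sketch). */
static uint64_t cmpxchg64_ex(uint64_t *ptr, uint64_t old, uint64_t new_val)
{
    uint64_t expected = old;

    /* On failure the builtin writes the current value into 'expected'. */
    __atomic_compare_exchange_n(ptr, &expected, new_val, 0,
                                __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
    return expected;
}
```

A caller detects success by comparing the return value with the "old" argument it passed in.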
*	Linux 3.11-rc1 (v3.11-rc1)	Linus Torvalds	2013-07-15	2	-1604/+883
\|
*	Merge branch 'slab/for-linus' of ↵	Linus Torvalds	2013-07-15	8	-69/+121
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux Pull slab update from Pekka Enberg: "Highlights: - Fix for boot-time problems on some architectures due to init_lock_keys() not respecting kmalloc_caches boundaries (Christoph Lameter) - CONFIG_SLUB_CPU_PARTIAL requested by RT folks (Joonsoo Kim) - Fix for excessive slab freelist draining (Wanpeng Li) - SLUB and SLOB cleanups and fixes (various people)" I ended up editing the branch, and this avoids two commits at the end that were immediately reverted, and I instead just applied the oneliner fix in between myself. * 'slab/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux slub: Check for page NULL before doing the node_match check mm/slab: Give s_next and s_stop slab-specific names slob: Check for NULL pointer before calling ctor() slub: Make cpu partial slab support configurable slab: add kmalloc() to kernel API documentation slab: fix init_lock_keys slob: use DIV_ROUND_UP where possible slub: do not put a slab to cpu partial list when cpu_partial is 0 mm/slub: Use node_nr_slabs and node_nr_objs in get_slabinfo mm/slub: Drop unnecessary nr_partials mm/slab: Fix /proc/slabinfo unwriteable for slab mm/slab: Sharing s_next and s_stop between slab and slub mm/slab: Fix drain freelist excessively slob: Rework #ifdeffery in slab.h mm, slab: moved kmem_cache_alloc_node comment to correct place
\| *	slub: Check for page NULL before doing the node_match check	Steven Rostedt	2013-07-15	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In the -rt kernel (mrg), we hit the following dump: BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffff811573f1>] kmem_cache_alloc_node+0x51/0x180 PGD a2d39067 PUD b1641067 PMD 0 Oops: 0000 [#1] PREEMPT SMP Modules linked in: sunrpc cpufreq_ondemand ipv6 tg3 joydev sg serio_raw pcspkr k8temp amd64_edac_mod edac_core i2c_piix4 e100 mii shpchp ext4 mbcache jbd2 sd_mod crc_t10dif sr_mod cdrom sata_svw ata_generic pata_acpi pata_serverworks radeon ttm drm_kms_helper drm hwmon i2c_algo_bit i2c_core dm_mirror dm_region_hash dm_log dm_mod CPU 3 Pid: 20878, comm: hackbench Not tainted 3.6.11-rt25.14.el6rt.x86_64 #1 empty empty/Tyan Transport GT24-B3992 RIP: 0010:[<ffffffff811573f1>] [<ffffffff811573f1>] kmem_cache_alloc_node+0x51/0x180 RSP: 0018:ffff8800a9b17d70 EFLAGS: 00010213 RAX: 0000000000000000 RBX: 0000000001200011 RCX: ffff8800a06d8000 RDX: 0000000004d92a03 RSI: 00000000000000d0 RDI: ffff88013b805500 RBP: ffff8800a9b17dc0 R08: ffff88023fd14d10 R09: ffffffff81041cbd R10: 00007f4e3f06e9d0 R11: 0000000000000246 R12: ffff88013b805500 R13: ffff8801ff46af40 R14: 0000000000000001 R15: 0000000000000000 FS: 00007f4e3f06e700(0000) GS:ffff88023fd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000000 CR3: 00000000a2d3a000 CR4: 00000000000007e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process hackbench (pid: 20878, threadinfo ffff8800a9b16000, task ffff8800a06d8000) Stack: ffff8800a9b17da0 ffffffff81202e08 ffff8800a9b17de0 000000d001200011 
0000000001200011 0000000001200011 0000000000000000 0000000000000000 00007f4e3f06e9d0 0000000000000000 ffff8800a9b17e60 ffffffff81041cbd Call Trace: [<ffffffff81202e08>] ? current_has_perm+0x68/0x80 [<ffffffff81041cbd>] copy_process+0xdd/0x15b0 [<ffffffff810a2125>] ? rt_up_read+0x25/0x30 [<ffffffff8104369a>] do_fork+0x5a/0x360 [<ffffffff8107c66b>] ? migrate_enable+0xeb/0x220 [<ffffffff8100b068>] sys_clone+0x28/0x30 [<ffffffff81527423>] stub_clone+0x13/0x20 [<ffffffff81527152>] ? system_call_fastpath+0x16/0x1b Code: 89 fc 89 75 cc 41 89 d6 4d 8b 04 24 65 4c 03 04 25 48 ae 00 00 49 8b 50 08 4d 8b 28 49 8b 40 10 4d 85 ed 74 12 41 83 fe ff 74 27 <48> 8b 00 48 c1 e8 3a 41 39 c6 74 1b 8b 75 cc 4c 89 c9 44 89 f2 RIP [<ffffffff811573f1>] kmem_cache_alloc_node+0x51/0x180 RSP <ffff8800a9b17d70> CR2: 0000000000000000 ---[ end trace 0000000000000002 ]--- Now, this uses SLUB pretty much unmodified, but as it is the -rt kernel with CONFIG_PREEMPT_RT set, spinlocks are mutexes, although they do disable migration. But the SLUB code is relatively lockless, and the spin_locks there are raw_spin_locks (not converted to mutexes), thus I believe this bug can happen in mainline without -rt features. The -rt patch is just good at triggering mainline bugs ;-) Anyway, looking at where this crashed, it seems that the page variable can be NULL when passed to the node_match() function (which does not check if it is NULL). When this happens we get the above panic. As page is only used in slab_alloc() to check if the node matches, if it's NULL I'm assuming that we can say it doesn't and call the __slab_alloc() code. Is this a correct assumption? Acked-by: Christoph Lameter <cl@linux.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Pekka Enberg <penberg@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
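The shape of the fix described above can be sketched in userspace C. The types and names here (struct ex_page, ex_node_match(), ex_fastpath_ok()) are hypothetical stand-ins, not the SLUB code; the point is only that the NULL test must come before the helper that dereferences the pointer.

```c
#include <stddef.h>

struct ex_page { int node; };   /* hypothetical stand-in for struct page */

/* Like the helper in the message, this does not check page for NULL.
 * A node of -1 means "any node" in this sketch. */
static int ex_node_match(const struct ex_page *page, int node)
{
    return node == -1 || page->node == node;
}

/* Nonzero when the lockless fast path may be used. A NULL page is
 * treated as "no match", which sends the caller to the slow path. */
static int ex_fastpath_ok(const void *object, const struct ex_page *page,
                          int node)
{
    return object != NULL && page != NULL && ex_node_match(page, node);
}
```

Short-circuit evaluation of && guarantees ex_node_match() is never called with a NULL page.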
\| *	mm/slab: Give s_next and s_stop slab-specific names	Wanpeng Li	2013-07-08	3	-8/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Give s_next and s_stop slab-specific names instead of exporting "s_next" and "s_stop". Signed-off-by: Wanpeng Li <liwanp@linux.vnet.ibm.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
\| *	slob: Check for NULL pointer before calling ctor()	Steven Rostedt	2013-07-07	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	While doing some code inspection, I noticed that the slob constructor method can be called with a NULL pointer. If memory is tight and slob fails to allocate with slob_alloc() or slob_new_pages() it still calls the ctor() method with a NULL pointer. Looking at the first ctor() method I found, I noticed that it can not handle a NULL pointer (I'm sure others probably can't either): static void sighand_ctor(void *data) { struct sighand_struct *sighand = data; spin_lock_init(&sighand->siglock); init_waitqueue_head(&sighand->signalfd_wqh); } The solution is to only call the ctor() method if allocation succeeded. Acked-by: Christoph Lameter <cl@linux.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Pekka Enberg <penberg@kernel.org>
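The rule "only call the constructor if allocation succeeded" can be sketched outside the kernel as follows. Everything here is a hypothetical illustration (struct ex_cache, ex_cache_alloc(), and the two small helpers used to exercise it), not the SLOB code.

```c
#include <stddef.h>
#include <stdlib.h>

struct ex_cache {
    size_t size;
    void (*ctor)(void *);       /* optional constructor, may be NULL */
};

/* Allocate one object and run the constructor only on success. */
static void *ex_cache_alloc(const struct ex_cache *c,
                            void *(*alloc)(size_t))
{
    void *b = alloc(c->size);

    if (b != NULL && c->ctor != NULL)
        c->ctor(b);
    return b;
}

/* Helpers for demonstration: a constructor that counts its calls and an
 * allocator that always fails. */
static int ex_ctor_calls;
static void ex_count_ctor(void *obj) { (void)obj; ex_ctor_calls++; }
static void *ex_failing_alloc(size_t n) { (void)n; return NULL; }
```

Passing ex_failing_alloc leaves ex_ctor_calls unchanged, while passing malloc runs the constructor exactly once per successful allocation.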
\| *	slub: Make cpu partial slab support configurable	Joonsoo Kim	2013-07-07	2	-6/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	CPU partial support can introduce a level of indeterminism that is not wanted in certain contexts (like a realtime kernel). Make it configurable. This patch is based on Christoph Lameter's "slub: Make cpu partial slab support configurable V2". Acked-by: Christoph Lameter <cl@linux.com> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
\| *	slab: add kmalloc() to kernel API documentation	Michael Opdenacker	2013-07-07	2	-12/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	At the moment, kmalloc() isn't even listed in the kernel API documentation (DocBook/kernel-api.html after running "make htmldocs"). Another issue is that the documentation for kmalloc_node() refers to kcalloc()'s documentation to describe its 'flags' parameter, while kcalloc() referred to kmalloc()'s documentation, which doesn't exist! This patch is a proposed fix for this. It also removes the documentation for kmalloc() in include/linux/slob_def.h which isn't included to generate the documentation anyway. This way, kmalloc() is described in only one place. Acked-by: Christoph Lameter <cl@linux.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Michael Opdenacker <michael.opdenacker@free-electrons.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
\| *	slab: fix init_lock_keys	Christoph Lameter	2013-07-07	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some architectures (e.g. powerpc built with CONFIG_PPC_256K_PAGES=y CONFIG_FORCE_MAX_ZONEORDER=11) get PAGE_SHIFT + MAX_ORDER > 26. In 3.10 kernels, CONFIG_LOCKDEP=y with PAGE_SHIFT + MAX_ORDER > 26 makes init_lock_keys() dereference beyond kmalloc_caches[26]. This leads to an unbootable system (kernel panic at initializing SLAB) if one of kmalloc_caches[26...PAGE_SHIFT+MAX_ORDER-1] is not NULL. Fix this by making sure that init_lock_keys() does not dereference beyond the kmalloc_caches[26] array. Signed-off-by: Christoph Lameter <cl@linux.com> Reported-by: Tetsuo Handa <penguin-kernel@I-Love.SAKURA.ne.jp> Cc: Pekka Enberg <penberg@kernel.org> Cc: <stable@vger.kernel.org> [3.10.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Pekka Enberg <penberg@kernel.org>
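The class of bug described here, a loop bound computed from one set of constants walking an array sized by another, can be sketched as below. The array size of 26 entries (indices 0 to 25) is an assumption chosen to match the kmalloc_caches[26] mentioned in the message, and ex_count_caches() is a hypothetical function, not init_lock_keys().

```c
#include <stddef.h>

#define EX_NR_CACHES 26   /* assumed array size: valid indices 0..25 */

/* Count non-NULL entries, starting at index 1. The requested upper bound
 * is clamped to the array size so that a larger PAGE_SHIFT + MAX_ORDER
 * style value can never index past the end. */
static size_t ex_count_caches(void *const caches[EX_NR_CACHES],
                              size_t requested_upper)
{
    size_t upper = requested_upper < EX_NR_CACHES ? requested_upper
                                                  : EX_NR_CACHES;
    size_t n = 0;

    for (size_t i = 1; i < upper; i++)
        if (caches[i] != NULL)
            n++;
    return n;
}
```

Without the clamp, a requested bound of 30 would read four slots past the array, which is the kind of access that produced the boot failure.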