summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* ARM: 8046/1: proc: add support for the Cortex-A17 processorWill Deacon2014-05-262-0/+12
| | | | | | | | Cortex-A17 has identical initialisation requirements to Cortex-A12, so hook it up in proc-v7.S in the same way. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: 8028/1: move __fixup_smp out of init sectionRob Herring2014-05-261-1/+1
| | | | | | | | | | | With large kernel builds such as allyesconfig exceeding maximum relative branch offsets, the init section will be too far away to branch to directly. This causes veneers to be added by the linker, but veneers don't work before the MMU is enabled. Fix this by moving __fixup_smp to the .head.text section as it is not very big. Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: stacktrace: include exception PC value in stacktrace outputRussell King2014-05-221-0/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | When we unwind through an exception stack, include the saved PC value into the stack trace: this fills in an otherwise missed functions from the trace (as indicated below): [<c03f4424>] fec_enet_interrupt+0xa0/0xe8 [<c0066c0c>] handle_irq_event_percpu+0x68/0x228 [<c0066e18>] handle_irq_event+0x4c/0x6c [<c006a024>] handle_fasteoi_irq+0xac/0x198 [<c00664b0>] generic_handle_irq+0x4c/0x60 [<c000f014>] handle_IRQ+0x40/0x98 [<c0008554>] gic_handle_irq+0x30/0x64 [<c0012900>] __irq_svc+0x40/0x50 [<c0029030>] __do_softirq+0xe0/0x2fc <==== [<c0029500>] irq_exit+0xb0/0x100 [<c000f018>] handle_IRQ+0x44/0x98 [<c0008554>] gic_handle_irq+0x30/0x64 [<c0012900>] __irq_svc+0x40/0x50 [<c000f34c>] arch_cpu_idle+0x30/0x38 <==== [<c005e1e4>] cpu_startup_entry+0xac/0x214 [<c066297c>] rest_init+0x68/0x80 [<c08ccb10>] start_kernel+0x2fc/0x358 Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: stacktrace: avoid listing stacktrace functions in stacktraceRussell King2014-05-221-5/+13
| | | | | | | | | | | | | | | | | | | | While debugging the FEC ethernet driver using stacktrace, it was noticed that the stacktraces always begin as follows: [<c00117b4>] save_stack_trace_tsk+0x0/0x98 [<c0011870>] save_stack_trace+0x24/0x28 ... This is because the stack trace code includes the stack frames for itself. This is incorrect behaviour, and also leads to "skip" doing the wrong thing (which is the number of stack frames to avoid recording.) Perversely, it does the right thing when passed a non-current thread. Fix this by ensuring that we have a known constant number of frames above the main stack trace function, and always skip these. Cc: <stable@vger.kernel.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: dma-mapping: avoid calling dma_cache_maint_page() on dev=>cpuRussell King2014-05-221-3/+4
| | | | | | | | | | | | | Avoid calling dma_cache_maint_page() when unmapping a DMA_TO_DEVICE buffer. The L1 cache ops never do anything in this circumstance, nor do they ever need to - all that matters for this case is that the data written is visible to the device before DMA starts. What happens during the transfer (provided the buffer is not written to) is of no real consequence. We already do this optimisation for the L2 cache. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: 8037/1: mm: support big-endian page tablesJianguo Wu2014-04-251-5/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When enable LPAE and big-endian in a hisilicon board, while specify mem=384M mem=512M@7680M, will get bad page state: Freeing unused kernel memory: 180K (c0466000 - c0493000) BUG: Bad page state in process init pfn:fa442 page:c7749840 count:0 mapcount:-1 mapping: (null) index:0x0 page flags: 0x40000400(reserved) Modules linked in: CPU: 0 PID: 1 Comm: init Not tainted 3.10.27+ #66 [<c000f5f0>] (unwind_backtrace+0x0/0x11c) from [<c000cbc4>] (show_stack+0x10/0x14) [<c000cbc4>] (show_stack+0x10/0x14) from [<c009e448>] (bad_page+0xd4/0x104) [<c009e448>] (bad_page+0xd4/0x104) from [<c009e520>] (free_pages_prepare+0xa8/0x14c) [<c009e520>] (free_pages_prepare+0xa8/0x14c) from [<c009f8ec>] (free_hot_cold_page+0x18/0xf0) [<c009f8ec>] (free_hot_cold_page+0x18/0xf0) from [<c00b5444>] (handle_pte_fault+0xcf4/0xdc8) [<c00b5444>] (handle_pte_fault+0xcf4/0xdc8) from [<c00b6458>] (handle_mm_fault+0xf4/0x120) [<c00b6458>] (handle_mm_fault+0xf4/0x120) from [<c0013754>] (do_page_fault+0xfc/0x354) [<c0013754>] (do_page_fault+0xfc/0x354) from [<c0008400>] (do_DataAbort+0x2c/0x90) [<c0008400>] (do_DataAbort+0x2c/0x90) from [<c0008fb4>] (__dabt_usr+0x34/0x40) The bad pfn:fa442 is not system memory(mem=384M mem=512M@7680M), after debugging, I find in page fault handler, will get wrong pfn from pte just after set pte, as follow: do_anonymous_page() { ... set_pte_at(mm, address, page_table, entry); //debug code pfn = pte_pfn(entry); pr_info("pfn:0x%lx, pte:0x%llxn", pfn, pte_val(entry)); //read out the pte just set new_pte = pte_offset_map(pmd, address); new_pfn = pte_pfn(*new_pte); pr_info("new pfn:0x%lx, new pte:0x%llxn", pfn, pte_val(entry)); ... } pfn: 0x1fa4f5, pte:0xc00001fa4f575f new_pfn:0xfa4f5, new_pte:0xc00000fa4f5f5f //new pfn/pte is wrong. The bug is happened in cpu_v7_set_pte_ext(ptep, pte): An LPAE PTE is a 64bit quantity, passed to cpu_v7_set_pte_ext in the r2 and r3 registers. On an LE kernel, r2 contains the LSB of the PTE, and r3 the MSB. On a BE kernel, the assignment is reversed. Unfortunately, the current code always assumes the LE case, leading to corruption of the PTE when clearing/setting bits. This patch fixes this issue much like it has been done already in the cpu_v7_switch_mm case. CC stable <stable@vger.kernel.org> Signed-off-by: Jianguo Wu <wujianguo@huawei.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: 8036/1: Enable IRQs before attempting to read user space in __und_usrCatalin Marinas2014-04-254-8/+10
| | | | | | | | | | | | | | | | | | The Undef abort handler in the kernel reads the undefined instruction from user space. If the page table was modified from another CPU, the user access could fail and do_page_fault() will be executed with interrupts disabled. This can potentially deadlock on ARM11MPCore or on Cortex-A15 with erratum 798181 workaround enabled (both implying IPI for TLB maintenance with page table lock held). This patch enables the IRQs in __und_usr before attempting to read the instruction from user space. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Arun KS <getarunks@gmail.com> Cc: Hartley Sweeten <hsweeten@visionengravers.com> Cc: Ryan Mallon <rmallon@gmail.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: 8035/1: Disable preemption in crunch_task_enable()Catalin Marinas2014-04-251-2/+10
| | | | | | | | | | This patch is in preparation for calling the crunch_task_enable() function with interrupts enabled. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Hartley Sweeten <hsweeten@visionengravers.com> Cc: Ryan Mallon <rmallon@gmail.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: 8034/1: Disable preemption in iwmmxt_task_enable()Catalin Marinas2014-04-251-3/+11
| | | | | | | | This patch is in preparation for calling the iwmmxt_task_enable() function with interrupts enabled. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: 8031/2: change fixmap mapping region to support 32 CPUsLiu Hua2014-04-235-21/+29
| | | | | | | | | | | | | In 32-bit ARM systems, the fixmap mapping region can support no more than 14 CPUs(total: 896k; one CPU: 64K). And we can configure NR_CPUS up to 32. So there is a mismatch. This patch moves fixmapping region downwards to region 0xffc00000- 0xffe00000. Then the fixmap mapping region can support up to 32 CPUs. Reviewed-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Liu Hua <sdu.liu@huawei.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: 8031/1: fixmap: remove FIX_KMAP_BEGIN and FIX_KMAP_ENDLiu Hua2014-04-232-6/+5
| | | | | | | | | | | | | It seems that these two macros are not used by non architecture specific code. And on ARM FIX_KMAP_BEGIN equals zero. This patch removes these two macros. Instead, using FIX_KMAP_NR_PTES to tell the pte number belonged to fixmap mapping region. The code will become clearer when I introduce a bugfix on fixmap mapping region. Reviewed-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Liu Hua <sdu.liu@huawei.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: 8013/1: PJ4B: Add cpu_suspend/cpu_resume hooks for PJ4BGregory CLEMENT2014-04-231-3/+25
| | | | | | | | | PJ4B needs extra instructions for suspend and resume, so instead of using the armv7 version, this commit introduces specific versions for PJ4B. Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: 8023/1: remove remnants of the static DMA mappingNicolas Pitre2014-04-232-9/+0
| | | | | | | | | | | | It looks like the static mapping area for DMA was replaced by dynamic allocation into the vmalloc area by commit e9da6e9905e6 but the information in Documentation/arm/memory.txt was not removed accordingly. CONSISTENT_END in arch/arm/include/asm/memory.h has no more users and can be removed as well. Signed-off-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: 8022/1: ftrace: work with CONFIG_DEBUG_SET_MODULE_RONXRabin Vincent2014-04-231-0/+13
| | | | | | | | | | | Make ftrace work with CONFIG_DEBUG_SET_MODULE_RONX by making module text writable around the place where ftrace does its work, like it is done on x86 in the patch which introduced CONFIG_DEBUG_SET_MODULE_RONX, 84e1c6bb38eb ("x86: Add RO/NX protection for loadable kernel modules"). Tested-by: Mitchel Humpherys <mitchelh@codeaurora.org> Signed-off-by: Rabin Vincent <rabin@rab.in> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: 8011/1: ARM hibernation / suspend-to-diskSebastian Capella2014-04-234-0/+115
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Enable hibernation for ARM architectures and provide ARM architecture specific calls used during hibernation. The swsusp hibernation framework depends on the platform first having functional suspend/resume. Then, in order to enable hibernation on a given platform, a platform_hibernation_ops structure may need to be registered with the system in order to save/restore any SoC-specific / cpu specific state needing (re)init over a suspend-to-disk/resume-from-disk cycle. For example: - "secure" SoCs that have different sets of control registers and/or different CR reg access patterns. - SoCs with L2 caches as the activation sequence there is SoC-dependent; a full off-on cycle for L2 is not done by the hibernation support code. - SoCs requiring steps on wakeup _before_ the "generic" parts done by cpu_suspend / cpu_resume can work correctly. - SoCs having persistent state which is maintained during suspend and resume, but will be lost during the power off cycle after suspend-to-disk. This is a rebase/rework of Frank Hofmann's v5 hibernation patchset. Acked-by: Russ Dill <Russ.Dill@ti.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Signed-off-by: Sebastian Capella <sebastian.capella@linaro.org> Acked-by: Pavel Machek <pavel@ucw.cz> Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> [fixed duplicate virt_to_pfn() definition --rmk] Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: 8008/1: topology: Coding style fixesMark Brown2014-04-151-4/+4
| | | | | | | Use kcalloc() and ULONG_MAX rather than open coding them. Signed-off-by: Mark Brown <broonie@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* Linux 3.15-rc1v3.15-rc1Linus Torvalds2014-04-131-2/+2
|
* mm: Initialize error in shmem_file_aio_read()Geert Uytterhoeven2014-04-131-1/+1
| | | | | | | | | | | | | | | | | Some versions of gcc even warn about it: mm/shmem.c: In function ‘shmem_file_aio_read’: mm/shmem.c:1414: warning: ‘error’ may be used uninitialized in this function If the loop is aborted during the first iteration by one of the two first break statements, error will be uninitialized. Introduced by commit 6e58e79db8a1 ("introduce copy_page_to_iter, kill loop over iovec in generic_file_aio_read()"). Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* cifs: Use min_t() when comparing "size_t" and "unsigned long"Geert Uytterhoeven2014-04-131-1/+1
| | | | | | | | | | | | | | | | On 32 bit, size_t is "unsigned int", not "unsigned long", causing the following warning when comparing with PAGE_SIZE, which is always "unsigned long": fs/cifs/file.c: In function ‘cifs_readdata_to_iov’: fs/cifs/file.c:2757: warning: comparison of distinct pointer types lacks a cast Introduced by commit 7f25bba819a3 ("cifs_iovec_read: keep iov_iter between the calls of cifs_readdata_to_iov()"), which changed the signedness of "remaining" and the code from min_t() to min(). Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'slab/next' of ↵Linus Torvalds2014-04-135-84/+128
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux Pull slab changes from Pekka Enberg: "The biggest change is byte-sized freelist indices which reduces slab freelist memory usage: https://lkml.org/lkml/2013/12/2/64" * 'slab/next' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux: mm: slab/slub: use page->list consistently instead of page->lru mm/slab.c: cleanup outdated comments and unify variables naming slab: fix wrongly used macro slub: fix high order page allocation problem with __GFP_NOFAIL slab: Make allocations with GFP_ZERO slightly more efficient slab: make more slab management structure off the slab slab: introduce byte sized index for the freelist of a slab slab: restrict the number of objects in a slab slab: introduce helper functions to get/set free object slab: factor out calculate nr objects in cache_estimate
| * mm: slab/slub: use page->list consistently instead of page->lruDave Hansen2014-04-113-8/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 'struct page' has two list_head fields: 'lru' and 'list'. Conveniently, they are unioned together. This means that code can use them interchangably, which gets horribly confusing like with this nugget from slab.c: > list_del(&page->lru); > if (page->active == cachep->num) > list_add(&page->list, &n->slabs_full); This patch makes the slab and slub code use page->lru universally instead of mixing ->list and ->lru. So, the new rule is: page->lru is what the you use if you want to keep your page on a list. Don't like the fact that it's not called ->list? Too bad. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Christoph Lameter <cl@linux.com> Acked-by: David Rientjes <rientjes@google.com> Cc: Pekka Enberg <penberg@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Pekka Enberg <penberg@kernel.org>
| * mm/slab.c: cleanup outdated comments and unify variables namingJianyu Zhan2014-04-011-34/+32
| | | | | | | | | | | | | | | | | | | | | | | | As time goes, the code changes a lot, and this leads to that some old-days comments scatter around , which instead of faciliating understanding, but make more confusion. So this patch cleans up them. Also, this patch unifies some variables naming. Acked-by: Christoph Lameter <cl@linux.com> Signed-off-by: Jianyu Zhan <nasa4836@gmail.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
| * slab: fix wrongly used macroJoonsoo Kim2014-04-011-11/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 'slab: restrict the number of objects in a slab' uses __builtin_constant_p() on #if macro. It is wrong usage of builtin function, but it is compiled on x86 without any problem, so I can't find it before 0 day build system find it. This commit fixes the situation by using KMALLOC_MIN_SIZE, instead of KMALLOC_SHIFT_LOW. KMALLOC_SHIFT_LOW is parsed to ilog2() on some architecture and this ilog2() uses __builtin_constant_p() and results in the problem. This problem would disappear by using KMALLOC_MIN_SIZE, since it is just constant. Tested-by: David Rientjes <rientjes@google.com> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
| * slub: fix high order page allocation problem with __GFP_NOFAILJoonsoo Kim2014-03-271-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SLUB already try to allocate high order page with clearing __GFP_NOFAIL. But, when allocating shadow page for kmemcheck, it missed clearing the flag. This trigger WARN_ON_ONCE() reported by Christian Casteyde. https://bugzilla.kernel.org/show_bug.cgi?id=65991 https://lkml.org/lkml/2013/12/3/764 This patch fix this situation by using same allocation flag as original allocation. Reported-by: Christian Casteyde <casteyde.christian@free.fr> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
| * slab: Make allocations with GFP_ZERO slightly more efficientJoe Perches2014-02-081-8/+8
| | | | | | | | | | | | | | | | | | | | Use the likely mechanism already around valid pointer tests to better choose when to memset to 0 allocations with __GFP_ZERO Acked-by: Christoph Lameter <cl@linux.com> Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
| * slab: make more slab management structure off the slabJoonsoo Kim2014-02-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now, the size of the freelist for the slab management diminish, so that the on-slab management structure can waste large space if the object of the slab is large. Consider a 128 byte sized slab. If on-slab is used, 31 objects can be in the slab. The size of the freelist for this case would be 31 bytes so that 97 bytes, that is, more than 75% of object size, are wasted. In a 64 byte sized slab case, no space is wasted if we use on-slab. So set off-slab determining constraint to 128 bytes. Acked-by: Christoph Lameter <cl@linux.com> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
| * slab: introduce byte sized index for the freelist of a slabJoonsoo Kim2014-02-081-9/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, the freelist of a slab consist of unsigned int sized indexes. Since most of slabs have less number of objects than 256, large sized indexes is needless. For example, consider the minimum kmalloc slab. It's object size is 32 byte and it would consist of one page, so 256 indexes through byte sized index are enough to contain all possible indexes. There can be some slabs whose object size is 8 byte. We cannot handle this case with byte sized index, so we need to restrict minimum object size. Since these slabs are not major, wasted memory from these slabs would be negligible. Some architectures' page size isn't 4096 bytes and rather larger than 4096 bytes (One example is 64KB page size on PPC or IA64) so that byte sized index doesn't fit to them. In this case, we will use two bytes sized index. Below is some number for this patch. * Before * kmalloc-512 525 640 512 8 1 : tunables 54 27 0 : slabdata 80 80 0 kmalloc-256 210 210 256 15 1 : tunables 120 60 0 : slabdata 14 14 0 kmalloc-192 1016 1040 192 20 1 : tunables 120 60 0 : slabdata 52 52 0 kmalloc-96 560 620 128 31 1 : tunables 120 60 0 : slabdata 20 20 0 kmalloc-64 2148 2280 64 60 1 : tunables 120 60 0 : slabdata 38 38 0 kmalloc-128 647 682 128 31 1 : tunables 120 60 0 : slabdata 22 22 0 kmalloc-32 11360 11413 32 113 1 : tunables 120 60 0 : slabdata 101 101 0 kmem_cache 197 200 192 20 1 : tunables 120 60 0 : slabdata 10 10 0 * After * kmalloc-512 521 648 512 8 1 : tunables 54 27 0 : slabdata 81 81 0 kmalloc-256 208 208 256 16 1 : tunables 120 60 0 : slabdata 13 13 0 kmalloc-192 1029 1029 192 21 1 : tunables 120 60 0 : slabdata 49 49 0 kmalloc-96 529 589 128 31 1 : tunables 120 60 0 : slabdata 19 19 0 kmalloc-64 2142 2142 64 63 1 : tunables 120 60 0 : slabdata 34 34 0 kmalloc-128 660 682 128 31 1 : tunables 120 60 0 : slabdata 22 22 0 kmalloc-32 11716 11780 32 124 1 : tunables 120 60 0 : slabdata 95 95 0 kmem_cache 197 210 192 21 1 : tunables 120 60 0 : slabdata 10 10 0 kmem_caches consisting of objects less than or equal to 256 byte have one or more objects than before. In the case of kmalloc-32, we have 11 more objects, so 352 bytes (11 * 32) are saved and this is roughly 9% saving of memory. Of couse, this percentage decreases as the number of objects in a slab decreases. Here are the performance results on my 4 cpus machine. * Before * Performance counter stats for 'perf bench sched messaging -g 50 -l 1000' (10 runs): 229,945,138 cache-misses ( +- 0.23% ) 11.627897174 seconds time elapsed ( +- 0.14% ) * After * Performance counter stats for 'perf bench sched messaging -g 50 -l 1000' (10 runs): 218,640,472 cache-misses ( +- 0.42% ) 11.504999837 seconds time elapsed ( +- 0.21% ) cache-misses are reduced by this patchset, roughly 5%. And elapsed times are improved by 1%. Acked-by: Christoph Lameter <cl@linux.com> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
| * slab: restrict the number of objects in a slabJoonsoo Kim2014-02-082-0/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To prepare to implement byte sized index for managing the freelist of a slab, we should restrict the number of objects in a slab to be less or equal to 256, since byte only represent 256 different values. Setting the size of object to value equal or more than newly introduced SLAB_OBJ_MIN_SIZE ensures that the number of objects in a slab is less or equal to 256 for a slab with 1 page. If page size is rather larger than 4096, above assumption would be wrong. In this case, we would fall back on 2 bytes sized index. If minimum size of kmalloc is less than 16, we use it as minimum object size and give up this optimization. Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
| * slab: introduce helper functions to get/set free objectJoonsoo Kim2014-02-081-7/+13
| | | | | | | | | | | | | | | | | | | | In the following patches, to get/set free objects from the freelist is changed so that simple casting doesn't work for it. Therefore, introduce helper functions. Acked-by: Christoph Lameter <cl@linux.com> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
| * slab: factor out calculate nr objects in cache_estimateJoonsoo Kim2014-02-081-21/+27
| | | | | | | | | | | | | | | | | | | | | | This logic is not simple to understand so that making separate function helping readability. Additionally, we can use this change in the following patch which implement for freelist to have another sized index in according to nr objects. Acked-by: Christoph Lameter <cl@linux.com> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
* | Merge branch 'misc' of ↵Linus Torvalds2014-04-136-120/+196
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild Pull misc kbuild changes from Michal Marek: "Here is the non-critical part of kbuild: - One bogus coccinelle check removed, one check fixed not to suggest the obsolete PTR_RET macro - scripts/tags.sh does not index the generated *.mod.c files - new objdiff tool to list differences between two versions of an object file - A fix for scripts/bootgraph.pl" * 'misc' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild: scripts/coccinelle: Use PTR_ERR_OR_ZERO scripts/bootgraph.pl: Add graphic header scripts: objdiff: detect object code changes between two commits Coccicheck: Remove memcpy to struct assignment test scripts/tags.sh: Ignore *.mod.c
| * | scripts/coccinelle: Use PTR_ERR_OR_ZEROSachin Kamat2014-04-081-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | PTR_RET is deprecated. Do not recommend its usage anymore. Use PTR_ERR_OR_ZERO instead. Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org> Signed-off-by: Michal Marek <mmarek@suse.cz>
| * | scripts/bootgraph.pl: Add graphic headerFabian Frederick2014-04-081-2/+40
| | | | | | | | | | | | | | | | | | | | | Adding -header + help function like other .pl in /scripts. Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Michal Marek <mmarek@suse.cz>
| * | scripts: objdiff: detect object code changes between two commitsJason Cooper2014-04-082-1/+142
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | objdiff is useful when doing large code cleanups. For example, when removing checkpatch warnings and errors from new drivers in the staging tree. objdiff can be used in conjunction with a git rebase to confirm that each commit made no changes to the resulting object code. It has the same return values as diff(1). This was written specifically to support adding the skein and threefish cryto drivers to the staging tree. I needed a programmatic way to confirm that commits changing >90% of the lines didn't inadvertently change the code. Temporary files (objdump output) are stored in /path/to/linux/.tmp_objdiff 'make mrproper' will remove this directory. Signed-off-by: Jason Cooper <jason@lakedaemon.net> Signed-off-by: Michal Marek <mmarek@suse.cz>
| * | Coccicheck: Remove memcpy to struct assignment testPeter Senna Tschudin2014-03-291-103/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The Coccinelle script scripts/coccinelle/misc/memcpy-assign.cocci look for opportunities to replace a call to memcpy by a struct assignment. This patch removes memcpy-assign.cocci as it is not clear that this convention has an impact on the generated code. Signed-off-by: Peter Senna Tschudin <peter.senna@gmail.com> Signed-off-by: Michal Marek <mmarek@suse.cz>
| * | scripts/tags.sh: Ignore *.mod.cPrarit Bhargava2014-02-062-7/+7
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | CONFIG_MODVERSIONS=y results in a .mod.c for every compiled file in the kernel. Issuing a 'make cscope' on a compiled kernel tree results in the cscope files containing *.mod.c files. [prarit@prarit linux]# make cscope [prarit@prarit linux]# cat cscope.files | grep mod.c | wc -l 4807 These files are not useful for cscope and should be ignored. For example, # line filename / context / line 1 105 arch/x86/kvm/kvm-intel.mod.c <<GLOBAL>> { 0x618911fc, __VMLINUX_SYMBOL_STR(numa_node) }, 2 508 drivers/block/mtip32xx/mtip32xx.h <<GLOBAL>> int numa_node; 3 55 drivers/block/mtip32xx/mtip32xx.mod.c <<GLOBAL>> { 0x618911fc, __VMLINUX_SYMBOL_STR(numa_node) }, 4 37 drivers/cpufreq/acpi-cpufreq.mod.c <<GLOBAL>> { 0x618911fc, __VMLINUX_SYMBOL_STR(numa_node) }, <snip> Add an export to RCS_FIND_IGNORE so it can be used in scripts/tags.sh and add explicitly ignore *.mod.c files. Signed-off-by: Prarit Bhargava <prarit@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Kirill Tkhai <tkhai@yandex.ru> Cc: Michael Opdenacker <michael.opdenacker@free-electrons.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Michal Marek <mmarek@suse.cz>
* | sym53c8xx_2: Set DID_REQUEUE return code when aborting squeueMikulas Patocka2014-04-131-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes I/O errors with the sym53c8xx_2 driver when the disk returns QUEUE FULL status. When the controller encounters an error (including QUEUE FULL or BUSY status), it aborts all not yet submitted requests in the function sym_dequeue_from_squeue. This function aborts them with DID_SOFT_ERROR. If the disk has full tag queue, the request that caused the overflow is aborted with QUEUE FULL status (and the scsi midlayer properly retries it until it is accepted by the disk), but the sym53c8xx_2 driver aborts the following requests with DID_SOFT_ERROR --- for them, the midlayer does just a few retries and then signals the error up to sd. The result is that disk returning QUEUE FULL causes request failures. The error was reproduced on 53c895 with COMPAQ BD03685A24 disk (rebranded ST336607LC) with command queue 48 or 64 tags. The disk has 64 tags, but under some access patterns it return QUEUE FULL when there are less than 64 pending tags. The SCSI specification allows returning QUEUE FULL anytime and it is up to the host to retry. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: Matthew Wilcox <matthew@wil.cx> Cc: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | powerpc: Don't try to set LPCR unless we're in hypervisor modePaul Mackerras2014-04-131-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 8f619b5429d9 ("powerpc/ppc64: Do not turn AIL (reloc-on interrupts) too early") added code to set the AIL bit in the LPCR without checking whether the kernel is running in hypervisor mode. The result is that when the kernel is running as a guest (i.e., under PowerKVM or PowerVM), the processor takes a privileged instruction interrupt at that point, causing a panic. The visible result is that the kernel hangs after printing "returning from prom_init". This fixes it by checking for hypervisor mode being available before setting LPCR. If we are not in hypervisor mode, we enable relocation-on interrupts later in pSeries_setup_arch using the H_SET_MODE hcall. Signed-off-by: Paul Mackerras <paulus@samba.org> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | futex: update documentation for ordering guaranteesDavidlohr Bueso2014-04-131-9/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Commits 11d4616bd07f ("futex: revert back to the explicit waiter counting code") and 69cd9eba3886 ("futex: avoid race between requeue and wake") changed some of the finer details of how we think about futexes. One was a late fix and the other a consequence of overlooking the whole requeuing logic. The first change caused our documentation to be incorrect, and the second made us aware that we need to explicitly add more details to it. Signed-off-by: Davidlohr Bueso <davidlohr@hp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds2014-04-13102-433/+420
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull yet more networking updates from David Miller: 1) Various fixes to the new Redpine Signals wireless driver, from Fariya Fatima. 2) L2TP PPP connect code takes PMTU from the wrong socket, fix from Dmitry Petukhov. 3) UFO and TSO packets differ in whether they include the protocol header in gso_size, account for that in skb_gso_transport_seglen(). From Florian Westphal. 4) If VLAN untagging fails, we double free the SKB in the bridging output path. From Toshiaki Makita. 5) Several call sites of sk->sk_data_ready() were referencing an SKB just added to the socket receive queue in order to calculate the second argument via skb->len. This is dangerous because the moment the skb is added to the receive queue it can be consumed in another context and freed up. It turns out also that none of the sk->sk_data_ready() implementations even care about this second argument. So just kill it off and thus fix all these use-after-free bugs as a side effect. 6) Fix inverted test in tcp_v6_send_response(), from Lorenzo Colitti. 7) pktgen needs to do locking properly for LLTX devices, from Daniel Borkmann. 8) xen-netfront driver initializes TX array entries in RX loop :-) From Vincenzo Maffione. 9) After refactoring, some tunnel drivers allow a tunnel to be configured on top itself. Fix from Nicolas Dichtel. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (46 commits) vti: don't allow to add the same tunnel twice gre: don't allow to add the same tunnel twice drivers: net: xen-netfront: fix array initialization bug pktgen: be friendly to LLTX devices r8152: check RTL8152_UNPLUG net: sun4i-emac: add promiscuous support net/apne: replace IS_ERR and PTR_ERR with PTR_ERR_OR_ZERO net: ipv6: Fix oif in TCP SYN+ACK route lookup. drivers: net: cpsw: enable interrupts after napi enable and clearing previous interrupts drivers: net: cpsw: discard all packets received when interface is down net: Fix use after free by removing length arg from sk_data_ready callbacks. Drivers: net: hyperv: Address UDP checksum issues Drivers: net: hyperv: Negotiate suitable ndis version for offload support Drivers: net: hyperv: Allocate memory for all possible per-pecket information bridge: Fix double free and memory leak around br_allowed_ingress bonding: Remove debug_fs files when module init fails i40evf: program RSS LUT correctly i40evf: remove open-coded skb_cow_head ixgb: remove open-coded skb_cow_head igbvf: remove open-coded skb_cow_head ...
| * \ Merge branch 'tunnels'David S. Miller2014-04-122-2/+2
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Nicolas Dichtel says: ==================== tunnels: don't allow to add the same tunnel twice This series fixes the check of an existing tunnel with the same parameters when a new tunnel is added. I've checked all users of ip_tunnel_newlink(): gre, gretap, ipip and vti. The bug exists only for gre and vti. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | vti: don't allow to add the same tunnel twiceNicolas Dichtel2014-04-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before the patch, it was possible to add two times the same tunnel: ip l a vti1 type vti remote 10.16.0.121 local 10.16.0.249 key 41 ip l a vti2 type vti remote 10.16.0.121 local 10.16.0.249 key 41 It was possible, because ip_tunnel_newlink() calls ip_tunnel_find() with the argument dev->type, which was set only later (when calling ndo_init handler in register_netdevice()). Let's set this type in the setup handler, which is called before newlink handler. Introduced by commit b9959fd3b0fa ("vti: switch to new ip tunnel code"). CC: Cong Wang <amwang@redhat.com> CC: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | gre: don't allow to add the same tunnel twiceNicolas Dichtel2014-04-121-1/+1
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before the patch, it was possible to add two times the same tunnel: ip l a gre1 type gre remote 10.16.0.121 local 10.16.0.249 ip l a gre2 type gre remote 10.16.0.121 local 10.16.0.249 It was possible, because ip_tunnel_newlink() calls ip_tunnel_find() with the argument dev->type, which was set only later (when calling ndo_init handler in register_netdevice()). Let's set this type in the setup handler, which is called before newlink handler. Introduced by commit c54419321455 ("GRE: Refactor GRE tunneling code."). CC: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | drivers: net: xen-netfront: fix array initialization bugVincenzo Maffione2014-04-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes the initialization of an array used in the TX datapath that was mistakenly initialized together with the RX datapath arrays. An out of range array access could happen when RX and TX rings had different sizes. Signed-off-by: Vincenzo Maffione <v.maffione@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | Merge branch 'master' of ↵David S. Miller2014-04-1214-203/+91
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates This series contains updates to e1000, e1000e, igb, igbvf, ixgb, ixgbe, ixgbevf and i40evf. Mark fixes an issue with ixgbe and ixgbevf by adding a bit to indicate when workqueues have been initialized. This permits the register read error handling from attempting to use them prior to that, which also generates warnings. Checking for a detected removal after initializing the work queues allows the probe function to return an error without getting the workqueue involved. Further, if the error_detected callback is entered before the workqueues are initialized, exit without recovery since the device initialization was so truncated. Francois Romieu provides several patches to all the drivers to remove the open coded skb_cow_head. Jakub Kicinski provides a fix for igb where last_rx_timestamp should be updated only when Rx time stamp is read. Mitch provides a fix for i40evf where a recent change broke the RSS LUT programming causing it to be programmed with all 0's. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | i40evf: program RSS LUT correctlyMitch A Williams2014-04-111-4/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A recent change broke the RSS LUT programming, causing it to be programmed with all 0. Correct this by actually assigning the incremented value back to the counter variable so that the increment will be remembered by the calling function. While we're at it, add a proper kernel-doc function comment to our helper function. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| | * | i40evf: remove open-coded skb_cow_headFrancois Romieu2014-04-111-6/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Francois Romieu <romieu@fr.zoreil.com> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Sibai Li <sibai.li@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| | * | ixgb: remove open-coded skb_cow_headFrancois Romieu2014-04-111-6/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Francois Romieu <romieu@fr.zoreil.com> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| | * | igbvf: remove open-coded skb_cow_headFrancois Romieu2014-04-111-9/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Francois Romieu <romieu@fr.zoreil.com> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| | * | igb: fix last_rx_timestamp usageJakub Kicinski2014-04-113-23/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | last_rx_timestamp should be updated only when rx time stamp is read. Also it's only used with NICs that have per-interface time stamping resources so it can be moved to adapter structure and set in igb_ptp_rx_rgtstamp(). Signed-off-by: Jakub Kicinski <kubakici@wp.pl> Acked-by: Matthew Vick <matthew.vick@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>