| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Modify the s390 copy_oldmem_page() and remap_oldmem_pfn_range() function
for zfcpdump to read from the HSA memory if memory below HSA_SIZE bytes is
requested. Otherwise real memory is used.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Cc: Jan Willeke <willeke@de.ibm.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Introduce the s390 specific way to map pages from oldmem. The memory area
below OLDMEM_SIZE is mapped with offset OLDMEM_BASE. The other old memory
is mapped directly.
Signed-off-by: Jan Willeke <willeke@de.ibm.com>
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Exchange the old relocate mechanism with the new arch function call
override mechanism that allows to create the ELF core header in the 2nd
kernel.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Cc: Jan Willeke <willeke@de.ibm.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With the general-instruction extension facility (z10) a couple of
instructions with a pc-relative long displacement were introduced. The
kprobes support for these instructions however was never implemented.
In result, if anybody ever put a probe on any of these instructions the
result would have been random behaviour after the instruction got executed
within the insn slot.
So lets add the missing handling for these instructions. Since all of the
new instructions have 32 bit signed displacement the easiest solution is
to allocate an insn slot that is within the same 2GB area like the
original instruction and patch the displacement field.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I found the following pattern that leads in to interesting findings:
grep -r "ret.*|=.*__put_user" *
grep -r "ret.*|=.*__get_user" *
grep -r "ret.*|=.*__copy" *
The __put_user() calls in compat_ioctl.c, ptrace compat, signal compat,
since those appear in compat code, we could probably expect the kernel
addresses not to be reachable in the lower 32-bit range, so I think they
might not be exploitable.
For the "__get_user" cases, I don't think those are exploitable: the worse
that can happen is that the kernel will copy kernel memory into in-kernel
buffers, and will fail immediately afterward.
The alpha csum_partial_copy_from_user() seems to be missing the
access_ok() check entirely. The fix is inspired from x86. This could
lead to information leak on alpha. I also noticed that many architectures
map csum_partial_copy_from_user() to csum_partial_copy_generic(), but I
wonder if the latter is performing the access checks on every
architectures.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
_PAGE_SOFT_DIRTY bit should never be set on present pte so add VM_BUG_ON
to catch any potential future abuse.
Also add a comment on _PAGE_SWP_SOFT_DIRTY definition explaining scope of
its usage.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently hugepage migration works well only for pmd-based hugepages
(mainly due to lack of testing,) so we had better not enable migration of
other levels of hugepages until we are ready for it.
Some users of hugepage migration (mbind, move_pages, and migrate_pages) do
page table walk and check pud/pmd_huge() there, so they are safe. But the
other users (softoffline and memory hotremove) don't do this, so without
this patch they can try to migrate unexpected types of hugepages.
To prevent this, we introduce hugepage_migration_support() as an
architecture dependent check of whether hugepage are implemented on a pmd
basis or not. And on some architecture multiple sizes of hugepages are
available, so hugepage_migration_support() also checks hugepage size.
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Hillf Danton <dhillf@gmail.com>
Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Rik van Riel <riel@redhat.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The previous patch doing vmstats for TLB flushes ("mm: vmstats: tlb flush
counters") effectively missed UP since arch/x86/mm/tlb.c is only compiled
for SMP.
UP systems do not do remote TLB flushes, so compile those counters out on
UP.
arch/x86/kernel/cpu/mtrr/generic.c calls __flush_tlb() directly. This is
probably an optimization since both the mtrr code and __flush_tlb() write
cr4. It would probably be safe to make that a flush_tlb_all() (and then
get these statistics), but the mtrr code is ancient and I'm hesitant to
touch it other than to just stick in the counters.
[akpm@linux-foundation.org: tweak comments]
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I was investigating some TLB flush scaling issues and realized that we do
not have any good methods for figuring out how many TLB flushes we are
doing.
It would be nice to be able to do these in generic code, but the
arch-independent calls don't explicitly specify whether we actually need
to do remote flushes or not. In the end, we really need to know if we
actually _did_ global vs. local invalidations, so that leaves us with few
options other than to muck with the counters from arch-specific code.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull more s390 updates from Heiko Carstens:
"This includes one bpf/jit bug fix where the jit compiler could
sometimes write generated code out of bounds of the allocated memory
area.
The rest of the patches are only cleanups and minor improvements"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/irq: reduce size of external interrupt handler hash array
s390/compat,uid16: use current_cred()
s390/ap_bus: use and-mask instead of a cast
s390/ftrace: avoid pointer arithmetics with function pointers
s390: make various functions static, add declarations to header files
s390/compat signal: add couple of __force annotations
s390/mm: add __releases()/__acquires() annotations to gmap_alloc_table()
s390: keep Kconfig sorted
s390/irq: rework irq subclass handling
s390/irq: use hlists for external interrupt handler array
s390/dumpstack: convert print_symbol to %pSR
s390/perf: Remove print_hex_dump_bytes() debug output
s390: update defconfig
s390/bpf,jit: fix address randomization
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Change the hash algorithm a bit so it produces only values in the
range of 0..31.
This allows to reduce the size of the external interrupt handler hash
array even further while making sure that each of the known interrupt
sources keeps its unique hash with the slightly modified algorithm:
0x1004 --> 12
0x1201 --> 10
0x1202 --> 11
0x1406 --> 16
0x1407 --> 17
0x2401 --> 19
0x2603 --> 22
0x4000 --> 0
This also means that the entire array now fits into exactly one cache
line; so add a proper align statement as well.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
86a264ab "CRED: Wrap current->cred and a few other accessors" converted
all uses of current->cred into current_cred() but left s390 alone.
So let's convert s390 finally as well, only five years later.
This way we also get rid of a sparse warning which complains about a
possible invalid rcu dereference which however is a false positive.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
|
| |
| |
| |
| |
| |
| |
| |
| | |
Pointer arithmetics with function pointers is not really defined, but
seems to do the right thing. Let's cast to a void pointer to have a
defined behaviour, at least when using gcc.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
|
| |
| |
| |
| |
| |
| |
| | |
Make various functions static, add declarations to header files to
fix a couple of sparse findings.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Add __force annotations to get rid of a couple of sparse warnings:
arch/s390/kernel/compat_signal.c:335:35:
warning: cast removes address space of expression
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
|
| |
| |
| |
| |
| |
| | |
Let sparse not incorrectly complain about unbalanced locking.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
|
| |
| |
| |
| | |
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Let's not add a function for every external interrupt subclass for
which we need reference counting. Just have two register/unregister
functions which have a subclass parameter:
void irq_subclass_register(enum irq_subclass subclass);
void irq_subclass_unregister(enum irq_subclass subclass);
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
|
| |
| |
| |
| |
| |
| |
| | |
Use hlists for the hashed array of external interrupt handlers.
Reduces the size of the array by 50% (2KB).
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
|
| |
| |
| |
| |
| |
| |
| |
| | |
This is the same as what other architectures did.
The change has also the advantage that there won't be any interleaving
messages between printk() and print_symbol().
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
|
| |
| |
| |
| |
| | |
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
|
| |
| |
| |
| | |
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Add misssing braces to hole calculation. This resulted in an addition
instead of an substraction. Which in turn means that the jit compiler
could try to write out of bounds of the allocated piece of memory.
This bug was introduced with aa2d2c73 "s390/bpf,jit: address randomize
and write protect jit code".
Fixes this one:
[ 37.320956] Unable to handle kernel pointer dereference at virtual kernel address 000003ff80231000
[ 37.320984] Oops: 0011 [#1] PREEMPT SMP DEBUG_PAGEALLOC
[ 37.320993] Modules linked in: dm_multipath scsi_dh eadm_sch dm_mod ctcm fsm autofs4
[ 37.321007] CPU: 28 PID: 6443 Comm: multipathd Not tainted 3.10.9-61.x.20130829-s390xdefault #1
[ 37.321011] task: 0000004ada778000 ti: 0000004ae3304000 task.ti: 0000004ae3304000
[ 37.321014] Krnl PSW : 0704c00180000000 000000000012d1de (bpf_jit_compile+0x198e/0x23d0)
[ 37.321022] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 EA:3
Krnl GPRS: 000000004350207d 0000004a00000001 0000000000000007 000003ff80231002
[ 37.321029] 0000000000000007 000003ff80230ffe 00000000a7740000 000003ff80230f76
[ 37.321032] 000003ffffffffff 000003ff00000000 000003ff0000007d 000000000071e820
[ 37.321035] 0000004adbe99950 000000000071ea18 0000004af3d9e7c0 0000004ae3307b80
[ 37.321046] Krnl Code: 000000000012d1d0: 41305004 la %r3,4(%r5)
000000000012d1d4: e330f0f80021 clg %r3,248(%r15)
#000000000012d1da: a7240009 brc 2,12d1ec
>000000000012d1de: 50805000 st %r8,0(%r5)
000000000012d1e2: e330f0f00004 lg %r3,240(%r15)
000000000012d1e8: 41303004 la %r3,4(%r3)
000000000012d1ec: e380f0e00004 lg %r8,224(%r15)
000000000012d1f2: e330f0f00024 stg %r3,240(%r15)
[ 37.321074] Call Trace:
[ 37.321077] ([<000000000012da78>] bpf_jit_compile+0x2228/0x23d0)
[ 37.321083] [<00000000006007c2>] sk_attach_filter+0xfe/0x214
[ 37.321090] [<00000000005d2d92>] sock_setsockopt+0x926/0xbdc
[ 37.321097] [<00000000005cbfb6>] SyS_setsockopt+0x8a/0xe8
[ 37.321101] [<00000000005ccaa8>] SyS_socketcall+0x264/0x364
[ 37.321106] [<0000000000713f1c>] sysc_nr_ok+0x22/0x28
[ 37.321113] [<000003fffce10ea8>] 0x3fffce10ea8
[ 37.321118] INFO: lockdep is turned off.
[ 37.321121] Last Breaking-Event-Address:
[ 37.321124] [<000000000012d192>] bpf_jit_compile+0x1942/0x23d0
[ 37.321132]
[ 37.321135] Kernel panic - not syncing: Fatal exception: panic_on_oops
Cc: stable@vger.kernel.org # v3.11
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
|
|\ \
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Pull powerpc fixes from Ben Herrenschmidt:
"Here are a handful of small powerpc fixes.
A couple of section mismatches (always worth fixing), a missing export
of a new symbol causing build failures of modules, a page fault
deadlock fix (interestingly that bug has been around for a LONG time,
though it seems to be more easily triggered by KVM) and fixing pseries
default idle loop in the absence of the cpuidle drivers (such as
during boot)"
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc: Default arch idle could cede processor on pseries
fbdev/ps3fb: Fix section mismatch warning for ps3fb_probe
powerpc: Fix section mismatch warning for prom_rtas_call
powerpc: Fix possible deadlock on page fault
powerpc: Export cpu_to_chip_id() to fix build error
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
When adding cpuidle support to pSeries, we introduced two
regressions:
- The new cpuidle backend driver only works under hypervisors
supporting the "SLPLAR" option, which isn't the case of the
old POWER4 hypervisor and the HV "light" used on js2x blades
- The cpuidle driver registers fairly late, meaning that for
a significant portion of the boot process, we end up having
all threads spinning. This slows down the boot process and
increases the overall resource usage if the hypervisor has
shared processors.
This fixes both by implementing a "default" idle that will cede
to the hypervisor when possible, in a very simple way without
all the bells and whisles of cpuidle.
Reported-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Acked-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: <stable@vger.kernel.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
While cross-building for PPC64 I've got
WARNING: vmlinux.o(.text.unlikely+0x1ba): Section mismatch in
reference from the function .prom_rtas_call() to the variable
.init.data:dt_string_start The function .prom_rtas_call() references
the variable __initdata dt_string_start. This is often because
.prom_rtas_call lacks a __initdata annotation or the annotation of
dt_string_start is wrong.
WARNING: vmlinux.o(.meminit.text+0xeb0): Section mismatch in reference
from the function .free_area_init_core.isra.47() to the function
.init.text:.set_pageblock_order() The function __meminit
.free_area_init_core.isra.47() references a function __init
.set_pageblock_order(). If .set_pageblock_order is only used by
.free_area_init_core.isra.47 then annotate .set_pageblock_order with a
matching annotation.
Fix it by proper annotation of prom_rtas_call.
Signed-off-by: Vladimir Murzin <murzin.v@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
stack_grow_into/14082 is trying to acquire lock:
(&mm->mmap_sem){++++++}, at: [<c000000000206d28>] .might_fault+0x78/0xe0
but task is already holding lock:
(&mm->mmap_sem){++++++}, at: [<c0000000007ffd8c>] .do_page_fault+0x24c/0x910
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&mm->mmap_sem);
lock(&mm->mmap_sem);
*** DEADLOCK ***
May be due to missing lock nesting notation
1 lock held by stack_grow_into/14082:
#0: (&mm->mmap_sem){++++++}, at: [<c0000000007ffd8c>] .do_page_fault+0x24c/0x910
stack backtrace:
CPU: 21 PID: 14082 Comm: stack_grow_into Not tainted 3.10.0-10.el7.ppc64.debug #1
Call Trace:
[c0000003d396b850] [c000000000016e7c] .show_stack+0x7c/0x1f0 (unreliable)
[c0000003d396b920] [c000000000813fc8] .dump_stack+0x28/0x3c
[c0000003d396b990] [c000000000124b90] .__lock_acquire+0x1640/0x1800
[c0000003d396bab0] [c00000000012570c] .lock_acquire+0xac/0x250
[c0000003d396bb80] [c000000000206d54] .might_fault+0xa4/0xe0
[c0000003d396bbf0] [c0000000007ffe2c] .do_page_fault+0x2ec/0x910
[c0000003d396be30] [c0000000000092e8] handle_page_fault+0x10/0x30
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
powerpc allmodconfig build fails with:
ERROR: ".cpu_to_chip_id" [drivers/block/mtip32xx/mtip32xx.ko] undefined!
The problem was introduced with commit 15863ff3b (powerpc: Make chip-id
information available to userspace).
Export the missing symbol.
Cc: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Cc: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|\ \ \
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull Xen bug-fixes from Konrad Rzeszutek Wilk:
"This pull I usually do after rc1 is out but because we have a nice
amount of fixes, some bootup related fixes for ARM, and it is early in
the cycle we figured to do it now to help with tracking of potential
regressions.
The simple ones are the ARM ones - one of the patches fell through the
cracks, other fixes a bootup issue (unconditionally using Xen
functions). Then a fix for a regression causing preempt count being
off (patch causing this went in v3.12).
Lastly are the fixes to make Xen PVHVM guests use PV ticketlocks (Xen
PV already does).
The enablement of that was supposed to be part of the x86 spinlock
merge in commit 816434ec4a67 ("The biggest change here are
paravirtualized ticket spinlocks (PV spinlocks), which bring a nice
speedup on various benchmarks...") but unfortunatly it would cause
hang when booting Xen PVHVM guests. Yours truly got all of the bugs
fixed last week and they (six of them) are included in this pull.
Bug-fixes:
- Boot on ARM without using Xen unconditionally
- On Xen ARM don't run cpuidle/cpufreq
- Fix regression in balloon driver, preempt count warnings
- Fixes to make PVHVM able to use pv ticketlock.
- Revert Xen PVHVM disabling pv ticketlock (aka, re-enable pv ticketlocks)"
* tag 'stable/for-linus-3.12-rc0-tag-two' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/spinlock: Don't use __initdate for xen_pv_spin
Revert "xen/spinlock: Disable IRQ spinlock (PV) allocation on PVHVM"
xen/spinlock: Don't setup xen spinlock IPI kicker if disabled.
xen/smp: Update pv_lock_ops functions before alternative code starts under PVHVM
xen/spinlock: We don't need the old structure anymore
xen/spinlock: Fix locking path engaging too soon under PVHVM.
xen/arm: disable cpuidle and cpufreq when linux is running as dom0
xen/p2m: Don't call get_balloon_scratch_page() twice, keep interrupts disabled for multicalls
ARM: xen: only set pm function ptrs for Xen guests
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
As we get compile warnings about .init.data being
used by non-init functions.
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
This reverts commit 70dd4998cb85f0ecd6ac892cc7232abefa432efb.
Now that the bugs have been resolved we can re-enable the
PV ticketlock implementation under PVHVM Xen guests.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
There is no need to setup this kicker IPI if we are never going
to use the paravirtualized ticketlock mechanism.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Before this patch we would patch all of the pv_lock_ops sites
using alternative assembler. Then later in the bootup cycle
change the unlock_kick and lock_spinning to the Xen specific -
without re patching.
That meant that for the core of the kernel we would be running
with the baremetal version of unlock_kick and lock_spinning while
for modules we would have the proper Xen specific slowpaths.
As most of the module uses some API from the core kernel that ended
up with slowpath lockers waiting forever to be kicked (b/c they
would be using the Xen specific slowpath logic). And the
kick never came b/c the unlock path that was taken was the
baremetal one.
On PV we do not have the problem as we initialise before the
alternative code kicks in.
The fix is to make the updating of the pv_lock_ops function
be done before the alternative code starts patching.
Note that this patch fixes issues discovered by commit
f10cd522c5fbfec9ae3cc01967868c9c2401ed23.
("xen: disable PV spinlocks on HVM") wherein it mentioned
PV spinlocks cannot possibly work with the current code because they are
enabled after pvops patching has already been done, and because PV
spinlocks use a different data structure than native spinlocks so we
cannot switch between them dynamically.
The first problem is solved by this patch.
The second problem has been solved by commit
816434ec4a674fcdb3c2221a6dffdc8f34020550
(Merge branch 'x86-spinlocks-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip)
P.S.
There is still the commit 70dd4998cb85f0ecd6ac892cc7232abefa432efb
(xen/spinlock: Disable IRQ spinlock (PV) allocation on PVHVM) to
revert but that can be done later after all other bugs have been
fixed.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
As we are using the generic ticketlock structs and these
old structures are not needed anymore.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The xen_lock_spinning has a check for the kicker interrupts
and if it is not initialized it will spin normally (not enter
the slowpath).
But for PVHVM case we would initialize the kicker interrupt
before the CPU came online. This meant that if the booting
CPU used a spinlock and went in the slowpath - it would
enter the slowpath and block forever. The forever part because
during bootup: the spinlock would be taken _before_ the CPU
sets itself to be online (more on this further), and we enter
to poll on the event channel forever.
The bootup CPU (see commit fc78d343fa74514f6fd117b5ef4cd27e4ac30236
"xen/smp: initialize IPI vectors before marking CPU online"
for details) and the CPU that started the bootup consult
the cpu_online_mask to determine whether the booting CPU should
get an IPI. The booting CPU has to set itself in this mask via:
set_cpu_online(smp_processor_id(), true);
However, if the spinlock is taken before this (and it is) and
it polls on an event channel - it will never be woken up as
the kernel will never send an IPI to an offline CPU.
Note that the PVHVM logic in sending IPIs is using the HVM
path which has numerous checks using the cpu_online_mask
and cpu_active_mask. See above mention git commit for details.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
|
| |\ \ \
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Linux 3.11-rc7
As we need the git commit 28817e9de4f039a1a8c1fe1df2fa2df524626b9e
Author: Chuck Anderson <chuck.anderson@oracle.com>
Date: Tue Aug 6 15:12:19 2013 -0700
xen/smp: initialize IPI vectors before marking CPU online
* tag 'v3.11-rc7': (443 commits)
Linux 3.11-rc7
ARC: [lib] strchr breakage in Big-endian configuration
VFS: collect_mounts() should return an ERR_PTR
bfs: iget_locked() doesn't return an ERR_PTR
efs: iget_locked() doesn't return an ERR_PTR()
proc: kill the extra proc_readfd_common()->dir_emit_dots()
cope with potentially long ->d_dname() output for shmem/hugetlb
usb: phy: fix build breakage
USB: OHCI: add missing PCI PM callbacks to ohci-pci.c
staging: comedi: bug-fix NULL pointer dereference on failed attach
lib/lz4: correct the LZ4 license
memcg: get rid of swapaccount leftovers
nilfs2: fix issue with counting number of bio requests for BIO_EOPNOTSUPP error detection
nilfs2: remove double bio_put() in nilfs_end_bio_write() for BIO_EOPNOTSUPP error
drivers/platform/olpc/olpc-ec.c: initialise earlier
ipv4: expose IPV4_DEVCONF
ipv6: handle Redirect ICMP Message with no Redirected Header option
be2net: fix disabling TX in be_close()
Revert "ACPI / video: Always call acpi_video_init_brightness() on init"
Revert "genetlink: fix family dump race"
...
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| |\ \ \ \
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into stable/for-linus-3.12
* 'x86/spinlocks' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/kvm/guest: Fix sparse warning: "symbol 'klock_waiting' was not declared as static"
kvm: Paravirtual ticketlocks support for linux guests running on KVM hypervisor
kvm guest: Add configuration support to enable debug information for KVM Guests
kvm uapi: Add KICK_CPU and PV_UNHALT definition to uapi
xen, pvticketlock: Allow interrupts to be enabled while blocking
x86, ticketlock: Add slowpath logic
jump_label: Split jumplabel ratelimit
x86, pvticketlock: When paravirtualizing ticket locks, increment by 2
x86, pvticketlock: Use callee-save for lock_spinning
xen, pvticketlocks: Add xen_nopvspin parameter to disable xen pv ticketlocks
xen, pvticketlock: Xen implementation for PV ticket locks
xen: Defer spinlock setup until boot CPU setup
x86, ticketlock: Collapse a layer of functions
x86, ticketlock: Don't inline _spin_unlock when using paravirt spinlocks
x86, spinlock: Replace pv spinlocks with pv ticketlocks
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
When linux is running as dom0, Xen doesn't show the physical cpu but a
virtual CPU.
On some ARM SOC (for instance the exynos 5250), linux registers callbacks
for cpuidle and cpufreq. When these callbacks are called, they will modify
directly the physical cpu not the virtual one. It can impact the whole board
instead of only dom0.
Signed-off-by: Julien Grall <julien.grall@linaro.org>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
disabled for multicalls
m2p_remove_override() calls get_balloon_scratch_page() in
MULTI_update_va_mapping() even though it already has pointer to this page from
the earlier call (in scratch_page). This second call doesn't have a matching
put_balloon_scratch_page() thus not restoring preempt count back. (Also, there
is no put_balloon_scratch_page() in the error path.)
In addition, the second multicall uses __xen_mc_entry() which does not disable
interrupts. Rearrange xen_mc_* calls to keep interrupts off while performing
multicalls.
This commit fixes a regression introduced by:
commit ee0726407feaf504dff304fb603652fb2d778b42
Author: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Date: Tue Jul 23 17:23:54 2013 +0000
xen/m2p: use GNTTABOP_unmap_and_replace to reinstate the original mapping
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
xen_pm_init was unconditionally setting pm_power_off and arm_pm_restart
function pointers. This breaks multi-platform kernels. Make this
conditional on running as a Xen guest and make it a late_initcall to
ensure it is setup after platform code for Dom0.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
CC: stable@vger.kernel.org
|
|\ \ \ \ \ \
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Pull drm fixes from Dave Airlie:
"Daniel had some fixes queued up, that were delayed, the stolen memory
ones and vga arbiter ones are quite useful, along with his usual bunch
of stuff, nothing for HSW outputs yet.
The one nouveau fix is for a regression I caused with the poweroff stuff"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (30 commits)
drm/nouveau: fix oops on runtime suspend/resume
drm/i915: Delay disabling of VGA memory until vgacon->fbcon handoff is done
drm/i915: try not to lose backlight CBLV precision
drm/i915: Confine page flips to BCS on Valleyview
drm/i915: Skip stolen region initialisation if none is reserved
drm/i915: fix gpu hang vs. flip stall deadlocks
drm/i915: Hold an object reference whilst we shrink it
drm/i915: fix i9xx_crtc_clock_get for multiplied pixels
drm/i915: handle sdvo input pixel multiplier correctly again
drm/i915: fix hpd work vs. flush_work in the pageflip code deadlock
drm/i915: fix up the relocate_entry refactoring
drm/i915: Fix pipe config warnings when dealing with LVDS fixed mode
drm/i915: Don't call sg_free_table() if sg_alloc_table() fails
i915: Update VGA arbiter support for newer devices
vgaarb: Fix VGA decodes changes
vgaarb: Don't disable resources that are not owned
drm/i915: Pin pages whilst mapping the dma-buf
drm/i915: enable trickle feed on Haswell
x86: add early quirk for reserving Intel graphics stolen memory v5
drm/i915: split PCI IDs out into i915_drm.h v4
...
|
| | |_|/ / /
| |/| | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Systems with Intel graphics controllers set aside memory exclusively for
gfx driver use. This memory is not always marked in the E820 as
reserved or as RAM, and so is subject to overlap from E820 manipulation
later in the boot process. On some systems, MMIO space is allocated on
top, despite the efforts of the "RAM buffer" approach, which simply
rounds memory boundaries up to 64M to try to catch space that may decode
as RAM and so is not suitable for MMIO.
v2: use read_pci_config for 32 bit reads instead of adding a new one
(Chris)
add gen6 stolen size function (Chris)
v3: use a function pointer (Chris)
drop gen2 bits (Daniel)
v4: call e820_sanitize_map after adding the region
v5: fixup comments (Peter)
simplify loop (Chris)
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66726
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66844
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|\ \ \ \ \ \
| |_|_|_|/ /
|/| | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 jumplabel changes from Peter Anvin:
"One more x86 tree for this merge window. This tree improves the
handling of jump labels, so that most of the time we don't have to do
a massive initial patching run.
Furthermore, we will error out of the jump label is not what is
expected, eg if it has been corrupted or tampered with"
* 'x86/jumplabel' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/jump-label: Show where and what was wrong on errors
x86/jump-label: Add safety checks to jump label conversions
x86/jump-label: Do not bother updating nops if they are correct
x86/jump-label: Use best default nops for inital jump label calls
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
When modifying text sections for jump labels, a paranoid check is
performed. If the check fails, the system "bugs". But why it failed
is not shown.
The BUG_ON()s in the jump label update code is replaced with bug_at(ip).
This is a function that will show what pointer failed, and what was
at the location of the failure that made jump label panic.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
As with all modifying of kernel text, we need to be very paranoid.
When converting the jump label locations to and from nops to jumps
a check has been added to make sure what we are replacing is what we
expect, otherwise we bug.
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jason Baron <jbaron@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
On boot up, the jump label init function scans all the jump label locations
and converts them to the best nop for the machine. If the nop is already
the ideal nop, do not bother with changing it.
Cc: Jason Baron <jbaron@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
As specified by H. Peter Anvin, the best nops for x86 without knowing
the running computer is:
32bit:
0x3e, 0x8d, 0x74, 0x26, 0x00 also known as GENERIC_NOP5_ATOMIC
64bit:
0x0f, 0x1f, 0x44, 0x00, 0x00 also known as P6_NOP5_ATOMIC
Currently the default nop that is used by jump label is:
0xe9 0x00 0x00 0x00 0x00
Which is really a 5byte jump to the next position.
It's better to use a real nop than a jmp.
Cc: H. Peter Anvin <hpa@linux.intel.com>
Cc: Jason Baron <jbaron@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
|\ \ \ \ \ \
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Pull CRIS updates from Jesper Nilsson:
"Mostly cleanup and removal of unused configs"
* tag 'cris-for-3.12' of git://jni.nu/cris:
CRIS: drop unused Kconfig symbols
CRIS: Add kvm_para.h which includes generic file
CRIS: remove unused current_regs
CRIS: Remove last traces of legacy RTC drivers
CRIS: remove "config OOM_REBOOT"
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Jesper Nilsson <jesper.nilsson@axis.com>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Copied from frv.
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Jesper Nilsson <jesper.nilsson@axis.com>
|