summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* modules: don't hand 0 to vmalloc.Rusty Russell2012-12-146-28/+18
| | | | | | | | | | | | | | | | | | | | In commit d0a21265dfb5fa8a David Rientjes unified various archs' module_alloc implementation (including x86) and removed the graduitous shortcut for size == 0. Then, in commit de7d2b567d040e3b, Joe Perches added a warning for zero-length vmallocs, which can happen without kallsyms on modules with no init sections (eg. zlib_deflate). Fix this once and for all; the module code has to handle zero length anyway, so get it right at the caller and remove the now-gratuitous checks within the arch-specific module_alloc implementations. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=42608 Reported-by: Conrad Kostecki <ConiKost@gmx.de> Cc: David Rientjes <rientjes@google.com> Cc: Joe Perches <joe@perches.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* module: Remove a extra null character at the top of module->strtab.Satoru Takeuchi2012-12-141-5/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is a extra null character('\0') at the top of module->strtab for each module. Commit 59ef28b introduced this bug and this patch fixes it. Live dump log of the current linus git kernel(HEAD is 2844a4870): ============================================================================ crash> mod | grep loop ffffffffa01db0a0 loop 16689 (not loaded) [CONFIG_KALLSYMS] crash> module.core_symtab ffffffffa01db0a0 core_symtab = 0xffffffffa01db320crash> rd 0xffffffffa01db320 12 ffffffffa01db320: 0000005500000001 0000000000000000 ....U........... ffffffffa01db330: 0000000000000000 0002007400000002 ............t... ffffffffa01db340: ffffffffa01d8000 0000000000000038 ........8....... ffffffffa01db350: 001a00640000000e ffffffffa01daeb0 ....d........... ffffffffa01db360: 00000000000000a0 0002007400000019 ............t... ffffffffa01db370: ffffffffa01d8068 000000000000001b h............... crash> module.core_strtab ffffffffa01db0a0 core_strtab = 0xffffffffa01dbb30 "" crash> rd 0xffffffffa01dbb30 4 ffffffffa01dbb30: 615f70616d6b0000 66780063696d6f74 ..kmap_atomic.xf ffffffffa01dbb40: 73636e75665f7265 72665f646e696600 er_funcs.find_fr ============================================================================ We expect Just first one byte of '\0', but actually first two bytes are '\0'. Here is The relationship between symtab and strtab. symtab_idx strtab_idx symbol ----------------------------------------------- 0 0x1 "\0" # startab_idx should be 0 1 0x2 "kmap_atomic" 2 0xe "xfer_funcs" 3 0x19 "find_fr..." By applying this patch, it becomes as follows. symtab_idx strtab_idx symbol ----------------------------------------------- 0 0x0 "\0" # extra byte is removed 1 0x1 "kmap_atomic" 2 0xd "xfer_funcs" 3 0x18 "find_fr..." Signed-off-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com> Cc: Masaki Kimura <masaki.kimura.kz@hitachi.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* ASN.1: Use the ASN1_LONG_TAG and ASN1_INDEFINITE_LENGTH constantsDavid Howells2012-12-141-4/+4
| | | | | | | | Use the ASN1_LONG_TAG and ASN1_INDEFINITE_LENGTH constants in the ASN.1 general decoder instead of the equivalent numbers. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* ASN.1: Define indefinite length marker constantDavid Howells2012-12-141-0/+2
| | | | | | | | Define a constant to hold the marker value seen in an indefinite-length element. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* moduleparam: use __UNIQUE_ID()Rusty Russell2012-12-141-4/+2
| | | | Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* __UNIQUE_ID()Rusty Russell2012-12-142-0/+11
| | | | | | | | | | | | Jan Beulich points out __COUNTER__ (gcc 4.3 and above), so let's use that to create unique ids. This is better than __LINE__ which we use today, so provide a wrapper. Stanislaw Gruszka <sgruszka@redhat.com> reported that some module parameters start with a digit, so we need to prepend when we for the unique id. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Jan Beulich <jbeulich@suse.com>
* MODSIGN: Add modules_sign make targetJosh Boyer2012-12-142-0/+38
| | | | | | | | | | | | If CONFIG_MODULE_SIG is set, and 'make modules_sign' is called then this patch will cause the modules to get a signature appended. The make target is intended to be run after 'make modules_install', and will modify the modules in-place in the installed location. It can be used to produce signed modules after they have been processed by distribution build scripts. Signed-off-by: Josh Boyer <jwboyer@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (minor typo fix)
* powerpc: add finit_module syscall.Rusty Russell2012-12-143-1/+3
| | | | | | | | (This is just for Acks: this won't work without the actual syscall patches, sitting in my tree for -next at the moment). Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* ima: support new kernel module syscallMimi Zohar2012-12-147-5/+41
| | | | | | | | | | | | | | | | With the addition of the new kernel module syscall, which defines two arguments - a file descriptor to the kernel module and a pointer to a NULL terminated string of module arguments - it is now possible to measure and appraise kernel modules like any other file on the file system. This patch adds support to measure and appraise kernel modules in an extensible and consistent manner. To support filesystems without extended attribute support, additional patches could pass the signature as the first parameter. Signed-off-by: Mimi Zohar <zohar@us.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* add finit_module syscall to asm-genericKees Cook2012-12-141-1/+3
| | | | | | | | This adds the finit_module syscall to the generic syscall list. Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* ARM: add finit_module syscall to ARMKees Cook2012-12-142-0/+2
| | | | | | | | Add finit_module syscall to the ARM syscall list. Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* security: introduce kernel_module_from_file hookKees Cook2012-12-144-0/+35
| | | | | | | | | | | | | | | Now that kernel module origins can be reasoned about, provide a hook to the LSMs to make policy decisions about the module file. This will let Chrome OS enforce that loadable kernel modules can only come from its read-only hash-verified root filesystem. Other LSMs can, for example, read extended attributes for signatures, etc. Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Serge E. Hallyn <serge.hallyn@canonical.com> Acked-by: Eric Paris <eparis@redhat.com> Acked-by: Mimi Zohar <zohar@us.ibm.com> Acked-by: James Morris <james.l.morris@oracle.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* module: add flags arg to sys_finit_module()Rusty Russell2012-12-143-15/+35
| | | | | | | | | | Thanks to Michael Kerrisk for keeping us honest. These flags are actually useful for eliminating the only case where kmod has to mangle a module's internals: for overriding module versioning. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Lucas De Marchi <lucas.demarchi@profusion.mobi> Acked-by: Kees Cook <keescook@chromium.org>
* module: add syscall to load module from fdKees Cook2012-12-145-148/+223
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As part of the effort to create a stronger boundary between root and kernel, Chrome OS wants to be able to enforce that kernel modules are being loaded only from our read-only crypto-hash verified (dm_verity) root filesystem. Since the init_module syscall hands the kernel a module as a memory blob, no reasoning about the origin of the blob can be made. Earlier proposals for appending signatures to kernel modules would not be useful in Chrome OS, since it would involve adding an additional set of keys to our kernel and builds for no good reason: we already trust the contents of our root filesystem. We don't need to verify those kernel modules a second time. Having to do signature checking on module loading would slow us down and be redundant. All we need to know is where a module is coming from so we can say yes/no to loading it. If a file descriptor is used as the source of a kernel module, many more things can be reasoned about. In Chrome OS's case, we could enforce that the module lives on the filesystem we expect it to live on. In the case of IMA (or other LSMs), it would be possible, for example, to examine extended attributes that may contain signatures over the contents of the module. This introduces a new syscall (on x86), similar to init_module, that has only two arguments. The first argument is used as a file descriptor to the module and the second argument is a pointer to the NULL terminated string of module arguments. Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (merge fixes)
* modsign: add symbol prefix to certificate listJames Hogan2012-12-031-2/+2
| | | | | | | | | | | | | | | | Add the arch symbol prefix (if applicable) to the asm definition of modsign_certificate_list and modsign_certificate_list_end. This uses the recently defined SYMBOL_PREFIX which is derived from CONFIG_SYMBOL_PREFIX. This fixes the build of module signing on the blackfin and metag architectures. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: David Howells <dhowells@redhat.com> Cc: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* linux/kernel.h: define SYMBOL_PREFIXJames Hogan2012-12-031-0/+7
| | | | | | | | | | | | | | | Define SYMBOL_PREFIX to be the same as CONFIG_SYMBOL_PREFIX if set by the architecture, or "" otherwise. This avoids the need for ugly #ifdefs whenever symbols are referenced in asm blocks. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Joe Perches <joe@perches.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Jean Delvare <khali@linux-fr.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* Merge branch 'for-3.7-fixes' of ↵Linus Torvalds2012-12-021-4/+14
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq Pull late workqueue fixes from Tejun Heo: "Unfortunately, I have two really late fixes. One was for a long-standing bug and queued for 3.8 but I found out about a regression introduced during 3.7-rc1 two days ago, so I'm sending out the two fixes together. The first (long-standing) one is rescuer_thread() entering exit path w/ TASK_INTERRUPTIBLE. It only triggers on workqueue destructions which isn't very frequent and the exit path can usually survive being called with TASK_INTERRUPT, so it was hidden pretty well. Apparently, if you're reiserfs, this could lead to the exiting kthread sleeping indefinitely holding a mutex, which is never good. The fix is simple - restoring TASK_RUNNING before returning from the kthread function. The second one is introduced by the new mod_delayed_work(). mod_delayed_work() was missing special case handling for 0 delay. Instead of queueing the work item immediately, it queued the timer which expires on the closest next tick. Some users of the new function converted from "[__]cancel_delayed_work() + queue_delayed_work()" combination became unhappy with the extra delay. Block unplugging led to noticeably higher number of context switches and intel 6250 wireless failed to associate with WPA-Enterprise network. The fix, again, is fairly simple. The 0 delay special case logic from queue_delayed_work_on() should be moved to __queue_delayed_work() which is shared by both queue_delayed_work_on() and mod_delayed_work_on(). The first one is difficult to trigger and the failure mode for the latter isn't completely catastrophic, so missing these two for 3.7 wouldn't make it a disastrous release, but both bugs are nasty and the fixes are fairly safe" * 'for-3.7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue: mod_delayed_work_on() shouldn't queue timer on 0 delay workqueue: exit rescuer_thread() as TASK_RUNNING
| * workqueue: mod_delayed_work_on() shouldn't queue timer on 0 delayTejun Heo2012-12-021-3/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 8376fe22c7 ("workqueue: implement mod_delayed_work[_on]()") implemented mod_delayed_work[_on]() using the improved try_to_grab_pending(). The function is later used, among others, to replace [__]candel_delayed_work() + queue_delayed_work() combinations. Unfortunately, a delayed_work item w/ zero @delay is handled slightly differently by mod_delayed_work_on() compared to queue_delayed_work_on(). The latter skips timer altogether and directly queues it using queue_work_on() while the former schedules timer which will expire on the closest tick. This means, when @delay is zero, that [__]cancel_delayed_work() + queue_delayed_work_on() makes the target item immediately executable while mod_delayed_work_on() may induce delay of upto a full tick. This somewhat subtle difference breaks some of the converted users. e.g. block queue plugging uses delayed_work for deferred processing and uses mod_delayed_work_on() when the queue needs to be immediately unplugged. The above problem manifested as noticeably higher number of context switches under certain circumstances. The difference in behavior was caused by missing special case handling for 0 delay in mod_delayed_work_on() compared to queue_delayed_work_on(). Joonsoo Kim posted a patch to add it - ("workqueue: optimize mod_delayed_work_on() when @delay == 0")[1]. The patch was queued for 3.8 but it was described as optimization and I missed that it was a correctness issue. As both queue_delayed_work_on() and mod_delayed_work_on() use __queue_delayed_work() for queueing, it seems that the better approach is to move the 0 delay special handling to the function instead of duplicating it in mod_delayed_work_on(). Fix the problem by moving 0 delay special case handling from queue_delayed_work_on() to __queue_delayed_work(). This replaces Joonsoo's patch. [1] http://thread.gmane.org/gmane.linux.kernel/1379011/focus=1379012 Signed-off-by: Tejun Heo <tj@kernel.org> Reported-and-tested-by: Anders Kaseorg <andersk@MIT.EDU> Reported-and-tested-by: Zlatko Calusic <zlatko.calusic@iskon.hr> LKML-Reference: <alpine.DEB.2.00.1211280953350.26602@dr-wily.mit.edu> LKML-Reference: <50A78AA9.5040904@iskon.hr> Cc: Joonsoo Kim <js1304@gmail.com>
| * workqueue: exit rescuer_thread() as TASK_RUNNINGMike Galbraith2012-12-021-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A rescue thread exiting TASK_INTERRUPTIBLE can lead to a task scheduling off, never to be seen again. In the case where this occurred, an exiting thread hit reiserfs homebrew conditional resched while holding a mutex, bringing the box to its knees. PID: 18105 TASK: ffff8807fd412180 CPU: 5 COMMAND: "kdmflush" #0 [ffff8808157e7670] schedule at ffffffff8143f489 #1 [ffff8808157e77b8] reiserfs_get_block at ffffffffa038ab2d [reiserfs] #2 [ffff8808157e79a8] __block_write_begin at ffffffff8117fb14 #3 [ffff8808157e7a98] reiserfs_write_begin at ffffffffa0388695 [reiserfs] #4 [ffff8808157e7ad8] generic_perform_write at ffffffff810ee9e2 #5 [ffff8808157e7b58] generic_file_buffered_write at ffffffff810eeb41 #6 [ffff8808157e7ba8] __generic_file_aio_write at ffffffff810f1a3a #7 [ffff8808157e7c58] generic_file_aio_write at ffffffff810f1c88 #8 [ffff8808157e7cc8] do_sync_write at ffffffff8114f850 #9 [ffff8808157e7dd8] do_acct_process at ffffffff810a268f [exception RIP: kernel_thread_helper] RIP: ffffffff8144a5c0 RSP: ffff8808157e7f58 RFLAGS: 00000202 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffffffff8107af60 RDI: ffff8803ee491d18 RBP: 0000000000000000 R8: 0000000000000000 R9: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 Signed-off-by: Mike Galbraith <mgalbraith@suse.de> Signed-off-by: Tejun Heo <tj@kernel.org> Cc: stable@vger.kernel.org
* | Merge branch 'for-linus' of ↵Linus Torvalds2012-12-014-10/+21
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs fixes from Al Viro: "A bunch of fixes; the last one is this cycle regression, the rest are -stable fodder." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fix off-by-one in argument passed by iterate_fd() to callbacks lookup_one_len: don't accept . and .. cifs: get rid of blind d_drop() in readdir nfs_lookup_revalidate(): fix a leak don't do blind d_drop() in nfs_prime_dcache()
| * fix off-by-one in argument passed by iterate_fd() to callbacksAl Viro2012-11-301-6/+8
| | | | | | | | | | | | | | | | Noticed by Pavel Roskin; the thing in his patch I disagree with was compensating for that shite in callbacks instead of fixing it once in the iterator itself. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * lookup_one_len: don't accept . and ..Al Viro2012-11-301-0/+5
| | | | | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * cifs: get rid of blind d_drop() in readdirAl Viro2012-11-301-1/+4
| | | | | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * nfs_lookup_revalidate(): fix a leakAl Viro2012-11-301-2/+2
| | | | | | | | | | | | | | | | We are leaking fattr and fhandle if we decide that dentry is not to be invalidated, after all (e.g. happens to be a mountpoint). Just free both before that... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * don't do blind d_drop() in nfs_prime_dcache()Al Viro2012-11-301-1/+2
| | | | | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | Merge branch 'core-urgent-for-linus' of ↵Linus Torvalds2012-12-011-0/+7
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU fix from Ingo Molnar: "Fix leaking RCU extended quiescent state, which might trigger warnings and mess up the extended quiescent state tracking logic into thinking that we are in "RCU user mode" while we aren't." * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: rcu: Fix unrecovered RCU user mode in syscall_trace_leave()
| * \ Merge branch 'rcu/urgent' of ↵Ingo Molnar2012-11-131-0/+7
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/urgent Pull syscall tracing fix from Paul E. McKenney. Signed-off-by: Ingo Molnar <mingo@kernel.org>
| | * | rcu: Fix unrecovered RCU user mode in syscall_trace_leave()Frederic Weisbecker2012-10-281-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On x86-64 syscall exit, 3 non exclusive events may happen looping in the following order: 1) Check if we need resched for user preemption, if so call schedule_user() 2) Check if we have pending signals, if so call do_notify_resume() 3) Check if we do syscall tracing, if so call syscall_trace_leave() However syscall_trace_leave() has been written assuming it directly follows the syscall and forget about the above possible 1st and 2nd steps. Now schedule_user() and do_notify_resume() exit in RCU user mode because they have most chances to resume userspace immediately and this avoids an rcu_user_enter() call in the syscall fast path. So by the time we call syscall_trace_leave(), we may well be in RCU user mode. To fix this up, simply call rcu_user_exit() in the beginning of this function. This fixes some reported RCU uses in extended quiescent state. Reported-by: Dave Jones <davej@redhat.com> Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
* | | | Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds2012-12-0123-135/+194
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "This is mostly about unbreaking architectures that took the UAPI changes in the v3.7 cycle, plus misc fixes." * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf kvm: Fix building perf kvm on non x86 arches perf kvm: Rename perf_kvm to perf_kvm_stat perf: Make perf build for x86 with UAPI disintegration applied perf powerpc: Use uapi/unistd.h to fix build error tools: Pass the target in descend tools: Honour the O= flag when tool build called from a higher Makefile tools: Define a Makefile function to do subdir processing x86: Export asm/{svm.h,vmx.h,perf_regs.h} perf tools: Fix strbuf_addf() when the buffer needs to grow perf header: Fix numa topology printing perf, powerpc: Fix hw breakpoints returning -ENOSPC
| * \ \ \ Merge tag 'perf-urgent-for-mingo' of ↵Ingo Molnar2012-12-0120-126/+181
| |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/urgent fixes from Arnaldo Carvalho de Melo: - Don't build 'perf kvm stat" on non-x86 arches, fix from Xiao Guangrong. - UAPI fixes to get perf building again in non-x86 arches, from David Howells. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
| | * | | | perf kvm: Fix building perf kvm on non x86 archesXiao Guangrong2012-11-241-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now, 'perf kvm stat' is only supported on x86, let its code depend on (__x86_64__ || __i386__) to fix building it on other architectures. Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Cc: Borislav Petkov <bp@alien8.de> Cc: David Ahern <dsahern@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: Dong Hao <haodong@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Josh Boyer <jwboyer@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Runzhen Wang <runzhen@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-arch@vger.kernel.org Cc: x86@kernel.org Link: http://lkml.kernel.org/r/50A9EB89.70901@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| | * | | | perf kvm: Rename perf_kvm to perf_kvm_statXiao Guangrong2012-11-231-51/+60
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Then let it only be used in 'perf kvm stat'. Preparatory patch to stop trying to build parts of this tool that for now are only supported on x86. Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Cc: Borislav Petkov <bp@alien8.de> Cc: David Ahern <dsahern@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: Dong Hao <haodong@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Josh Boyer <jwboyer@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Runzhen Wang <runzhen@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-arch@vger.kernel.org Cc: x86@kernel.org Link: http://lkml.kernel.org/r/50A488DD.6090106@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| | * | | | Merge tag 'perf-uapi-20121119' of ↵Arnaldo Carvalho de Melo2012-11-2320-75/+117
| |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.infradead.org/users/dhowells/linux-headers into perf/urgent Merge reason: UAPI fixes for building on non x86 arches. perf/urgent fixes 2012-11-19
| | * | | | perf: Make perf build for x86 with UAPI disintegration appliedDavid Howells2012-11-1916-58/+78
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make perf build for x86 once the UAPI disintegration patches for that arch have been applied by adding the appropriate -I flags - in the right order - and then converting some #includes that use ../.. notation to find main kernel headerfiles to use <asm/foo.h> and <linux/foo.h> instead. Note that -Iarch/foo/include/uapi is present _before_ -Iarch/foo/include. This makes sure we get the userspace version of the pt_regs struct. Ideally, we wouldn't have the latter -I flag at all, but unfortunately we want asm/svm.h and asm/vmx.h in builtin-kvm.c and these aren't part of the UAPI - at least not for x86. I wonder if the bits outside of the __KERNEL__ guards *should* be transferred there. I note also that perf seems to do its dependency handling manually by listing all the header files it might want to use in LIB_H in the Makefile. Can this be changed to use -MD? Note that to do make this work, we need to export and UAPI disintegrate linux/hw_breakpoint.h, which I think should've been exported previously so that perf can access the bits. We have to do this in the same patch to maintain bisectability. Signed-off-by: David Howells <dhowells@redhat.com>
| | * | | | perf powerpc: Use uapi/unistd.h to fix build errorSukadev Bhattiprolu2012-11-191-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use the 'unistd.h' from arch/powerpc/include/uapi to build the perf tool. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Acked-by: David Howells <dhowells@redhat.com> Cc: Anton Blanchard <anton@au1.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Robert Richter <robert.richter@amd.com> Cc: linuxppc-dev@ozlabs.org Link: http://lkml.kernel.org/r/20121107191818.GA16211@us.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| | * | | | tools: Pass the target in descendDavid Howells2012-11-191-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixing: [acme@sandy linux]$ cd tools [acme@sandy tools]$ make clean DESCEND power/cpupower CC lib/cpufreq.o CC lib/sysfs.o LD libcpupower.so.0.0.0 CC utils/helpers/amd.o utils/helpers/amd.c:7:21: error: pci/pci.h: No such file or directory In file included from utils/helpers/amd.c:9: ./utils/helpers/helpers.h:137: warning: ‘struct pci_access’ declared inside parameter list ./utils/helpers/helpers.h:137: warning: its scope is only this definition or declaration, which is probably not what you want ./utils/helpers/helpers.h:139: warning: ‘struct pci_access’ declared inside parameter list utils/helpers/amd.c: In function ‘amd_pci_get_num_boost_states’: utils/helpers/amd.c:120: warning: passing argument 1 of ‘pci_slot_func_init’ from incompatible pointer type ./utils/helpers/helpers.h:138: note: expected ‘struct pci_access **’ but argument is of type ‘struct pci_access **’ utils/helpers/amd.c:125: warning: implicit declaration of function ‘pci_read_byte’ utils/helpers/amd.c:132: warning: implicit declaration of function ‘pci_cleanup’ make[1]: *** [utils/helpers/amd.o] Error 1 make: *** [cpupower_clean] Error 2 [acme@sandy tools]$ Reported-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David Howells <dhowells@redhat.com> Cc: Borislav Petkov <bp@amd64.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/n/tip-tviyimq6x6nm77sj5lt4t19f@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| | * | | | tools: Honour the O= flag when tool build called from a higher MakefileDavid Howells2012-11-192-6/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Honour the O= flag that was passed to a higher level Makefile and then passed down as part of a tool build. To make this work, the top-level Makefile passes the original O= flag and subdir=tools to the tools/Makefile, and that in turn passes subdir=$(O)/$(subdir)/foodir when building tool foo in directory $(O)/$(subdir)/foodir (where the intervening slashes aren't added if an element is missing). For example, take perf. This is found in tools/perf/. Assume we're building into directory ~/zebra/, so we pass O=~/zebra to make. Dependening on where we run the build from, we see: make run in dir $(OUTPUT) dir ======================= ================== linux ~/zebra/tools/perf/ linux/tools ~/zebra/perf/ linux/tools/perf ~/zebra/ and if O= is not set, we get: make run in dir $(OUTPUT) dir ======================= ================== linux linux/tools/perf/ linux/tools linux/tools/perf/ linux/tools/perf linux/tools/perf/ The output directories are created by the descend function if they don't already exist. Signed-off-by: David Howells <dhowells@redhat.com> Cc: Borislav Petkov <bp@amd64.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1378.1352379110@warthog.procyon.org.uk Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| | * | | | tools: Define a Makefile function to do subdir processingDavid Howells2012-11-192-12/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Define a Makefile function that can be called with $(call ...) to wrap the subdir make invocations in tools/Makefile. This will allow us in the next patch to insert bits in there to honour O= flags when called from the top-level Makefile. Signed-off-by: David Howells <dhowells@redhat.com> Cc: Borislav Petkov <bp@amd64.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1378.1352379110@warthog.procyon.org.uk Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| | * | | | Merge branch 'x86-pre-uapi' into perf-uapiDavid Howells2012-11-191-0/+3
| | |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | David Howells (1): x86: Export asm/{svm.h,vmx.h,perf_regs.h}
| | | * | | | x86: Export asm/{svm.h,vmx.h,perf_regs.h}David Howells2012-11-081-0/+3
| | | | |/ / | | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Export asm/{svm.h,vmx.h,perf_regs.h} so that they can be disintegrated. It looks from previous commits that the first two should have been exported, but the header-y lines weren't added to the Kbuild. I'm guessing that asm/perf_regs.h should be exported too. Signed-off-by: David Howells <dhowells@redhat.com>
| * | | | | Merge tag 'perf-urgent-for-mingo' of ↵Ingo Molnar2012-11-132-4/+6
| |\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/urgent fixes from Arnaldo Carvalho de Melo: * Fix numa topology printing, from Namhyung Kim. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
| | * | | | perf tools: Fix strbuf_addf() when the buffer needs to growNamhyung Kim2012-10-301-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This was found during chasing down the header output regression. The strbuf_addf() was checking buffer length with a result of vscnprintf() which cannot be greater than that of strbuf_avail(). Since numa topology and pmu mapping info in header were converted to use strbuf, it sometimes caused uninteresting behaviors with the broken strbuf. Fix it by using vsnprintf() which returns desired output string length regardless of the available buffer size and grow the buffer if needed. Reported-by: Andrew Jones <drjones@redhat.com> Tested-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Andrew Jones <drjones@redhat.com> Link: http://lkml.kernel.org/r/1350999890-6920-2-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| | * | | | perf header: Fix numa topology printingNamhyung Kim2012-10-301-0/+2
| |/ / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Andrew reported that the commit 7e94cfcc9d20 ("perf header: Use pre- processed session env when printing") regresses the header output. It was because of a missed string pointer calculation in the loop. Reported-by: Andrew Jones <drjones@redhat.com> Tested-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Andrew Jones <drjones@redhat.com> Link: http://lkml.kernel.org/r/1350999890-6920-1-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | | | perf, powerpc: Fix hw breakpoints returning -ENOSPCMichael Neuling2012-10-301-5/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I've been trying to get hardware breakpoints with perf to work on POWER7 but I'm getting the following: % perf record -e mem:0x10000000 true Error: sys_perf_event_open() syscall returned with 28 (No space left on device). /bin/dmesg may provide additional information. Fatal: No CONFIG_PERF_EVENTS=y kernel support configured? true: Terminated (FWIW adding -a and it works fine) Debugging it seems that __reserve_bp_slot() is returning ENOSPC because it thinks there are no free breakpoint slots on this CPU. I have a 2 CPUs, so perf userspace is doing two perf_event_open syscalls to add a counter to each CPU [1]. The first syscall succeeds but the second is failing. On this second syscall, fetch_bp_busy_slots() sets slots.pinned to be 1, despite there being no breakpoint on this CPU. This is because the call the task_bp_pinned, checks all CPUs, rather than just the current CPU. POWER7 only has one hardware breakpoint per CPU (ie. HBP_NUM=1), so we return ENOSPC. The following patch fixes this by checking the associated CPU for each breakpoint in task_bp_pinned. I'm not familiar with this code, so it's provided as a reference to the above issue. Signed-off-by: Michael Neuling <mikey@neuling.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Michael Ellerman <michael@ellerman.id.au> Cc: Jovi Zhang <bookjovi@gmail.com> Cc: K Prasad <prasad@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/1351268936-2956-1-git-send-email-fweisbec@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | | | | Merge branch 'x86/urgent' of ↵Linus Torvalds2012-12-014-10/+22
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Peter Anvin. This includes the resume-time FPU corruption fix from the chromeos guys, marked for stable. * 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, fpu: Avoid FPU lazy restore after suspend x86-32: Unbreak booting on some 486 clones x86, kvm: Remove incorrect redundant assembly constraint
| * | | | | x86, fpu: Avoid FPU lazy restore after suspendVincent Palatin2012-11-302-6/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a cpu enters S3 state, the FPU state is lost. After resuming for S3, if we try to lazy restore the FPU for a process running on the same CPU, this will result in a corrupted FPU context. Ensure that "fpu_owner_task" is properly invalided when (re-)initializing a CPU, so nobody will try to lazy restore a state which doesn't exist in the hardware. Tested with a 64-bit kernel on a 4-core Ivybridge CPU with eagerfpu=off, by doing thousands of suspend/resume cycles with 4 processes doing FPU operations running. Without the patch, a process is killed after a few hundreds cycles by a SIGFPE. Cc: Duncan Laurie <dlaurie@chromium.org> Cc: Olof Johansson <olofj@chromium.org> Cc: <stable@kernel.org> v3.4+ # for 3.4 need to replace this_cpu_write by percpu_write Signed-off-by: Vincent Palatin <vpalatin@chromium.org> Link: http://lkml.kernel.org/r/1354306532-1014-1-git-send-email-vpalatin@chromium.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| * | | | | x86-32: Unbreak booting on some 486 clonesH. Peter Anvin2012-11-271-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There appear to have been some 486 clones, including the "enhanced" version of Am486, which have CPUID but not CR4. These 486 clones had only the FPU flag, if any, unlike the Intel 486s with CPUID, which also had VME and therefore needed CR4. Therefore, look at the basic CPUID flags and require at least one bit other than bit 0 before we modify CR4. Thanks to Christian Ludloff of sandpile.org for confirming this as a problem. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| * | | | | x86, kvm: Remove incorrect redundant assembly constraintH. Peter Anvin2012-11-271-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In __emulate_1op_rax_rdx, we use "+a" and "+d" which are input/output constraints, and *then* use "a" and "d" as input constraints. This is incorrect, but happens to work on some versions of gcc. However, it breaks gcc with -O0 and icc, and may break on future versions of gcc. Reported-and-tested-by: Melanie Blower <melanie.blower@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Link: http://lkml.kernel.org/r/B3584E72CFEBED439A3ECA9BCE67A4EF1B17AF90@FMSMSX107.amr.corp.intel.com Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Marcelo Tosatti <mtosatti@redhat.com>
* | | | | | Merge tag 'for-linus' of git://linux-c6x.org/git/projects/linux-c6x-upstreamingLinus Torvalds2012-12-015-33/+41
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull C6X fixes from Mark Salter. * tag 'for-linus' of git://linux-c6x.org/git/projects/linux-c6x-upstreaming: c6x: use generic kvm_para.h c6x: remove internal kernel symbols from exported setup.h c6x: fix misleading comment c6x: run do_notify_resume with interrupts enabled
| * | | | | | c6x: use generic kvm_para.hMark Salter2012-11-282-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Mark Salter <msalter@redhat.com>