summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* uprobes/tracing: Introduce is_trace_uprobe_enabled()Oleg Nesterov2013-02-082-3/+7
| | | | | | | | | | | | | | | | | | probe_event_enable/disable() check tu->consumer != NULL to avoid the wrong uprobe_register/unregister(). We are going to kill this pointer and "struct uprobe_trace_consumer", so we add the new helper, is_trace_uprobe_enabled(), which can rely on TP_FLAG_TRACE/TP_FLAG_PROFILE instead. Note: the current logic doesn't look optimal, it is not clear why TP_FLAG_TRACE/TP_FLAG_PROFILE are mutually exclusive, we will probably change this later. Also kill the unused TP_FLAG_UPROBE. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes/tracing: Ensure inode != NULL in create_trace_uprobe()Oleg Nesterov2013-02-081-3/+3
| | | | | | | | | | | | | probe_event_enable/disable() check tu->inode != NULL at the start. This is ugly, if igrab() can fail create_trace_uprobe() should not succeed and "postpone" the failure. And S_ISREG(inode->i_mode) check added by d24d7dbf is not safe. Note: alloc_uprobe() should probably check igrab() != NULL as well. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes/tracing: Fully initialize uprobe_trace_consumer before uprobe_register()Oleg Nesterov2013-02-081-6/+7
| | | | | | | | | | | | | | | | | | | | | probe_event_enable() does uprobe_register() and only after that sets utc->tu and tu->consumer/flags. This can race with uprobe_dispatcher() which can miss these assignments or see them out of order. Nothing really bad can happen, but this doesn't look clean/safe. And this does not allow to use uprobe_consumer->filter() we are going to add, it is called by uprobe_register() and it needs utc->tu. Change this code to initialize everything before uprobe_register(), and reset tu->consumer/flags if it fails. We can't race with event_disable(), the caller holds event_mutex, and if we could the code would be wrong anyway. In fact I think uprobe_trace_consumer should die, it buys nothing but complicates the code. We can simply add uprobe_consumer into trace_uprobe. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes/tracing: Fix dentry/mount leak in create_trace_uprobe()Oleg Nesterov2013-02-081-4/+6
| | | | | | | | create_trace_uprobe() does kern_path() to find ->d_inode, but forgets to do path_put(). We can do this right after igrab(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: Add exports for module useJosh Stone2013-02-082-0/+9
| | | | | | | | | | | | | | | | The original pull message for uprobes (commit 654443e2) noted: This tree includes uprobes support in 'perf probe' - but SystemTap (and other tools) can take advantage of user probe points as well. In order to actually be usable in module-based tools like SystemTap, the interface needs to be exported. This patch first adds the obvious exports for uprobe_register and uprobe_unregister. Then it also adds one for task_user_regset_view, which is necessary to get the correct state of userspace registers. Signed-off-by: Josh Stone <jistone@redhat.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
* uprobes: Kill the bogus IS_ERR_VALUE(xol_vaddr) checkOleg Nesterov2013-02-081-2/+1
| | | | | | | | | utask->xol_vaddr is either zero or valid, remove the bogus IS_ERR_VALUE() check in xol_free_insn_slot(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Anton Arapov <anton@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: Do not allocate current->utask unnecessaryOleg Nesterov2013-02-081-10/+6
| | | | | | | | | | | | handle_swbp() does get_utask() before can_skip_sstep() for no reason, we do not need ->utask if can_skip_sstep() succeeds. Move get_utask() to pre_ssout() who actually starts to use it. Move the initialization of utask->active_uprobe/state as well. This way the whole initialization is consolidated in pre_ssout(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Anton Arapov <anton@redhat.com>
* uprobes: Fix utask->xol_vaddr leak in pre_ssout()Oleg Nesterov2013-02-081-1/+8
| | | | | | | | | pre_ssout() should do xol_free_insn_slot() if arch_uprobe_pre_xol() fails, otherwise nobody will free the allocated slot. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Anton Arapov <anton@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: Do not play with utask in xol_get_insn_slot()Oleg Nesterov2013-02-081-16/+21
| | | | | | | | | | | | pre_ssout()->xol_get_insn_slot() path is confusing and buggy. This patch cleanups the code, the next one fixes the bug. Change xol_get_insn_slot() to only allocate the slot and do nothing more, move the initialization of utask->xol_vaddr/vaddr into pre_ssout(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Anton Arapov <anton@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: Turn add_utask() into get_utask()Oleg Nesterov2013-02-081-18/+9
| | | | | | | | | | Rename add_utask() into get_utask() and change it to allocate on demand to simplify the caller. Like get_xol_area() it will have more users. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Anton Arapov <anton@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: Fold xol_alloc_area() into get_xol_area()Oleg Nesterov2013-02-081-22/+16
| | | | | | | | | | | Currently only xol_get_insn_slot() does get_xol_area() + xol_alloc_area(), but this will have more users and we do not want to copy-and-paste this code. This patch simply moves xol_alloc_area() into get_xol_area() to simplify the current and future code. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Anton Arapov <anton@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: Move alloc_page() from xol_add_vma() to xol_alloc_area()Oleg Nesterov2013-02-081-19/+13
| | | | | | | | | | Move alloc_page() from xol_add_vma() to xol_alloc_area() to cleanup the code. This separates the memory allocations and consolidates the -EALREADY cleanups and the error handling. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Anton Arapov <anton@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: Change handle_swbp() to expose bp_vaddr to handler_chain()Oleg Nesterov2013-02-083-11/+9
| | | | | | | | | | | | | Change handle_swbp() to set regs->ip = bp_vaddr in advance, this is what consumer->handler() needs but uprobe_get_swbp_addr() is not exported. This also simplifies the code and makes it more consistent across the supported architectures. handle_swbp() becomes the only caller of uprobe_get_swbp_addr(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
* uprobes/x86: Change __skip_sstep() to actually skip the whole insnOleg Nesterov2013-02-081-1/+4
| | | | | | | | | __skip_sstep() doesn't update regs->ip. Currently this is correct but only "by accident" and it doesn't skip the whole insn. Change it to advance ->ip by the length of the detected 0x66*0x90 sequence. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: Teach handler_chain() to filter out the probed taskOleg Nesterov2013-02-082-10/+51
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currrently the are 2 problems with pre-filtering: 1. It is not possible to add/remove a task (mm) after uprobe_register() 2. A forked child inherits all breakpoints and uprobe_consumer can not control this. This patch does the first step to improve the filtering. handler_chain() removes the breakpoints installed by this uprobe from current->mm if all handlers return UPROBE_HANDLER_REMOVE. Note that handler_chain() relies on ->register_rwsem to avoid the race with uprobe_register/unregister which can add/del a consumer, or even remove and then insert the new uprobe at the same address. Perhaps we will add uprobe_apply_mm(uprobe, mm, is_register) and teach copy_mm() to do filter(UPROBE_FILTER_FORK), but I think this change makes sense anyway. Note: instead of checking the retcode from uc->handler, we could add uc->filter(UPROBE_FILTER_BPHIT). But I think this is not optimal to call 2 hooks in a row. This buys nothing, and if handler/filter do something nontrivial they will probably do the same work twice. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: Reintroduce uprobe_consumer->filter()Oleg Nesterov2013-02-082-7/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Finally add uprobe_consumer->filter() and change consumer_filter() to actually call this method. Note that ->filter() accepts mm_struct, not task_struct. Because: 1. We do not have for_each_mm_user(mm, task). 2. Even if we implement for_each_mm_user(), ->filter() can use it itself. 3. It is not clear who will actually need this interface to do the "nontrivial" filtering. Another argument is "enum uprobe_filter_ctx", consumer->filter() can use it to figure out why/where it was called. For example, perhaps we can add UPROBE_FILTER_PRE_REGISTER used by build_map_info() to quickly "nack" the unwanted mm's. In this case consumer should know that it is called under ->i_mmap_mutex. See the previous discussion at http://marc.info/?t=135214229700002 Perhaps we should pass more arguments, vma/vaddr? Note: this patch obviously can't help to filter out the child created by fork(), this will be addressed later. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: Rationalize the usage of filter_chain()Oleg Nesterov2013-02-081-23/+21
| | | | | | | | | | | | | | | | | | | | | | | | filter_chain() was added into install_breakpoint/remove_breakpoint to simplify the initial changes but this is sub-optimal. This patch shifts the callsite to the callers, register_for_each_vma() and uprobe_mmap(). This way: - It will be easier to add the new arguments. This is the main reason, we can do more optimizations later. - register_for_each_vma(is_register => true) can be optimized, we only need to consult the new consumer. The previous consumers were already asked when they called uprobe_register(). This patch also moves the MMF_HAS_UPROBES check from remove_breakpoint(), this allows to avoid the potentionally costly filter_chain(). Note that register_for_each_vma(is_register => false) doesn't really need to take ->consumer_rwsem, but I don't think it makes sense to optimize this and introduce filter_chain_lockless(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: Kill uprobes_mutex[], separate alloc_uprobe() and __uprobe_register()Oleg Nesterov2013-02-081-36/+15
| | | | | | | | | | | | | | | | | | | | | | | uprobe_register() and uprobe_unregister() are the only users of mutex_lock(uprobes_hash(inode)), and the only reason why we can't simply remove it is that we need to ensure that delete_uprobe() is not possible after alloc_uprobe() and before consumer_add(). IOW, we need to ensure that when we take uprobe->register_rwsem this uprobe is still valid and we didn't race with _unregister() which called delete_uprobe() in between. With this patch uprobe_register() simply checks uprobe_is_active() and retries if it hits this very unlikely race. uprobes_mutex[] is no longer needed and can be removed. There is another reason for this change, prepare_uprobe() should be folded into alloc_uprobe() and we do not want to hold the extra locks around read_mapping_page/etc. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Anton Arapov <anton@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: Introduce uprobe_is_active()Oleg Nesterov2013-02-081-0/+8
| | | | | | | | | | | | | | | | | The lifetime of uprobe->rb_node and uprobe->inode is not refcounted, delete_uprobe() is called when we detect that uprobe has no consumers, and it would be deadly wrong to do this twice. Change delete_uprobe() to WARN() if it was already called. We use RB_CLEAR_NODE() to mark uprobe "inactive", then RB_EMPTY_NODE() can be used to detect this case. RB_EMPTY_NODE() is not used directly, we add the trivial helper for the next change. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Anton Arapov <anton@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: Kill uprobe_events, use RB_EMPTY_ROOT() insteadOleg Nesterov2013-02-081-12/+7
| | | | | | | | | | | | uprobe_events counts the number of uprobes in uprobes_tree but it is used as a boolean. We can use RB_EMPTY_ROOT() instead. Probably no_uprobe_events() added by this patch can have more callers, say, mmf_recalc_uprobes(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Anton Arapov <anton@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: Kill uprobe->copy_mutexOleg Nesterov2013-02-081-4/+3
| | | | | | | | | | | Now that ->register_rwsem is safe under ->mmap_sem we can kill ->copy_mutex and abuse down_write(&uprobe->consumer_rwsem). This makes prepare_uprobe() even more ugly, but we should kill it anyway. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: Kill UPROBE_RUN_HANDLER flagOleg Nesterov2013-02-081-18/+5
| | | | | | | | | | | Simply remove UPROBE_RUN_HANDLER and the corresponding code. It can only help if uprobe has a single consumer, and in fact it is no longer needed after handler_chain() was changed to use ->register_rwsem, we simply can not race with uprobe_register(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: Change filter_chain() to iterate ->consumers listOleg Nesterov2013-02-081-8/+13
| | | | | | | | | | Now that it safe to use ->consumer_rwsem under ->mmap_sem we can almost finish the implementation of filter_chain(). It still lacks the actual uc->filter(...) call but othewrwise it is ready, just it pretends that ->filter() always returns true. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: Introduce uprobe->register_rwsemOleg Nesterov2013-02-081-2/+8
| | | | | | | | | | | | | | | | | | | | | | | | | Introduce uprobe->register_rwsem. It is taken for writing around __uprobe_register/unregister. Change handler_chain() to use this sem rather than consumer_rwsem. The main reason for this change is that we have the nasty problem with mmap_sem/consumer_rwsem dependency. filter_chain() needs to protect uprobe->consumers like handler_chain(), but they can not use the same lock. filter_chain() can be called under ->mmap_sem (currently this is always true), but we want to allow ->handler() to play with the probed task's memory, and this needs ->mmap_sem. Alternatively we could use srcu, but synchronize_srcu() is very slow and ->register_rwsem allows us to do more. In particular, we can teach handler_chain() to do remove_breakpoint() if this bp is "nacked" by all consumers, we know that we can't race with the new consumer which does uprobe_register(). See also the next patches. uprobes_mutex[] is almost ready to die. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: _register() should always do register_for_each_vma(true)Oleg Nesterov2013-02-081-18/+13
| | | | | | | | | | | | | | | | | | | | | | To support the filtering uprobe_register() should do register_for_each_vma(true) every time the new consumer comes, we need to install the previously nacked breakpoints. Note: - uprobes_mutex[] should die, what it actually protects is alloc_uprobe(). - UPROBE_RUN_HANDLER should die too, obviously it can't work unless uprobe has a single consumer. The consumer should serialize with _register/_unregister itself. Or this flag should live in uprobe_consumer->state. - Perhaps we can do some optimizations later. For example, if filter_chain() never returns false uprobe can record this fact and avoid the unnecessary register_for_each_vma(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: _unregister() should always do register_for_each_vma(false)Oleg Nesterov2013-02-081-14/+14
| | | | | | | | | | | | | uprobe_unregister() removes the breakpoints only if the last consumer goes away. To support the filtering it should do this every time, we want to remove the breakpoints which nobody else want to keep. Note: given that filter_chain() is not actually implemented, this patch itself doesn't change the behaviour yet, register_for_each_vma(false) is a heavy "nop" unless there are no more consumers. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: Introduce filter_chain()Oleg Nesterov2013-02-081-5/+19
| | | | | | | | | | | | | | | | | | | Add the new helper filter_chain(). Currently it is only placeholder, the comment explains what is should do. We will change it later to consult every consumer to decide whether we need to install the swbp. Until then it works as if any consumer returns true, this matches the current behavior. Change install_breakpoint() to call filter_chain() instead of checking uprobe->consumers != NULL. We obviously need this, and this equally closes the race with _unregister(). Change remove_breakpoint() to call this helper too. Currently this is pointless because remove_breakpoint() is only called when the last consumer goes away, but we will change this. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: Kill uprobe_consumer->filter()Oleg Nesterov2013-02-083-10/+2
| | | | | | | | | | | uprobe_consumer->filter() is pointless in its current form, kill it. We will add it back, but with the different signature/semantics. Perhaps we will even re-introduce the callsite in handler_chain(), but not to just skip uc->handler(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: Kill the pointless inode/uc checks in register/unregisterOleg Nesterov2013-02-081-6/+1
| | | | | | | | | | | | | | | | register/unregister verifies that inode/uc != NULL. For what? This really looks like "hide the potential problem", the caller should pass the valid data. register() also checks uc->next == NULL, probably to prevent the double-register but the caller can do other stupid/wrong things. If we do this check, then we should document that uc->next should be cleared before register() and add BUG_ON(). Also add the small comment about the i_size_read() check. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* uprobes: Move __set_bit(UPROBE_SKIP_SSTEP) into alloc_uprobe()Oleg Nesterov2013-02-081-3/+2
| | | | | | | | Cosmetic. __set_bit(UPROBE_SKIP_SSTEP) is the part of initialization, it is not clear why it is set in insert_uprobe(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* Merge tag 'perf-core-for-mingo' of ↵Ingo Molnar2013-02-0627-128/+611
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: . Check for flex and bison before continuing building, from Borislav Petkov. . Make event_copy local to mmaps, fixing buffer wrap around problems, from David Ahern. . Add option for runtime switching perf data file in perf report, just press 's' and a menu with the valid files found in the current directory will be presented, from Feng Tang. . Add support to display whole group data for raw columns, from Jiri Olsa. . Fix SIGALRM and pipe read race for the rwtop perl script. from Jiri Olsa. . Fix perf_evsel::exclude_GH handling and add a test to catch regressions, from Jiri Olsa. . Error checking fixes, from Namhyung Kim. . Fix calloc argument ordering, from Paul Gortmaker. . Fix set event list leader, from Stephane Eranian. . Add per processor socket count aggregation in perf stat, from Stephane Eranian. . Fix perf python binding breakage. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * perf python: Link with sysfs.oArnaldo Carvalho de Melo2013-02-061-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | So that we fix this regression: [root@sandy linux]# perf test -v 15 15: Try 'use perf' in python, checking link problems : --- start --- Traceback (most recent call last): File "<stdin>", line 1, in <module> ImportError: /home/acme/git/build/perf/python/perf.so: undefined symbol: sysfs_find_mountpoint ---- end ---- Try 'use perf' in python, checking link problems: FAILED! [root@sandy linux]# Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-8pf64bsdywg1gl9m55ul77hg@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf evlist: Pass the event_group info via perf_attr_detailsArnaldo Carvalho de Melo2013-02-063-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | So that we avoid dragging symbol.o into the python binding. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-izjubje7ltd1srji5wb0ygwi@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf tools: Fix calloc argument orderingPaul Gortmaker2013-02-062-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A sweep of the kernel for regex "kcalloc(sizeof" turned up 2 reversed args, fixed in commit d3d09e18203dba16a9dbdb2b4cc673d90748cdd1 ("EDAC: Fix kcalloc argument order") and also fixed in the networking commit a1b1add07fa794974573d93483d68e373edfe7bd ("gro: Fix kcalloc argument order"). I know that was the regex used, because on seeing the 1st of these changes, I wondered "how many other instances of this are there" and I happened to just use "calloc(sizeof" as a regex and it in turn found these additional reversed args instances in the perf code. In the kcalloc cases, the changes are cosmetic, since the numbers are simply multiplied. I had no desire to go data mining in userspace to see if the same thing held true there, however. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Joe Perches <joe@perches.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1359594349-25912-1-git-send-email-paul.gortmaker@windriver.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf tests: Adding automated parsing tests for group :GH modifiersJiri Olsa2013-02-061-1/+177
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The ':GH' group modifier handling was just recently fixed, adding some autommated tests to keep it that way. Adding tests for following events: "{cycles,cache-misses:G}:H" "{cycles,cache-misses:H}:G" "{cycles:G,cache-misses:H}:u" "{cycles:G,cache-misses:H}:uG" Plus fixing test__group2 test. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1359971803-2343-3-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf tools: Fix perf_evsel::exclude_GH handlingJiri Olsa2013-02-061-8/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Let the perf_evsel::exclude_GH only prevent the reset of exclude_host and exclude_guest attributes in case they were already set. We cannot reset their values to 0, because they might have other defaults set by event_attr_init. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1359971803-2343-2-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf perl scripts: Fix SIGALRM and pipe read race for rwtopJiri Olsa2013-02-061-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixing rwtop script race. The issue is caused by rwtop script triggering SIGALRM and underneath pipe reading layer reporting error when interrupted. Fixing this by setting SA_RESTART for rwtop SIGALRM handler, which avoids interruption of the pipe reading layer. The discussion for this issue & fix is here: https://lkml.org/lkml/2012/9/18/123 Signed-off-by: Jiri Olsa <jolsa@redhat.com> Original-patch-by: Andrew Jones <drjones@redhat.com> Cc: Andrew Jones <drjones@redhat.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1360080351-3246-2-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf hists browser: Add support to display whole group data for raw columnsJiri Olsa2013-02-061-27/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently we don't display group members' values for raw columns like 'Samples' and 'Period' when in group report mode. Uniting '__hpp__percent_fmt' and '__hpp__raw_fmt' function under new function __hpp__fmt. It's basically '__hpp__percent_fmt' code with new 'fmt_percent' bool parameter added saying whether raw number or percentage should be printed. This way raw columns print out all the group members when in group report mode, like: $ perf record -e '{cycles,cache-misses}' ls ... $ perf report --group --show-total-period --stdio ... # Overhead Period Command Shared Object Symbol # ................ ........................ ....... ................. ................................. # 23.63% 11.24% 3331335 317 ls [kernel.kallsyms] [k] __lock_acquire 12.72% 0.00% 1793100 0 ls [kernel.kallsyms] [k] native_sched_clock 9.72% 0.00% 1369920 0 ls libc-2.14.90.so [.] _nl_find_locale 0.03% 0.07% 4476 2 ls [kernel.kallsyms] [k] intel_pmu_enable_all 0.00% 11.73% 0 331 ls ld-2.14.90.so [.] _dl_cache_libcmp 0.00% 11.06% 0 312 ls [kernel.kallsyms] [k] vma_interval_tree_insert Signed-off-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1359981185-16819-2-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf stat: Add per processor socket count aggregationStephane Eranian2013-02-062-12/+123
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds per-processor socket count aggregation for system-wide mode measurements. This is a useful mode to detect imbalance between sockets. To enable this mode, use --aggr-socket in addition to -a. (system-wide). The output includes the socket number and the number of online processors on that socket. This is useful to gauge the amount of aggregation. # ./perf stat -I 1000 -a --aggr-socket -e cycles sleep 2 # time socket cpus counts events 1.000097680 S0 4 5,788,785 cycles 2.000379943 S0 4 27,361,546 cycles 2.001167808 S0 4 818,275 cycles Signed-off-by: Stephane Eranian <eranian@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1360161962-9675-3-git-send-email-eranian@google.com [ committer note: Added missing man page entry based on above comments ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf tools: Add cpu_map processor socket level functionsStephane Eranian2013-02-062-0/+63
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds: - cpu_map__get_socket: get socked id from cpu - cpu_map__build_socket_map: build socket map - cpu_map__socket: gets acutal socket from logical socket Those functions are used by uncore and processor socket-level aggregation modes. Signed-off-by: Stephane Eranian <eranian@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1360161962-9675-2-git-send-email-eranian@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf evlist: Make event_copy local to mmapsDavid Ahern2013-02-063-29/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I am getting segfaults *after* the time sorting of perf samples where the event type is off the charts: (gdb) bt \#0 0x0807b1b2 in hists__inc_nr_events (hists=0x80a99c4, type=1163281902) at util/hist.c:1225 \#1 0x08070795 in perf_session_deliver_event (session=0x80a9b90, event=0xf7a6aff8, sample=0xffffc318, tool=0xffffc520, file_offset=0) at util/session.c:884 \#2 0x0806f9b9 in flush_sample_queue (s=0x80a9b90, tool=0xffffc520) at util/session.c:555 \#3 0x0806fc53 in process_finished_round (tool=0xffffc520, event=0x0, session=0x80a9b90) at util/session.c:645 This is bizarre because the event has already been processed once -- before it was added to the samples queue -- and the event was found to be sane at that time. There seem to be 2 causes: 1. perf_evlist__mmap_read updates the read location even though there are outstanding references to events sitting in the mmap buffers via the ordered samples queue. 2. There is a single evlist->event_copy for all evlist entries. event_copy is used to handle an event wrapping at the mmap buffer boundary. This patch addresses the second problem - making event_copy local to each perf_mmap. With this change my highly repeatable use case no longer fails. The first problem is much more complicated and will be the subject of a future patch. Signed-off-by: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1360098762-61827-1-git-send-email-dsahern@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf sort: Check return value of strdup()Namhyung Kim2013-02-061-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When setup_sorting() is called, 'str' is passed to strtok_r() but it's not checked to have a valid pointer. As strtok_r() accepts NULL pointer on a first argument and use the third argument in that case, it can cause a trouble since our third argument, tmp, is not initialized. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1360130237-9963-3-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf sort: Make setup_sorting returns an error codeNamhyung Kim2013-02-067-10/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently the setup_sorting() is called for parsing sort keys and exits if it failed to add the sort key. As it's included in libperf it'd be better returning an error code rather than exiting application inside of the library. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Suggested-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1360130237-9963-2-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf sort: Drop ip_[lr] arguments from _sort__sym_cmp()Namhyung Kim2013-02-061-17/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Current _sort__sym_cmp() function is used for comparing symbols between two hist entries on symbol, symbol_from and symbol_to sort keys. Those functions pass addresses of symbols but it's meaningless since it gets over-written inside of the _sort__sym_cmp function to a start address of the symbol. So just get rid of them. This might cause a difference than prior output for branch stacks since it seems not using start address of the symbol but branch address. However AFAICS it'd be same as it gets overwritten anyway. Also remove redundant part of code in sort__sym_cmp(). Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1360130237-9963-1-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf tools: Check for flex and bison before continuing buildingBorislav Petkov2013-02-061-3/+10
| | | | | | | | | | | | | | | | | | | | Check whether both executables are present on the system before continuing with the build instead of failing halfway, if either are missing. Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1359979554-9160-1-git-send-email-bp@alien8.de Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf evlist: Fix set event list leaderStephane Eranian2013-02-061-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The __perf_evlist__set_leader() was setting the leader for all events in the list except the first. Which means it assumed the first event already had event->leader = event. Seems like this should be the role of the function to also do this. This is a requirement for an upcoming patch set. Signed-off-by: Stephane Eranian <eranian@google.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Tested-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20130131125437.GA3656@quad Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf report: Enable the runtime switching of perf data fileFeng Tang2013-02-061-11/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is for tui browser only. This patch will check the returned key of tui hists browser, if it's K_SWITH_INPUT_DATA, then recreate a session for the new selected data file. V2: Move the setup_brower() before the "repeat" jump point. Signed-off-by: Feng Tang <feng.tang@intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1359873501-24541-2-git-send-email-feng.tang@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf hists browser: Add option for runtime switching perf data fileFeng Tang2013-02-062-1/+112
|/ | | | | | | | | | | | | | | | | | | | | | Based on perf report/top/scripts browser integration idea from acme. This will enable user to runtime switch the data file, when this option is selected, it will popup all the legal data files in current working directory, and the filename selected by user is saved in the global variable "input_name", and a new key 'K_SWITCH_INPUT_DATA' will be passed back to the built-in command which will perform the switch. This initial version only enables it for 'perf report'. v2: rebase to latest 'perf/core' branch (6e1d4dd) of acme's perf tree Signed-off-by: Feng Tang <feng.tang@intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1359873501-24541-1-git-send-email-feng.tang@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf/x86: Allow for architecture specific RDPMC indexesJacob Shin2013-02-063-1/+13
| | | | | | | | | | | | | | Similar to config_base and event_base, allow architecture specific RDPMC ECX values. Signed-off-by: Jacob Shin <jacob.shin@amd.com> Acked-by: Stephane Eranian <eranian@google.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1360171589-6381-6-git-send-email-jacob.shin@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* perf/x86: Move MSR address offset calculation to architecture specific filesJacob Shin2013-02-062-16/+47
| | | | | | | | | | | | | | | | | | Move counter index to MSR address offset calculation to architecture specific files. This prepares the way for perf_event_amd to enable counter addresses that are not contiguous -- for example AMD Family 15h processors have 6 core performance counters starting at 0xc0010200 and 4 northbridge performance counters starting at 0xc0010240. Signed-off-by: Jacob Shin <jacob.shin@amd.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Stephane Eranian <eranian@google.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1360171589-6381-5-git-send-email-jacob.shin@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>