summaryrefslogtreecommitdiffstats
path: root/block (follow)
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'tracing/core-v2' into tracing-for-linusIngo Molnar2009-04-023-877/+0
|\ | | | | | | | | | | | | | | Conflicts: include/linux/slub_def.h lib/Kconfig.debug mm/slob.c mm/slub.c
| * Merge branches 'tracing/doc', 'tracing/ftrace', 'tracing/printk' and 'linus' ↵Ingo Molnar2009-03-101-16/+9
| |\ | | | | | | | | | into tracing/core
| * \ Merge branches 'tracing/ftrace' and 'linus' into tracing/coreIngo Molnar2009-02-272-41/+69
| |\ \
| * \ \ Merge branch 'linus' into tracing/blktraceIngo Molnar2009-02-193-8/+26
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: block/blktrace.c Semantic merge: kernel/trace/blktrace.c Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | tracing/blktrace: move the tracing file to kernel/traceFrederic Weisbecker2009-02-093-1563/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: cleanup Move blktrace.c to kernel/trace, also move its config entry. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | trace: Call tracing_reset_online_cpus before tracer->init()Arnaldo Carvalho de Melo2009-02-061-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: cleanup To make it easy for ftrace plugin writers, as this was open coded in the existing plugins Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Frédéric Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | tracing: Introduce trace_buffer_{lock_reserve,unlock_commit}Arnaldo Carvalho de Melo2009-02-061-15/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: new API These new functions do what previously was being open coded, reducing the number of details ftrace plugin writers have to worry about. It also standardizes the handling of stacktrace, userstacktrace and other trace options we may introduce in the future. With this patch, for instance, the blk tracer (and some others already in the tree) can use the "userstacktrace" /d/tracing/trace_options facility. $ codiff /tmp/vmlinux.before /tmp/vmlinux.after linux-2.6-tip/kernel/trace/trace.c: trace_vprintk | -5 trace_graph_return | -22 trace_graph_entry | -26 trace_function | -45 __ftrace_trace_stack | -27 ftrace_trace_userstack | -29 tracing_sched_switch_trace | -66 tracing_stop | +1 trace_seq_to_user | -1 ftrace_trace_special | -63 ftrace_special | +1 tracing_sched_wakeup_trace | -70 tracing_reset_online_cpus | -1 13 functions changed, 2 bytes added, 355 bytes removed, diff: -353 linux-2.6-tip/block/blktrace.c: __blk_add_trace | -58 1 function changed, 58 bytes removed, diff: -58 linux-2.6-tip/kernel/trace/trace.c: trace_buffer_lock_reserve | +88 trace_buffer_unlock_commit | +86 2 functions changed, 174 bytes added, diff: +174 /tmp/vmlinux.after: 16 functions changed, 176 bytes added, 413 bytes removed, diff: -237 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Frédéric Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | ring_buffer: remove unused flags parameterArnaldo Carvalho de Melo2009-02-061-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: API change, cleanup >From ring_buffer_{lock_reserve,unlock_commit}. $ codiff /tmp/vmlinux.before /tmp/vmlinux.after linux-2.6-tip/kernel/trace/trace.c: trace_vprintk | -14 trace_graph_return | -14 trace_graph_entry | -10 trace_function | -8 __ftrace_trace_stack | -8 ftrace_trace_userstack | -8 tracing_sched_switch_trace | -8 ftrace_trace_special | -12 tracing_sched_wakeup_trace | -8 9 functions changed, 90 bytes removed, diff: -90 linux-2.6-tip/block/blktrace.c: __blk_add_trace | -1 1 function changed, 1 bytes removed, diff: -1 /tmp/vmlinux.after: 10 functions changed, 91 bytes removed, diff: -91 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Frédéric Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | trace: Remove unused trace_array_cpu parameterArnaldo Carvalho de Melo2009-02-051-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: cleanup Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | trace: assign defaults at register_ftrace_eventArnaldo Carvalho de Melo2009-02-051-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: simplification of tracers As all tracers are doing this we might as well do it in register_ftrace_event and save one branch each time we call these callbacks. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | trace: make the trace_event callbacks return enum print_line_tArnaldo Carvalho de Melo2009-02-041-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As they actually all return these enumerators. Reported-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | trace: judicious error checking of trace_seq resultsArnaldo Carvalho de Melo2009-02-041-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: bugfix and cleanup Some callsites were returning either TRACE_ITER_PARTIAL_LINE if the trace_seq routines (trace_seq_printf, etc) returned 0 meaning its buffer was full, or zero otherwise. But... /* Return values for print_line callback */ enum print_line_t { TRACE_TYPE_PARTIAL_LINE = 0, /* Retry after flushing the seq */ TRACE_TYPE_HANDLED = 1, TRACE_TYPE_UNHANDLED = 2 /* Relay to other output functions */ }; In other cases the return value was not being relayed at all. Most of the time it didn't hurt because the page wasn't get filled, but for correctness sake, handle the return values everywhere. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | blktrace: fix coding style in recent patchesArnaldo Carvalho de Melo2009-02-031-21/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: cleanup Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | blkftrace: binary tracing, synthesizing old formatArnaldo Carvalho de Melo2009-02-031-3/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: new feature With this and a blkrawverify modified not to verify the sequence numbers we can start using the userspace tools to verify that the data produced with the ftrace plugin works as expected. Example: [root@f10-1 ~]# echo 1 > /sys/block/sda/sda1/trace/enable [root@f10-1 ~]# echo bin > /d/tracing/trace_options [root@f10-1 ~]# echo blk > /d/tracing/current_tracer [root@f10-1 ~]# cat /d/tracing/trace_pipe > sda1.blktrace.0 ^C [root@f10-1 ~]# ./blkrawverify --noseq sda1 Verifying sda1 CPU 0 Wrote output to sda1.verify.out [root@f10-1 ~]# cat sda1.verify.out --------------- Verifying sda1 --------------------- Summary for cpu 0: 1349 valid + 0 invalid (100.0%) processed [root@f10-1 ~]# Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | trace: Change struct trace_event callbacks parameter listArnaldo Carvalho de Melo2009-02-031-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: API change The trace_seq and trace_entry are in trace_iterator, where there are more fields that may be needed by tracers, so just pass the tracer_iterator as is already the case for struct tracer->print_line. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | | | |
| | \ \ \
| *-. \ \ \ Merge branches 'tracing/ftrace', 'tracing/kmemtrace' and 'linus' into ↵Ingo Molnar2009-02-037-102/+202
| |\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | tracing/core
| * | | | | | blktrace: Use tracing_reset_online_cpusArnaldo Carvalho de Melo2009-01-291-6/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: cleanup Use tracing_reset_online_cpus instead of open coding it. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | blktrace: the ftrace interface needs CONFIG_TRACINGArnaldo Carvalho de Melo2009-01-271-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: build fix Also mention in the help text that blktrace now can be used using the ftrace interface. Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | tracing/blktrace: fix up checkpatch reported problems in ftrace plugin patchArnaldo Carvalho de Melo2009-01-261-15/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Also make sure sparse (make C=2 block/blktrace.o) is happy too. Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | blktrace: add ftrace pluginArnaldo Carvalho de Melo2009-01-261-5/+646
| |/ / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: New way of using the blktrace infrastructure This drops the requirement of userspace utilities to use the blktrace facility. Configuration is done thru sysfs, adding a "trace" directory to the partition directory where blktrace can be enabled for the associated request_queue. The same filters present in the IOCTL interface are present as sysfs device attributes. The /sys/block/sdX/sdXN/trace/enable file allows tracing without any filters. The other files in this directory: pid, act_mask, start_lba and end_lba can be used with the same meaning as with the IOCTL interface. Using the sysfs interface will only setup the request_queue->blk_trace fields, tracing will only take place when the "blk" tracer is selected via the ftrace interface, as in the following example: To see the trace, one can use the /d/tracing/trace file or the /d/tracign/trace_pipe file, with semantics defined in the ftrace documentation in Documentation/ftrace.txt. [root@f10-1 ~]# cat /t/trace kjournald-305 [000] 3046.491224: 8,1 A WBS 6367 + 8 <- (8,1) 6304 kjournald-305 [000] 3046.491227: 8,1 Q R 6367 + 8 [kjournald] kjournald-305 [000] 3046.491236: 8,1 G RB 6367 + 8 [kjournald] kjournald-305 [000] 3046.491239: 8,1 P NS [kjournald] kjournald-305 [000] 3046.491242: 8,1 I RBS 6367 + 8 [kjournald] kjournald-305 [000] 3046.491251: 8,1 D WB 6367 + 8 [kjournald] kjournald-305 [000] 3046.491610: 8,1 U WS [kjournald] 1 <idle>-0 [000] 3046.511914: 8,1 C RS 6367 + 8 [6367] [root@f10-1 ~]# The default line context (prefix) format is the one described in the ftrace documentation, with the blktrace specific bits using its existing format, described in blkparse(8). If one wants to have the classic blktrace formatting, this is possible by using: [root@f10-1 ~]# echo blk_classic > /t/trace_options [root@f10-1 ~]# cat /t/trace 8,1 0 3046.491224 305 A WBS 6367 + 8 <- (8,1) 6304 8,1 0 3046.491227 305 Q R 6367 + 8 [kjournald] 8,1 0 3046.491236 305 G RB 6367 + 8 [kjournald] 8,1 0 3046.491239 305 P NS [kjournald] 8,1 0 3046.491242 305 I RBS 6367 + 8 [kjournald] 8,1 0 3046.491251 305 D WB 6367 + 8 [kjournald] 8,1 0 3046.491610 305 U WS [kjournald] 1 8,1 0 3046.511914 0 C RS 6367 + 8 [6367] [root@f10-1 ~]# Using the ftrace standard format allows more flexibility, such as the ability of asking for backtraces via trace_options: [root@f10-1 ~]# echo noblk_classic > /t/trace_options [root@f10-1 ~]# echo stacktrace > /t/trace_options [root@f10-1 ~]# cat /t/trace kjournald-305 [000] 3318.826779: 8,1 A WBS 6375 + 8 <- (8,1) 6312 kjournald-305 [000] 3318.826782: <= submit_bio <= submit_bh <= sync_dirty_buffer <= journal_commit_transaction <= kjournald <= kthread <= child_rip kjournald-305 [000] 3318.826836: 8,1 Q R 6375 + 8 [kjournald] kjournald-305 [000] 3318.826837: <= generic_make_request <= submit_bio <= submit_bh <= sync_dirty_buffer <= journal_commit_transaction <= kjournald <= kthread Please read the ftrace documentation to use aditional, standardized tracing filters such as /d/tracing/trace_cpumask, etc. See also /d/tracing/trace_mark to add comments in the trace stream, that is equivalent to the /d/block/sdaN/msg interface. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | | | | Merge branch 'percpu-cpumask-x86-for-linus-2' of ↵Linus Torvalds2009-03-281-1/+1
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'percpu-cpumask-x86-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (682 commits) percpu: fix spurious alignment WARN in legacy SMP percpu allocator percpu: generalize embedding first chunk setup helper percpu: more flexibility for @dyn_size of pcpu_setup_first_chunk() percpu: make x86 addr <-> pcpu ptr conversion macros generic linker script: define __per_cpu_load on all SMP capable archs x86: UV: remove uv_flush_tlb_others() WARN_ON percpu: finer grained locking to break deadlock and allow atomic free percpu: move fully free chunk reclamation into a work percpu: move chunk area map extension out of area allocation percpu: replace pcpu_realloc() with pcpu_mem_alloc() and pcpu_mem_free() x86, percpu: setup reserved percpu area for x86_64 percpu, module: implement reserved allocation and use it for module percpu variables percpu: add an indirection ptr for chunk page map access x86: make embedding percpu allocator return excessive free space percpu: use negative for auto for pcpu_setup_first_chunk() arguments percpu: improve first chunk initial area map handling percpu: cosmetic renames in pcpu_setup_first_chunk() percpu: clean up percpu constants x86: un-__init fill_pud/pmd/pte x86: remove vestigial fix_ioremap prototypes ... Manually merge conflicts in arch/ia64/kernel/irq_ia64.c
| * \ \ \ \ \ Merge branch 'core/percpu' into percpu-cpumask-x86-for-linus-2Ingo Molnar2009-03-271-1/+1
| |\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: arch/parisc/kernel/irq.c arch/x86/include/asm/fixmap_64.h arch/x86/include/asm/setup.h kernel/irq/handle.c Semantic merge: arch/x86/include/asm/fixmap.h Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * \ \ \ \ \ Merge branch 'x86/core' into core/percpuIngo Molnar2009-03-042-41/+69
| | |\ \ \ \ \ \ | | | | |_|_|/ / | | | |/| | | |
| | * | | | | | Merge branch 'tj-percpu' of ↵Ingo Molnar2009-02-241-1/+1
| | |\ \ \ \ \ \ | | | |_|_|_|/ / | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc into core/percpu Conflicts: arch/x86/include/asm/pgtable.h
| | | * | | | | alloc_percpu: add align argument to __alloc_percpu.Rusty Russell2009-02-201-1/+1
| | | | |_|/ / | | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This prepares for a real __alloc_percpu, by adding an alignment argument. Only one place uses __alloc_percpu directly, and that's for a string. tj: af_inet also uses __alloc_percpu(), update it. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Jens Axboe <axboe@kernel.dk>
* | | | | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6Linus Torvalds2009-03-281-0/+1
|\ \ \ \ \ \ \ | |/ / / / / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (119 commits) [SCSI] scsi_dh_rdac: Retry for NOT_READY check condition [SCSI] mpt2sas: make global symbols unique [SCSI] sd: Make revalidate less chatty [SCSI] sd: Try READ CAPACITY 16 first for SBC-2 devices [SCSI] sd: Refactor sd_read_capacity() [SCSI] mpt2sas v00.100.11.15 [SCSI] mpt2sas: add MPT2SAS_MINOR(221) to miscdevice.h [SCSI] ch: Add scsi type modalias [SCSI] 3w-9xxx: add power management support [SCSI] bsg: add linux/types.h include to bsg.h [SCSI] cxgb3i: fix function descriptions [SCSI] libiscsi: fix possbile null ptr session command cleanup [SCSI] iscsi class: remove host no argument from session creation callout [SCSI] libiscsi: pass session failure a session struct [SCSI] iscsi lib: remove qdepth param from iscsi host allocation [SCSI] iscsi lib: have lib create work queue for transmitting IO [SCSI] iscsi class: fix lock dep warning on logout [SCSI] libiscsi: don't cap queue depth in iscsi modules [SCSI] iscsi_tcp: replace scsi_debug/tcp_debug logging with iscsi conn logging [SCSI] libiscsi_tcp: replace tcp_debug/scsi_debug logging with session/conn logging ...
| * | | | | | [SCSI] Make scsi.h independent of the rest of the scsi includesJames Bottomley2009-03-121-0/+1
| | |_|/ / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This allows it to compile and be used on the ps3 platform that wants to use the #define values in scsi.h without actually having CONFIG_SCSI set. Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
* | | | | | bsg: Remove bogus check against request_queue->max_sectorsBoaz Harrosh2009-03-261-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | bsg submits REQ_TYPE_BLOCK_PC so the right check is max_hw_sectors. But I've removed this check because right after, bsg proceeds with calling blk_rq_map_user() which does all the right checks. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* | | | | | block: WARN in __blk_put_request() for potential bio leakBoaz Harrosh2009-03-263-17/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Put a WARN_ON in __blk_put_request if it is about to leak bio(s). This is a serious bug that can happen in error handling code paths. For this to work I have fixed a couple of places in block/ where request->bio != NULL ownership was not honored. And a small cleanup at sg_io() while at it. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* | | | | | bsg: add support for tail queuingBoaz Harrosh2009-03-241-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently inherited from sg.c bsg will submit asynchronous request at the head-of-the-queue, (using "at_head" set in the call to blk_execute_rq_nowait()). This is bad in situation where the queues are full, requests will execute out of order, and can cause starvation of the first submitted requests. The sg_io_v4->flags member is used and a bit is allocated to denote the Q_AT_TAIL. Zero is to queue at_head as before, to be compatible with old code at the write/read path. SG_IO code path behavior was changed so to be the same as write/read behavior. SG_IO was very rarely used and breaking compatibility with it is OK at this stage. sg_io_hdr at sg.h also has a flags member and uses 3 bits from the first nibble and one bit from the last nibble. Even though none of these bits are supported by bsg, The second nibble is allocated for use by bsg. Just in case. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> CC: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* | | | | | block: get rid of unused blkdev_free_rq() defineJens Axboe2009-03-241-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* | | | | | block: remove various blk_queue_*() setting functions in blk_init_queue_node()Jens Axboe2009-03-241-6/+3
| |_|_|_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | It calls blk_queue_make_request(), which sets the identical set of limits. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* | | | | block: fix missing bio back/front segment size setting in blk_recount_segments()Jens Axboe2009-03-061-16/+9
|/ / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 1e42807918d17e8c93bf14fbb74be84b141334c1 introduced a bug where we don't get front/back segment sizes in the bio in blk_recount_segments(). Fix this by tracking the back bio as well as the front bio in __blk_recalc_rq_segments(), this also cleans up the interface by getting rid of the segment size pointer passing. Tested-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* | | | block: reduce stack footprint of blk_recount_segments()Jens Axboe2009-02-261-41/+53
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | blk_recalc_rq_segments() requires a request structure passed in, which we don't have from blk_recount_segments(). So the latter allocates one on the stack, using > 400 bytes of stack for that. This can cause us to spill over one page of stack from ext4 at least: 0) 4560 400 blk_recount_segments+0x43/0x62 1) 4160 32 bio_phys_segments+0x1c/0x24 2) 4128 32 blk_rq_bio_prep+0x2a/0xf9 3) 4096 32 init_request_from_bio+0xf9/0xfe 4) 4064 112 __make_request+0x33c/0x3f6 5) 3952 144 generic_make_request+0x2d1/0x321 6) 3808 64 submit_bio+0xb9/0xc3 7) 3744 48 submit_bh+0xea/0x10e 8) 3696 368 ext4_mb_init_cache+0x257/0xa6a [ext4] 9) 3328 288 ext4_mb_regular_allocator+0x421/0xcd9 [ext4] 10) 3040 160 ext4_mb_new_blocks+0x211/0x4b4 [ext4] 11) 2880 336 ext4_ext_get_blocks+0xb61/0xd45 [ext4] 12) 2544 96 ext4_get_blocks_wrap+0xf2/0x200 [ext4] 13) 2448 80 ext4_da_get_block_write+0x6e/0x16b [ext4] 14) 2368 352 mpage_da_map_blocks+0x7e/0x4b3 [ext4] 15) 2016 352 ext4_da_writepages+0x2ce/0x43c [ext4] 16) 1664 32 do_writepages+0x2d/0x3c 17) 1632 144 __writeback_single_inode+0x162/0x2cd 18) 1488 96 generic_sync_sb_inodes+0x1e3/0x32b 19) 1392 16 sync_sb_inodes+0xe/0x10 20) 1376 48 writeback_inodes+0x69/0xb3 21) 1328 208 balance_dirty_pages_ratelimited_nr+0x187/0x2f9 22) 1120 224 generic_file_buffered_write+0x1d4/0x2c4 23) 896 176 __generic_file_aio_write_nolock+0x35f/0x393 24) 720 80 generic_file_aio_write+0x6c/0xc8 25) 640 80 ext4_file_write+0xa9/0x137 [ext4] 26) 560 320 do_sync_write+0xf0/0x137 27) 240 48 vfs_write+0xb3/0x13c 28) 192 64 sys_write+0x4c/0x74 29) 128 128 system_call_fastpath+0x16/0x1b Split the segment counting out into a __blk_recalc_rq_segments() helper to avoid allocating an onstack request just for checking the physical segment count. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* | | | block: add documentation for register_blkdev()Márton Németh2009-02-261-0/+16
|/ / / | | | | | | | | | | | | | | | | | | | | | Add documentation for register_blkdev() function and for the parameters. Signed-off-by: Márton Németh <nm127@freemail.hu> Cc: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* | | block: fix deadlock in blk_abort_queue() for drivers that readd to timeout listHannes Reinecke2009-02-181-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | blk_abort_queue() iterates the timeout list and aborts each request on the list, but if the driver error handling readds a request to the timeout list during this processing, we could be looping forever. Fix this by splicing current entries to a local list and run over that list instead. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* | | block: fix booting from partitioned md arrayNeil Brown2009-02-181-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Hi Tejun, it looks like your commit: block: don't depend on consecutive minor space f331c0296f2a9fee0d396a70598b954062603015 broke a particular case for booting from partitioned md/raid devices. That is the second time this has been broken recently. The previous time was fixed by block: do_mounts - accept root=<non-existant partition> 30f2f0eb4bd2c43d10a8b0d872c6e5ad8f31c9a0 Because the data isn't available when an md device is first created (we add disks and set it up after creation), the initial partition scan finds nothing. It is not until the device is opened that another partition scan happens and finds something. So at the point where the kernel parameter "root=/dev/md_d0p1" is being parsed, md_d0 exists, but md_d0p1 does not. However if we let blk_lookup_devt return the correct device number even though the device doesn't exist, then the attempt to mount it will successfully find the partition. I have tried in the past to find a way to get the partition table to be read as soon as the array is assembled but that proved impossible (at the time). I don't remember the details, and could possibly revisit it. However it would be really nice if blk_lookup_devt could be adjusted to again accept non existant partitions. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* | | block: fix bad definition of BIO_RW_SYNCJens Axboe2009-02-181-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | We can't OR shift values, so get rid of BIO_RW_SYNC and use BIO_RW_SYNCIO and BIO_RW_UNPLUG explicitly. This brings back the behaviour from before 213d9417fec62ef4c3675621b9364a667954d4dd. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* | | bsg: Fix sense buffer bug in SG_IOBoaz Harrosh2009-02-181-7/+10
|/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When submitting requests via SG_IO, which does a sync io, a bsg_command is not allocated. So an in-Kernel sense_buffer was not set. However when calling blk_execute_rq() with no sense buffer one is provided from the stack. Now bsg at blk_complete_sgv4_hdr_rq() would check if rq->sense_len and a sense was requested by sg_io_v4 the rq->sense was copy_user() back, but by now it is already mangled stack memory. I have fixed that by forcing a sense_buffer when calling bsg_map_hdr(). The bsg_command->sense is provided in the write/read path like before, and on-the-stack buffer is provided when doing SG_IO. I have also fixed a dprintk message to print rq->errors in hex because of the scsi bit-field use of this member. For other block devices it does not matter anyway. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* | block: fix oops in blk_queue_io_stat()Jens Axboe2009-02-022-3/+11
| | | | | | | | | | | | | | | | Some initial probe requests don't have disk->queue mapped yet, so we can't rely on a non-NULL queue in blk_queue_io_stat(). Wrap it in blk_do_io_stat(). Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* | cfq-iosched: Allow RT requests to pre-empt ongoing BE timesliceDivyesh Shah2009-01-301-1/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds the ability to pre-empt an ongoing BE timeslice when a RT request is waiting for the current timeslice to complete. This reduces the wait time to disk for RT requests from an upper bound of 4 (current value of cfq_quantum) to 1 disk request. Applied Jens' suggeested changes to avoid the rb lookup and use !cfq_class_rt() and retested. Latency(secs) for the RT task when doing sequential reads from 10G file. | only RT | RT + BE | RT + BE + this patch small (512 byte) reads | 143 | 163 | 145 large (1Mb) reads | 142 | 158 | 146 Signed-off-by: Divyesh Shah <dpshah@google.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* | block: add sysfs file for controlling io stats accountingJens Axboe2009-01-302-36/+82
| | | | | | | | | | | | | | This allows us to turn off disk stat accounting completely, for the cases where the 0.5-1% reduction in system time is important. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* | block: silently error an unsupported barrier bioJens Axboe2009-01-301-0/+5
| | | | | | | | | | | | | | This fixes a "regression" from 2.6.28, where the barrier probes that file systems may do would trigger additional end request warnings in dmesg. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* | block: Fix documentation for blkdev_issue_flush()Theodore Ts'o2009-01-301-1/+1
| | | | | | | | | | Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* | block: seperate bio/request unplug and sync bitsJens Axboe2009-01-301-1/+4
| | | | | | | | Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* | block: export SSD/non-rotational queue flag through sysfsBartlomiej Zolnierkiewicz2009-01-301-1/+29
| | | | | | | | | | | | | | | | | | | | | | | | For some devices (i.e. CFA ATA) we can't reliably detect whether the device is of rotational or non-rotational type so we need to leave the final decision about this setting to the user-space. As a bonus do a minor CodingStyle fixup in queue_nomerges_store(). Suggested-by: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* | block: get rid of the manual directory counting in blktraceJens Axboe2009-01-301-51/+21
| | | | | | | | | | | | It can result in a stuck blktrace system, if --kill is used. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* | block: Allow empty integrity profileMartin K. Petersen2009-01-301-11/+14
|/ | | | | | | | | Allow a block device to allocate and register an integrity profile without providing a template. This allows DM to preallocate a profile to avoid deadlocks during table reconfiguration. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* Merge branch 'for_linus' of ↵Linus Torvalds2009-01-091-0/+6
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (57 commits) jbd2: Fix oops in jbd2_journal_init_inode() on corrupted fs ext4: Remove "extents" mount option block: Add Kconfig help which notes that ext4 needs CONFIG_LBD ext4: Make printk's consistently prefixed with "EXT4-fs: " ext4: Add sanity checks for the superblock before mounting the filesystem ext4: Add mount option to set kjournald's I/O priority jbd2: Submit writes to the journal using WRITE_SYNC jbd2: Add pid and journal device name to the "kjournald2 starting" message ext4: Add markers for better debuggability ext4: Remove code to create the journal inode ext4: provide function to release metadata pages under memory pressure ext3: provide function to release metadata pages under memory pressure add releasepage hooks to block devices which can be used by file systems ext4: Fix s_dirty_blocks_counter if block allocation failed with nodelalloc ext4: Init the complete page while building buddy cache ext4: Don't allow new groups to be added during block allocation ext4: mark the blocks/inode bitmap beyond end of group as used ext4: Use new buffer_head flag to check uninit group bitmaps initialization ext4: Fix the race between read_inode_bitmap() and ext4_new_inode() ext4: code cleanup ...
| * block: Add Kconfig help which notes that ext4 needs CONFIG_LBDTheodore Ts'o2009-01-061-0/+6
| | | | | | | | | | Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: Jens Axboe <jens.axboe@oracle.com>