diff options
author | Oliver O'Halloran <oohall@gmail.com> | 2019-09-03 12:15:56 +0200 |
---|---|---|
committer | Michael Ellerman <mpe@ellerman.id.au> | 2019-09-05 06:22:38 +0200 |
commit | 25baf3d81614b0b8ca8958f4d6f111ccaaaad578 (patch) | |
tree | 6c8e859c08e05b14b1c564282563f95b91da2d86 /arch/powerpc/include/asm/eeh.h | |
parent | powerpc/eeh: Check slot presence state in eeh_handle_normal_event() (diff) | |
download | linux-25baf3d81614b0b8ca8958f4d6f111ccaaaad578.tar.xz linux-25baf3d81614b0b8ca8958f4d6f111ccaaaad578.zip |
powerpc/eeh: Defer printing stack trace
Currently we print a stack trace in the event handler to help with
debugging EEH issues. In the case of suprise hot-unplug this is unneeded,
so we want to prevent printing the stack trace unless we know it's due to
an actual device error. To accomplish this, we can save a stack trace at
the point of detection and only print it once the EEH recovery handler has
determined the freeze was due to an actual error.
Since the whole point of this is to prevent spurious EEH output we also
move a few prints out of the detection thread, or mark them as pr_debug
so anyone interested can get output from the eeh_check_dev_failure()
if they want.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190903101605.2890-6-oohall@gmail.com
Diffstat (limited to 'arch/powerpc/include/asm/eeh.h')
-rw-r--r-- | arch/powerpc/include/asm/eeh.h | 11 |
1 files changed, 11 insertions, 0 deletions
diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h index c13119a5e69b..9d0e1694a94d 100644 --- a/arch/powerpc/include/asm/eeh.h +++ b/arch/powerpc/include/asm/eeh.h @@ -88,6 +88,17 @@ struct eeh_pe { struct list_head child_list; /* List of PEs below this PE */ struct list_head child; /* Memb. child_list/eeh_phb_pe */ struct list_head edevs; /* List of eeh_dev in this PE */ + + /* + * Saved stack trace. When we find a PE freeze in eeh_dev_check_failure + * the stack trace is saved here so we can print it in the recovery + * thread if it turns out to due to a real problem rather than + * a hot-remove. + * + * A max of 64 entries might be overkill, but it also might not be. + */ + unsigned long stack_trace[64]; + int trace_entries; }; #define eeh_pe_for_each_dev(pe, edev, tmp) \ |