Core Dump and Debug Tools

Kernel Core Dump and Analysis
To analyze a kernel dump, you need the uncompressed kernel image (vmlinux), not the compressed vmlinuz.
A core dump contains a snapshot of the crashed process, including its stack, which holds:
* Function parameters, frame pointer, return address, and local variables.

To generate a core dump
1. Set the core file size limit to unlimited
# ulimit -c unlimited
2. Set the core file name by writing to /proc/sys/kernel/core_pattern
# mkdir /tmp/core
# echo "/tmp/core/core" > /proc/sys/kernel/core_pattern
# echo 1 > /proc/sys/kernel/core_uses_pid
3. Build with debug information enabled for gdb
# gcc -g -o core_out crash.c
The -g option produces an unstripped binary that contains debugging information.
4. Run the crashing program and check for the core file
# ./core_out
The core dump is written to /tmp/core/core.<pid>, where <pid> is the process ID of the crashed program.
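
Before running the crashing program, it can help to confirm that the settings from steps 1 and 2 actually took effect. The following is a minimal sketch, not part of the original procedure: `show_core_settings` is a hypothetical helper name, and its optional directory argument exists only so it can be pointed at a scratch directory instead of /proc/sys/kernel.

```shell
#!/bin/sh
# Hypothetical helper: report the settings that control core dump creation.
# The optional argument (default /proc/sys/kernel) is an assumption added so
# the function can be exercised against a scratch directory.
show_core_settings() {
    dir="${1:-/proc/sys/kernel}"
    echo "core size limit: $(ulimit -c)"
    echo "core_pattern:    $(cat "$dir/core_pattern")"
    echo "core_uses_pid:   $(cat "$dir/core_uses_pid")"
}
```

Calling `show_core_settings` with no argument in the same shell that will run ./core_out shows the limit that the crashing process inherits, since `ulimit -c` applies per shell session.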

Debug a crashing program
# cat crash.c
#include <stdio.h>

main()
{

int *t=NULL;
printf("%d\n", *t);

}


# gdb core_out /tmp/core/core.65714
Reading symbols from /home/prasanta/study/core_out...done.

warning: exec file is newer than core file.
[New Thread 65714]
Missing separate debuginfo for
Try: yum --enablerepo='*-debug*' install /usr/lib/debug/.build-id/08/e42c6c3d2cd1e5d68a43b717c9eb3d310f2df0
Reading symbols from /lib64/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib64/libc.so.6
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Core was generated by `./core_out'.
Program terminated with signal 11, Segmentation fault.
#0  0x00000000004004d8 in main () at crash.c:7
7       printf("%d\n", *t);

If the program has multiple threads, print a backtrace for each one:
(gdb) thread apply all bt

Thread 1 (Thread 65714):
#0  0x00000000004004d8 in main () at crash.c:7

To switch to a thread
(gdb) thread 1
[Switching to thread 1 (Thread 65714)]#0  0x00000000004004d8 in main () at crash.c:7
7       printf("%d\n", *t);

(gdb) bt
#0  0x00000000004004d8 in main () at crash.c:7

Go to a particular frame, for example frame #0 shown above
(gdb) frame 0
#0  0x00000000004004d8 in main () at crash.c:7
7       printf("%d\n", *t);

See source code around that stack frame
(gdb) list +
2
3       main()
4       {
5
6       int *t=NULL;
7       printf("%d\n", *t);
8
9       }

Print variables
(gdb) print t
$1 = (int *) 0x0
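
The interactive session above can also be run non-interactively, which is convenient when many core files need a first look. This is a sketch under assumptions: `core_summary` is a hypothetical helper name, and the `GDB` environment variable is an illustrative override that defaults to `gdb`. The gdb options `-batch` and `-ex` and the commands passed to them are standard gdb features.

```shell
#!/bin/sh
# Hypothetical helper: print a non-interactive summary of a core file.
# Usage: core_summary <executable> <core-file>
# GDB may be overridden through the environment (assumption for illustration).
core_summary() {
    exe="$1"
    core="$2"
    # -batch exits after running the -ex commands in order.
    "${GDB:-gdb}" -batch \
        -ex "thread apply all bt" \
        -ex "frame 0" \
        -ex "list" \
        -ex "info locals" \
        "$exe" "$core"
}
```

For the example in this post the call would be `core_summary ./core_out /tmp/core/core.65714`. `info locals` is used instead of `print t` so the helper does not depend on a particular variable name.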

Debug a hung program
Hung programs, such as those stuck in a mutex deadlock, can be debugged in a similar way. Find the process ID and send the process signal 11 (SIGSEGV) to force a core dump:
# ps aux | grep pgmname
# kill -11 pid
The rest of the flow is the same as above: find the thread causing the hang, switch to that thread, select the frame, inspect the code, and debug.
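
The signal step can be wrapped in a small function that first checks that the process exists. This is only a sketch: `force_core` is a hypothetical name, and whether a core file is actually written still depends on the core size limit and core_pattern settings of the target process.

```shell
#!/bin/sh
# Hypothetical helper: ask a hung process to dump core by sending SIGSEGV,
# the same signal that "kill -11" delivers on Linux.
# Usage: force_core <pid>
force_core() {
    pid="$1"
    # kill -0 delivers no signal; it only checks that the PID can be signalled.
    if ! kill -0 "$pid" 2>/dev/null; then
        echo "no such process (or no permission): $pid" >&2
        return 1
    fi
    kill -s SEGV "$pid"
}
```

Using the signal name instead of the number keeps the intent readable, and the existence check avoids silently signalling nothing when the PID is mistyped.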

FTRACE

Reference: Documentation/trace/ftrace.txt in the kernel source tree.
ftrace requires the following options in the kernel .config:

CONFIG_FTRACE=y
CONFIG_HAVE_DYNAMIC_FTRACE=y
CONFIG_HAVE_FUNCTION_TRACER=y
CONFIG_HAVE_FUNCTION_GRAPH_TRACER=y
CONFIG_STACKTRACE=y
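
Whether a given kernel was built with these options can be checked against its config file. The sketch below makes one assumption that does not come from this post: that the running kernel's config is available at /boot/config-$(uname -r), which is common on many distributions but not universal. `check_ftrace_config` is a hypothetical helper name, and a different path can be passed as its argument.

```shell
#!/bin/sh
# Hypothetical helper: report which of the ftrace-related options listed
# above are set to "y" in a kernel config file.
# Assumption: the config lives at /boot/config-$(uname -r) unless a path
# is given as the first argument.
check_ftrace_config() {
    cfg="${1:-/boot/config-$(uname -r)}"
    missing=0
    for opt in CONFIG_FTRACE CONFIG_HAVE_DYNAMIC_FTRACE \
               CONFIG_HAVE_FUNCTION_TRACER CONFIG_HAVE_FUNCTION_GRAPH_TRACER \
               CONFIG_STACKTRACE; do
        if grep -q "^$opt=y" "$cfg" 2>/dev/null; then
            echo "ok       $opt"
        else
            echo "missing  $opt"
            missing=1
        fi
    done
    # Return non-zero if any option was not found.
    return "$missing"
}
```

The function returns 0 only when all five options are present, so it can be used in a script as `check_ftrace_config || echo "kernel needs rebuilding"`.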


# ls /sys/kernel/debug/tracing
README                      current_tracer            kprobe_profile  set_event           stack_max_size         trace_marker     tracing_enabled
available_events            dyn_ftrace_total_info     options         set_ftrace_filter   stack_trace            trace_options    tracing_max_latency
available_filter_functions  events                    per_cpu         set_ftrace_notrace  sysprof_sample_period  trace_pipe       tracing_on
available_tracers           function_profile_enabled  printk_formats  set_ftrace_pid      trace                  trace_stat       tracing_thresh
buffer_size_kb              kprobe_events             saved_cmdlines  set_graph_function  trace_clock            tracing_cpumask


ftrace is exposed through debugfs. If debugfs is not mounted, mount it as follows:
# mount -t debugfs nodev /sys/kernel/debug

Enable tracing for 10 seconds and check the trace output
# cd /sys/kernel/debug/tracing
# echo nop > current_tracer
# echo 0 > tracing_on
# echo -1 > trace

# echo function > current_tracer
# echo 1 > tracing_on
# sleep 10
# echo 0 > tracing_on

# cat trace
# tracer: nop
#
#           TASK-PID    CPU#    TIMESTAMP  FUNCTION
#              | |       |          |         |
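
The capture sequence can be collected into one function so that a trace of a fixed duration is saved to a file. This is a sketch under assumptions: `trace_for` is a hypothetical name, /tmp/ftrace.out is an illustrative default output path, and the third argument exists only so the function can be pointed at a different tracing directory. It must run as root against the real tracing directory. The sketch writes `echo 0 > file` with spaces, because the shell parses a digit immediately followed by `>` as a file-descriptor redirection and would then write nothing useful to the file.

```shell
#!/bin/sh
# Hypothetical helper: run the function tracer for a number of seconds and
# save the result.
# Usage: trace_for [seconds] [output-file] [tracing-dir]
# Defaults (assumptions for illustration): 10, /tmp/ftrace.out,
# /sys/kernel/debug/tracing
trace_for() {
    secs="${1:-10}"
    out="${2:-/tmp/ftrace.out}"
    dir="${3:-/sys/kernel/debug/tracing}"

    echo 0 > "$dir/tracing_on"              # stop any tracing in progress
    echo function > "$dir/current_tracer"   # select the function tracer
    echo > "$dir/trace"                     # writing to trace clears the buffer
    echo 1 > "$dir/tracing_on"              # start tracing
    sleep "$secs"
    echo 0 > "$dir/tracing_on"              # stop tracing
    cat "$dir/trace" > "$out"               # save the captured trace
}
```

For example, `trace_for 10 /tmp/ftrace.out` reproduces the 10-second capture and leaves the result in a file that can be searched with grep.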

* trace-cmd and KernelShark are front-end tools that make ftrace output easier to interpret.


