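Kernel Core Dump and Analysis
To analyze a kernel crash dump, you need the uncompressed kernel image (vmlinux), not the compressed vmlinuz. The steps below cover core dumps of user-space processes.
A core dump contains a snapshot of the crashed process, including its stack. Each stack frame contains:
* Function parameters, frame pointer, return address, and local variables.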
To generate a core dump:
1. Set the core file size limit to unlimited
# ulimit -c unlimited
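2. Set the core file name by writing to /proc/sys/kernel/core_pattern, and enable core_uses_pid so that the PID is appended to the name. Keep a space before the >: the shell reads "echo 1>file" as a redirection of file descriptor 1 and writes only an empty line.
# mkdir /tmp/core
# echo "/tmp/core/core" > /proc/sys/kernel/core_pattern
# echo 1 > /proc/sys/kernel/core_uses_pid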
3. Build with debugging information for gdb
# gcc -g -o core_out crash.c
The -g option produces an unstripped binary that contains debugging information.
4. Run the crashing program and check for the core file
# ./core_out
The core dump is written to /tmp/core/core.<pid>, where <pid> is the process ID of the crashed process.
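Before running the program, it can help to confirm that the settings from steps 1 and 2 took effect. The sketch below is one way to do that. The function name show_core_settings is made up for illustration; the paths are the procfs entries used above.

```shell
#!/bin/sh
# Print the settings that control core dump generation for this shell.
# show_core_settings is a hypothetical helper name, not a standard tool.
show_core_settings() {
    # Soft limit on core file size: "unlimited" or a block count; 0 disables dumps.
    echo "core limit: $(ulimit -c)"
    # These procfs entries exist on Linux; skip them quietly elsewhere.
    for f in /proc/sys/kernel/core_pattern /proc/sys/kernel/core_uses_pid; do
        if [ -r "$f" ]; then
            echo "$f: $(cat "$f")"
        fi
    done
}

show_core_settings
```

The limit set by ulimit applies only to the current shell and its children, so run the check in the same shell that will start the crashing program.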
Debug a crashing program
# cat crash.c
#include <stdio.h>
int main(void) { int *t = NULL; printf("%d\n", *t); return 0; }
# gdb core_out /tmp/core/core.65714
Reading symbols from /home/prasanta/study/core_out...done.
warning: exec file is newer than core file.
[New Thread 65714]
Missing separate debuginfo for
Try: yum --enablerepo='*-debug*' install /usr/lib/debug/.build-id/08/e42c6c3d2cd1e5d68a43b717c9eb3d310f2df0
Reading symbols from /lib64/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib64/libc.so.6
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Core was generated by `./core_out'.
Program terminated with signal 11, Segmentation fault.
#0 0x00000000004004d8 in main () at crash.c:7
7 printf("%d\n", *t);
If the program has multiple threads, print a backtrace for every thread:
(gdb) thread apply all bt
Thread 1 (Thread 65714):
#0 0x00000000004004d8 in main () at crash.c:7
Switch to a specific thread:
(gdb) thread 1
[Switching to thread 1 (Thread 65714)]#0 0x00000000004004d8 in main () at crash.c:7
7 printf("%d\n", *t);
(gdb) bt
#0 0x00000000004004d8 in main () at crash.c:7
Select a particular frame, for example frame #0 from the backtrace above:
(gdb) frame 0
#0 0x00000000004004d8 in main () at crash.c:7
7 printf("%d\n", *t);
List the source code around that stack frame:
(gdb) list +
2
3 main()
4 {
5
6 int *t=NULL;
7 printf("%d\n", *t);
8
9 }
Print a variable:
(gdb) print t
$1 = (int *) 0x0
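The interactive steps above can also be run non-interactively, which is handy when collecting backtraces from several core files. This is a minimal sketch, assuming gdb is installed. core_backtrace is a hypothetical helper name, while -batch and -ex are standard gdb options.

```shell
#!/bin/sh
# Print a backtrace of every thread in a core file without an interactive session.
# core_backtrace is a hypothetical helper; usage: core_backtrace <executable> <core-file>
core_backtrace() {
    exe=$1
    core=$2
    if [ ! -r "$exe" ] || [ ! -r "$core" ]; then
        echo "usage: core_backtrace <executable> <core-file>" >&2
        return 1
    fi
    if ! command -v gdb >/dev/null 2>&1; then
        echo "gdb not found in PATH" >&2
        return 127
    fi
    # -batch exits after running the commands; each -ex runs one gdb command.
    gdb -batch -ex "thread apply all bt" "$exe" "$core"
}
```

For the example above, the call would be: core_backtrace ./core_out /tmp/core/core.65714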
Debug a hung program
A hung program, such as one stuck in a mutex deadlock, can be debugged in a similar way. Find its process ID, then send it signal 11 (SIGSEGV) to force a core dump:
# ps aux | grep pgmname
# kill -11 pid
The rest of the flow is the same as above: find the thread causing the hang, switch to that thread, select the frame, and inspect the code and variables.
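The kill step can be wrapped in a small helper that first checks that the process exists. This is a sketch only: force_core is a hypothetical name, and it assumes core dumps are already enabled as described earlier. If the process should stay alive, attaching with gdb -p <pid> or taking a snapshot with gcore <pid> are alternatives.

```shell
#!/bin/sh
# Send SIGSEGV to a running process so the kernel writes a core dump for it.
# force_core is a hypothetical helper; usage: force_core <pid>
force_core() {
    pid=$1
    # kill -0 sends no signal; it only checks that the process exists
    # and that we are allowed to signal it.
    if ! kill -0 "$pid" 2>/dev/null; then
        echo "no such process or permission denied: $pid" >&2
        return 1
    fi
    kill -s SEGV "$pid"
}
```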
FTRACE
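Reference: Documentation/trace/ftrace.txt in the kernel source tree.
The following options are required in the kernel .config:
CONFIG_FTRACE=y
CONFIG_HAVE_DYNAMIC_FTRACE=y
CONFIG_HAVE_FUNCTION_TRACER=y
CONFIG_HAVE_FUNCTION_GRAPH_TRACER=y
CONFIG_STACKTRACE=y
The CONFIG_HAVE_* options are set automatically by architectures that support the feature. The tracers themselves are enabled with CONFIG_FUNCTION_TRACER, CONFIG_FUNCTION_GRAPH_TRACER, and CONFIG_DYNAMIC_FTRACE.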
# ls /sys/kernel/debug/tracing
README current_tracer kprobe_profile set_event stack_max_size trace_marker tracing_enabled
available_events dyn_ftrace_total_info options set_ftrace_filter stack_trace trace_options tracing_max_latency
available_filter_functions events per_cpu set_ftrace_notrace sysprof_sample_period trace_pipe tracing_on
available_tracers function_profile_enabled printk_formats set_ftrace_pid trace trace_stat tracing_thresh
buffer_size_kb kprobe_events saved_cmdlines set_graph_function trace_clock tracing_cpumask
The ftrace control files appear in debugfs under /sys/kernel/debug/tracing. If debugfs is not mounted, mount it as below:
# mount -t debugfs nodev /sys/kernel/debug
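To find out whether the mount step is needed at all, the mount table can be checked first. A minimal sketch, assuming a Linux system where /proc/mounts is available; debugfs_mounted is a hypothetical helper name.

```shell
#!/bin/sh
# Report whether a debugfs instance is mounted, by scanning a mounts table.
# debugfs_mounted is a hypothetical helper; the table defaults to /proc/mounts.
debugfs_mounted() {
    mounts=${1:-/proc/mounts}
    # Field 3 of each line in /proc/mounts is the filesystem type.
    awk '$3 == "debugfs" { found = 1 } END { exit !found }' "$mounts"
}

if debugfs_mounted; then
    echo "debugfs is mounted"
else
    echo "debugfs is not mounted; run: mount -t debugfs nodev /sys/kernel/debug"
fi
```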
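Enable function tracing for 10 seconds and check the trace output. Keep a space before each >: the shell reads "echo 1>tracing_on" as a redirection of file descriptor 1 and writes only an empty line. Writing anything to the trace file clears the buffer.
# cd /sys/kernel/debug/tracing
# echo nop > current_tracer
# echo 0 > tracing_on
# echo > trace
# echo function > current_tracer
# echo 1 > tracing_on
# sleep 10
# echo 0 > tracing_on
# cat trace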
# tracer: nop
#
# TASK-PID CPU# TIMESTAMP FUNCTION
# | | | | |
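The sequence above can be wrapped in a function so that the tracer and duration become parameters. This is a sketch rather than a standard tool: trace_for is a hypothetical name, and the optional third argument (the tracing directory) is an assumption added here so the function can be tried against a scratch directory.

```shell
#!/bin/sh
# Run one ftrace session: reset, pick a tracer, trace for a number of seconds, stop.
# trace_for is a hypothetical helper; usage: trace_for <tracer> <seconds> [tracing-dir]
trace_for() {
    tracer=$1
    seconds=$2
    dir=${3:-/sys/kernel/debug/tracing}
    if [ ! -w "$dir/current_tracer" ]; then
        echo "cannot write $dir/current_tracer (need root, or debugfs not mounted?)" >&2
        return 1
    fi
    echo nop > "$dir/current_tracer"       # reset to the no-op tracer
    echo 0 > "$dir/tracing_on"             # stop any trace in progress
    echo > "$dir/trace"                    # writing to trace clears the buffer
    echo "$tracer" > "$dir/current_tracer"
    echo 1 > "$dir/tracing_on"             # start recording
    sleep "$seconds"
    echo 0 > "$dir/tracing_on"             # stop recording; the buffer is kept
}
```

Run as root, "trace_for function 10" followed by "cat /sys/kernel/debug/tracing/trace" repeats the manual steps. The names accepted as the first argument are listed in the available_tracers file.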
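* trace-cmd and KernelShark are front-end tools that make ftrace easier to use: trace-cmd records and reports traces from the command line, and KernelShark displays them graphically.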