Enable kernel crash dumps

How to enable CentOS/RHEL 5 crash dump

Ever seen Linux on a VM hit a kernel Oops or panic and leave nothing in the system logs?

If you haven't, you're lucky. Either way, be advised: enable crash dumps if you really need to find out what happened.

yum install --enablerepo=debug kexec-tools crash kernel-debuginfo


(or yum install --enablerepo=rhel-debuginfo kexec-tools crash kernel-debuginfo if using RHEL5)


Tune /etc/kdump.conf. To compress the dump file, add the -c parameter to makedumpfile; to leave zero pages and free pages out of the dump, add -d 17:

core_collector makedumpfile -d 17 -c
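The value 17 is not arbitrary: makedumpfile treats the dump level as a sum of bits, each of which excludes one type of page. A minimal sketch of the arithmetic follows; the variable names are illustrative, and the bit values are the ones listed in the makedumpfile manual page.

```shell
# makedumpfile dump-level bits (each bit excludes one page type from the dump)
ZERO_PAGES=1      # pages filled with zeros
CACHE_PAGES=2     # page cache without private pages
CACHE_PRIVATE=4   # page cache with private pages
USER_DATA=8       # user process data
FREE_PAGES=16     # free pages

# Level used in this article: drop zero pages and free pages only
dump_level=$((ZERO_PAGES | FREE_PAGES))
echo "core_collector makedumpfile -d $dump_level -c"
```

Adding more bits makes the dump smaller but removes data you may later want to inspect.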

Update /boot/grub/grub.conf with the memory to be reserved for the crash kernel:

grubby --update-kernel=ALL --args="crashkernel=128M@16M"

Enable the kdump service: chkconfig kdump on

Reboot: shutdown -r now
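After the reboot, one way to confirm that the crashkernel= argument actually reached the running kernel is to look at /proc/cmdline. A minimal sketch; the function name crashkernel_arg is illustrative only.

```shell
# Print the value of the crashkernel= argument found in a kernel command line.
# Argument: the command line as a single string.
crashkernel_arg() {
    for word in $1; do
        case $word in
            crashkernel=*) echo "${word#crashkernel=}" ;;
        esac
    done
}

# On the rebooted machine this should print 128M@16M (nothing if the argument is missing)
crashkernel_arg "$(cat /proc/cmdline 2>/dev/null)"
```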

In case of a system crash, kexec boots the capture kernel without clearing the crashed kernel's memory and passes control to it. Kdump, in turn, captures the dump and puts it into a subdirectory of /var/crash named with the date and time the dump was created.

Verify that memory has been reserved for the crash kernel:

cat /proc/iomem | grep Crash

01000000-08ffffff : Crash kernel
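The hexadecimal range can be checked against the crashkernel=128M@16M setting with a little arithmetic. The sketch below parses one such line; the function name crash_region_mib is an assumption made for this example.

```shell
# Print the start offset and size (in MiB) of a "Crash kernel" line from /proc/iomem.
# Argument: one line such as "01000000-08ffffff : Crash kernel"
crash_region_mib() {
    set -- $1                      # split on whitespace; $1 becomes the address range
    start=$((0x${1%-*}))           # part before the dash
    end=$((0x${1#*-}))             # part after the dash
    echo "offset=$((start / 1048576))M size=$(((end - start + 1) / 1048576))M"
}

crash_region_mib "01000000-08ffffff : Crash kernel"
```

For the line shown above this prints offset=16M size=128M, which matches the 128M@16M reservation.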

Now it is time to panic your kernel:

- first enable SysRq: echo "1" > /proc/sys/kernel/sysrq

The magic SysRq key is a key combination understood by the Linux kernel that lets the user perform various low-level commands regardless of the system's state. It is often used to recover from freezes or to reboot a computer without corrupting the filesystem. If your Linux system has frozen and is unresponsive, and you have console or serial-type access, you can use the key combination Alt+SysRq+c (Alt+PrintScreen+c) to make kexec boot the capture kernel and write a crash dump.

- then trigger the kernel crash dump: echo "c" > /proc/sysrq-trigger

Kexec boots the crash kernel and creates the core dump in the default location, /var/crash; after that the OS reboots. The next thing you can do is analyze the core dump:
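Since each dump lands in its own date-and-time subdirectory, it can help to locate the most recent vmcore before starting the analysis. A minimal sketch, assuming the subdirectory names sort chronologically (as with 2012-04-23-00:07); the function name latest_vmcore is illustrative.

```shell
# Print the path of the most recent vmcore under a crash directory.
# Relies on the shell expanding the glob in sorted order, so the last match is the newest.
latest_vmcore() {
    crash_dir=${1:-/var/crash}
    newest=
    for core in "$crash_dir"/*/vmcore; do
        if [ -f "$core" ]; then newest=$core; fi
    done
    [ -n "$newest" ] && echo "$newest"
}

latest_vmcore /var/crash || echo "no vmcore found under /var/crash"
```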

[root@localhost 2012-04-23-00:07]# crash vmcore /usr/lib/debug/lib/modules/2.6.18-308.1.1.el5.centos.plus/vmlinux

crash 5.1.8-1.el5.centos
Copyright (C) 2002-2011 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.

GNU gdb (GDB) 7.0
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...

WARNING: kernel version inconsistency between vmlinux and dumpfile

KERNEL: /usr/lib/debug/lib/modules/2.6.18-308.1.1.el5.centos.plus/vmlinux
DUMPFILE: vmcore
CPUS: 4
DATE: Mon Apr 23 00:07:30 2012
UPTIME: 00:00:00
LOAD AVERAGE: 0.06, 0.10, 0.04
TASKS: 82
NODENAME: localhost.localdomain
RELEASE: 2.6.18-308.el5
VERSION: #1 SMP Tue Feb 21 20:06:06 EST 2012
MACHINE: x86_64 (3270 Mhz)
MEMORY: 3.9 GB
PANIC: "SysRq : Trigger a crashdump"
PID: 3200
COMMAND: "bash"
TASK: ffff81011daf60c0 [THREAD_INFO: ffff810111b52000]
CPU: 2
STATE: TASK_RUNNING (SYSRQ)
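The warning in this session appears because the vmlinux file belongs to 2.6.18-308.1.1.el5.centos.plus while the dump reports RELEASE 2.6.18-308.el5. A small sketch of building the debug vmlinux path from the release string follows; it assumes kernel-debuginfo uses the /usr/lib/debug/lib/modules/<release>/vmlinux layout seen in the command above, and the function name debug_vmlinux_path is hypothetical.

```shell
# Build the path of the debuginfo vmlinux for a given kernel release string.
debug_vmlinux_path() {
    echo "/usr/lib/debug/lib/modules/$1/vmlinux"
}

# Release taken from the RELEASE field of the dump shown above
debug_vmlinux_path "2.6.18-308.el5"
```

If the machine rebooted into the same kernel that crashed, "$(uname -r)" can be passed as the release; the matching kernel-debuginfo package still has to be installed for the file to exist.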

The interactive crash shell provides some useful utilities, such as:

- log - displays the kernel message buffer (what you would normally see with dmesg).
- bt - displays the kernel stack trace. Use bt pid to display the backtrace of the selected process.
- ps - displays the status of processes in the system. Use ps pid to display the status of the selected process.
- vm - displays basic virtual memory information. Use vm pid to display information on the selected process.
- files - displays information about open files. Use files pid to display files opened by the selected process.
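The same utilities can also be run without typing them at the prompt, by putting them in a command file. The sketch below only prepares such a file; it assumes your crash version accepts the -i option (read commands from a file), and the function name write_crash_cmds and the output file report.txt are made up for this example.

```shell
# Write a crash command file that runs the utilities listed above, then exits.
write_crash_cmds() {
    printf '%s\n' log bt ps vm files exit > "$1"
}

cmds=$(mktemp)
write_crash_cmds "$cmds"
cat "$cmds"

# Then, from the dump directory (vmlinux path as in the session above):
#   crash -i "$cmds" vmcore /usr/lib/debug/lib/modules/<release>/vmlinux > report.txt
```

Ending the file with exit keeps crash from dropping into the interactive prompt once the commands have run.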

More details can be found at:

http://en.wikipedia.org/wiki/Magic_SysRq_key

Suspending a virtual machine on ESX/ESXi to collect diagnostic information: http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&docType=kc&docTypeID=DT_KB_1_1&externalId=2005831

RHEL 5 Deployment Guide, Chapter 44. The kdump Crash Recovery Service: http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5/html/Deployment_Guide/ch-kdump.html