Enabling Kdump to Capture Kernel Crash Information
Date: 2024-10-15 14:43  Source: linux.it.net.cn  Author: IT
First, install the required packages:
apt-get -y install aptitude kdump-tools crash kexec-tools makedumpfile linux-image-`uname -r`-dbg
aptitude full-upgrade # Keeps the running kernel and the debug package at the same version; a mismatch makes the dump impossible to debug
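The version-matching concern can be checked before going any further. The following is a minimal sketch, assuming a Debian-style system where dpkg-query is available; the function names dbg_package_for and dbg_package_installed are made up for this illustration and are not part of kdump-tools.

```shell
# Hypothetical helpers (not part of kdump-tools).

# Print the debug-symbol package name for a kernel release
# (defaults to the running kernel).
dbg_package_for() {
    printf 'linux-image-%s-dbg\n' "${1:-$(uname -r)}"
}

# Succeed only if dpkg reports that package as fully installed.
dbg_package_installed() {
    dpkg-query -W -f='${Status}' "$(dbg_package_for "${1:-}")" 2>/dev/null |
        grep -q 'install ok installed'
}
```

Calling dbg_package_installed with no argument checks the package for the running kernel; a non-zero exit status suggests the debug symbols still need to be installed or upgraded.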
Key settings in the Kdump configuration file /etc/default/kdump-tools:
USE_KDUMP=1
KDUMP_SYSCTL="kernel.panic_on_oops=1"
KDUMP_KERNEL=/boot/vmlinuz-3.16.0-4-amd64
KDUMP_INITRD=/boot/initrd.img-3.16.0-4-amd64
KDUMP_COREDIR="/data/crash"
KDUMP_FAIL_CMD="reboot -f"
DEBUG_KERNEL=/usr/lib/debug/vmlinux-3.16.0-4-amd64
MAKEDUMP_ARGS="-c -d 31"
KDUMP_CMDLINE="crashkernel=512M"
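Because the kernel, initrd, and debug kernel paths above are tied to one kernel version, they can go stale after an upgrade. A minimal sketch of a sanity check follows, assuming the configuration file is plain shell variable assignments as shown; the helper name check_kdump_paths is hypothetical.

```shell
# Hypothetical helper: report whether the files named in a kdump-tools
# style config exist. The function body runs in a subshell, so the
# variables sourced from the config do not leak into the calling shell.
check_kdump_paths() (
    conf=${1:-/etc/default/kdump-tools}
    [ -r "$conf" ] || { echo "cannot read $conf" >&2; exit 2; }
    . "$conf"
    status=0
    for path in "${KDUMP_KERNEL:-}" "${KDUMP_INITRD:-}" "${DEBUG_KERNEL:-}"; do
        [ -n "$path" ] || continue      # entries that are not set are skipped
        if [ -e "$path" ]; then
            echo "ok:      $path"
        else
            echo "missing: $path"
            status=1
        fi
    done
    exit "$status"
)
```

Running check_kdump_paths with no argument reads /etc/default/kdump-tools; a non-zero exit status means at least one referenced file was not found.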
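Choosing the crashkernel size: with a correct value the machine reboots automatically about one minute after a crash; with a wrong value the reboot hangs on a black screen.
RAM size        crashkernel=
0 - 12G         128M
13 - 48G        256M
49 - 128G       512M
129 - 256G      1G *(896M, 768M or 512M)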
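The table can be turned into a small lookup. This is a sketch under stated assumptions: the function names are hypothetical, MemTotal is rounded up to whole GiB (an assumption, since the kernel reports slightly less than the installed RAM), and sizes above 256G are rejected because the table gives no value for them.

```shell
# Round MemTotal (reported in kB) up to whole GiB.
# The optional argument overrides /proc/meminfo.
mem_total_gib() {
    awk '/^MemTotal:/ { print int(($2 + 1048575) / 1048576) }' "${1:-/proc/meminfo}"
}

# Map RAM size in GiB to the crashkernel value from the table.
recommend_crashkernel() {
    gib=$1
    if   [ "$gib" -le 12 ];  then echo 128M
    elif [ "$gib" -le 48 ];  then echo 256M
    elif [ "$gib" -le 128 ]; then echo 512M
    elif [ "$gib" -le 256 ]; then echo 1G
    else
        echo "the table gives no value above 256G" >&2
        return 1
    fi
}
```

For example, recommend_crashkernel "$(mem_total_gib)" prints the suggested value for the current machine.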
Key setting in the GRUB configuration file /etc/default/grub:
GRUB_CMDLINE_LINUX_DEFAULT="nmi_watchdog=1 crashkernel=512M"
Key settings in the sysctl configuration file /etc/sysctl.conf:
kernel.sysrq = 1
kernel.watchdog = 1
kernel.nmi_watchdog = 1
kernel.panic_on_oops = 1
kernel.softlockup_panic = 1
kernel.watchdog_thresh = 10
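After a reboot (or after sysctl -p, which loads /etc/sysctl.conf), the live values can be compared with the ones above. A minimal sketch, assuming the keys are exposed under /proc/sys in the usual way; the helper name sysctl_is is hypothetical.

```shell
# Hypothetical helper: compare one sysctl key with an expected value by
# reading it from procfs. The optional third argument overrides the
# /proc/sys root directory.
sysctl_is() {
    key=$1 want=$2 root=${3:-/proc/sys}
    file=$root/$(printf '%s' "$key" | tr . /)
    [ -r "$file" ] || { echo "$key: not available" >&2; return 2; }
    have=$(cat "$file")
    if [ "$have" = "$want" ]; then
        echo "$key = $have"
    else
        echo "$key = $have (expected $want)"
        return 1
    fi
}
```

For example, sysctl_is kernel.panic_on_oops 1 and sysctl_is kernel.watchdog_thresh 10 check two of the settings listed above.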
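Regenerate the GRUB configuration and reboot to apply the settings:
update-grub
reboot -f
After the machine comes back up, load the crash kernel and check its status:
kdump-config load
kdump-config show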
Verification
# cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-3.16.0-4-amd64 root=UUID=a58ab901-00aa-4f8b-b3eb-d352fc72233 ro net.ifnames=0 thash_entries=1048576 rhash_entries=1048576 biosdevname=0 nohz=off enforcing=0 ipv6.disable_ipv6=1 nmi_watchdog=1 selinux=0 transparent_hugepage=never cgroup_enable=memory swapaccount=1 vga=771 crashkernel=512M
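Picking one parameter out of a long command line by eye is error-prone. The following sketch extracts a key=value boot parameter; the helper name cmdline_value is hypothetical, and the command line is assumed to contain no quoted values with spaces.

```shell
# Hypothetical helper: print the value of a key=value boot parameter.
# The optional second argument is the command line to search; by
# default the running kernel's /proc/cmdline is used.
# The body runs in a subshell so that "set -f" (no glob expansion
# while splitting the line into words) stays local.
cmdline_value() (
    set -f
    name=$1
    line=${2:-$(cat /proc/cmdline)}
    for word in $line; do
        case $word in
            "$name"=*) printf '%s\n' "${word#*=}"; exit 0 ;;
        esac
    done
    exit 1
)
```

For example, cmdline_value crashkernel should print 512M on the system shown above, and a non-zero exit status means the parameter is not on the command line at all.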
# kdump-config test
USE_KDUMP: 1
KDUMP_SYSCTL: kernel.panic_on_oops=1
KDUMP_COREDIR: /data/crash
crashkernel addr: 0x2e000000
kdump kernel addr:
kdump kernel:
/boot/vmlinuz-3.16.0-4-amd64
kdump initrd:
/boot/initrd.img-3.16.0-4-amd64
debug kernel:
/usr/lib/debug/vmlinux-3.16.0-4-amd64
kexec command to be used:
/sbin/kexec -p --command-line="BOOT_IMAGE=/boot/vmlinuz-3.16.0-4-amd64 root=UUID=a58ab901-00aa-4f8b-b3eb-d352fc7f6acb ro net.ifnames=0 thash_entries=1048576 rhash_entries=1048576 biosdevname=0 nohz=off enforcing=0 ipv6.disable_ipv6=1 nmi_watchdog=1 selinux=0 transparent_hugepage=never cgroup_enable=memory swapaccount=1 vga=771 irqpoll maxcpus=1 nousb systemd.unit=kdump-tools.service crashkernel=512M" --initrd=/boot/initrd.img-3.16.0-4-amd64 /boot/vmlinuz-3.16.0-4-amd64
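Besides the kdump-config output, the kernel itself exposes the crash kernel state in sysfs. A minimal sketch, with hypothetical helper names, that reads /sys/kernel/kexec_crash_loaded (1 when a crash kernel is loaded) and /sys/kernel/kexec_crash_size (reserved memory in bytes):

```shell
# Hypothetical helper: succeed if a crash kernel is currently loaded.
# The optional argument overrides the sysfs path.
crash_kernel_loaded() {
    [ "$(cat "${1:-/sys/kernel/kexec_crash_loaded}" 2>/dev/null)" = 1 ]
}

# Hypothetical helper: print the reserved crashkernel memory in MiB.
# The optional argument overrides the sysfs path.
crash_reserved_mib() {
    bytes=$(cat "${1:-/sys/kernel/kexec_crash_size}") || return 1
    echo $((bytes / 1048576))
}
```

With crashkernel=512M in effect, crash_reserved_mib would be expected to print 512; a value of 0 suggests the reservation failed at boot.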
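Test (this crashes the running system immediately, so run it only on a machine you can afford to reboot):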
echo c > /proc/sysrq-trigger
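Since the command above panics the machine unconditionally, it may be worth guarding it. The wrapper below is a sketch, not part of kdump-tools: the name trigger_test_crash is hypothetical, and it refuses to write to the trigger unless a crash kernel is reported as loaded.

```shell
# Hypothetical wrapper: trigger the test crash only when a crash kernel
# is loaded, so a misconfigured system is not left hung.
# Both paths can be overridden, which also allows a dry run against
# ordinary files.
trigger_test_crash() {
    loaded=${1:-/sys/kernel/kexec_crash_loaded}
    trigger=${2:-/proc/sysrq-trigger}
    if [ "$(cat "$loaded" 2>/dev/null)" != 1 ]; then
        echo "no crash kernel loaded; run kdump-config load first" >&2
        return 1
    fi
    sync                      # flush dirty pages before the panic
    echo c > "$trigger"
}
```

Called without arguments as root, the function behaves like the plain echo once the check passes.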
Crash demonstration
Analysis
crash /data/crash/201609010252/dump.201609012233 /usr/lib/debug/lib/modules/3.16.0-4-amd64/vmlinux
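Typing the timestamped path by hand gets tedious after a few crashes. The sketch below finds the most recent dump, assuming the layout shown above (one timestamp-named directory per crash under the dump directory, each holding a dump.* file); the helper name latest_dump is hypothetical.

```shell
# Hypothetical helper: print the path of the newest dump file under the
# dump directory (default /data/crash, as configured in KDUMP_COREDIR).
latest_dump() {
    coredir=${1:-/data/crash}
    latest=
    # Globs expand in sorted order, so with timestamp-named directories
    # the last match is the most recent crash.
    for f in "$coredir"/*/dump.*; do
        [ -f "$f" ] || continue
        latest=$f
    done
    [ -n "$latest" ] && printf '%s\n' "$latest"
}
```

It can then be combined with the command above, for example: crash "$(latest_dump)" /usr/lib/debug/lib/modules/3.16.0-4-amd64/vmlinux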
Now the fun part begins: debugging.
Type help at the crash prompt to see the available commands.
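Common errors:
WARNING: kernel version inconsistency between vmlinux and dumpfile # The debug vmlinux does not match the kernel that crashed; run aptitude full-upgrade so both are at the same version.
The system does not reboot after a crash. # Check the sysctl, crashkernel, and nmi_watchdog settings. (Editor: IT)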