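Preface
Taking a real kernel crash as its starting point, this article works through a hands-on case to show how to use the crash utility to analyze and debug a kernel dump captured by kdump. It covers reading the crash log, tracing the function call stack, locating key addresses, and checking the code logic, and from these it builds a systematic approach and a set of practical techniques for analyzing kernel problems. The guide is based on InLinux2312-LTS-SP1 and is intended to help readers quickly learn how to troubleshoot kernel kdump problems and handle faults more efficiently.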
Inspur Yunqi Operating System (InLinux) version
All steps below were performed, and the problem was analyzed, on InLinux2312-LTS-SP1.
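Problem analysis
Symptoms:
The test environment consists of three servers, each with 2 × 6.4T NVMe drives and 10 × 12T SATA disks. bcache provides cache acceleration: each NVMe drive is split into 5 partitions, and each NVMe partition serves as the cache device for one 12T SATA disk.
To increase per-server storage density, the 12T SATA disks were to be replaced with 16T SATA disks.
The on-site procedure was as follows:
1. Create the bcache devices.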
make-bcache -C /dev/nvme2n1p1 -B /dev/sda --writeback --force --wipe-bcache
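/dev/sda is a 12T SATA disk.
/dev/nvme2n1p1 is the first partition of the NVMe drive; its size is 1024G.
The partition was created with: parted -s --align optimal /dev/nvme2n1 mkpart primary 2048s 1024GiB
With 10 SATA disks and 2 NVMe drives, each split into 5 partitions, 10 bcache devices were created in total.
2. Run the fio test on bcache0.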
cat /home/script/run-fio-randrw.sh
# $1: name of the bcache device to test, e.g. bcache0
bcache_name=$1
# Abort when no device name is given.
if [ -z "${bcache_name}" ]; then
    echo "bcache_name is empty"
    exit 1
fi
fio --filename=/dev/${bcache_name} --ioengine=libaio --rw=randrw --bs=4k --size=100% --iodepth=128 --numjobs=4 --direct=1 --name=randrw --group_reporting --runtime=30 --ramp_time=5 --lockmem=1G | tee -a ./randrw-iops_k1.log
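Run bash run-fio-randrw.sh bcache0 several times.
3. Power off.
poweroff
No bcache data cleanup was performed before shutdown.
4. Replace the 12T SATA disks with 16T SATA disks.
After power-off, the 12T disks were pulled and 16T disks were installed in their place.
5. Change the nvme2n1 partition size to 1536G.
The kernel panicked as soon as the partition command completed.
parted -s --align optimal /dev/nvme2n1 mkpart primary 2048s 1536GiB
6. Reboot. The system could not boot normally and kept rebooting in a loop.
7. Boot into rescue mode from optical media and wipe the superblock on nvme2n1p1. After another reboot, the system came up normally.
wipefs -af /dev/nvme2n1p1
8. Repartition; the kernel panicked again.
parted -s --align optimal /dev/nvme2n1 mkpart primary 2048s 1536GiB
The same procedure on the other two servers did not trigger a panic.
On the affected server, after a NULL check was added for the root member of struct cache_set, the system booted normally.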
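For reference, the ten make-bcache invocations implied by step 1 can be sketched as a dry run that only prints the commands. Only /dev/nvme2n1p1 and /dev/sda are named in the steps above, and nvme3n1 appears in the kernel log; the SATA names sdb through sdj and the pairing order are assumptions made for illustration, not the layout actually used on site.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print (never execute) one make-bcache command per
# cache-partition / backing-disk pair.
# Assumptions: the SATA device names and the pairing order are hypothetical.

gen_make_bcache_cmds() {
    local nvmes=(nvme2n1 nvme3n1)                           # NVMe drives seen in the log
    local disks=(sda sdb sdc sdd sde sdf sdg sdh sdi sdj)   # assumed SATA disk names
    local parts_per_nvme=5
    local i nvme part
    for i in "${!disks[@]}"; do
        nvme=${nvmes[i / parts_per_nvme]}        # first 5 disks -> nvme2n1, rest -> nvme3n1
        part=$(( i % parts_per_nvme + 1 ))       # partition number 1..5
        echo "make-bcache -C /dev/${nvme}p${part} -B /dev/${disks[i]} --writeback --force --wipe-bcache"
    done
}

gen_make_bcache_cmds
```

Printing the commands first makes it easy to review the device pairing before anything destructive is run, since --force and --wipe-bcache overwrite existing superblocks.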
Log analysis
Error log
[root@storage-aqkp-002 127.0.0.1-2024-11-10-11:47:37]# cat vmcore-dmesg.txt |grep bcache
[ 21.365228] bcache: bch_journal_replay() journal replay done, 9 keys in 5 entries, seq 987460
[ 21.382581] bcache: register_cache() registered cache device nvme3n1p4
[ 21.524130] bcache: bch_journal_replay() journal replay done, 9 keys in 5 entries, seq 1019863
[ 21.535174] bcache: register_cache() registered cache device nvme3n1p2
[ 21.698388] bcache: bch_journal_replay() journal replay done, 9 keys in 5 entries, seq 1109121
[ 21.708619] bcache: register_cache() registered cache device nvme3n1p3
[ 21.868881] bcache: bch_journal_replay() journal replay done, 0 keys in 1 entries, seq 1127759
[ 21.879083] bcache: register_cache() registered cache device nvme3n1p5
[ 22.054332] bcache: bch_journal_replay() journal replay done, 9 keys in 5 entries, seq 1102627
[ 22.064518] bcache: register_cache() registered cache device nvme3n1p1
[ 249.369289] bcache: register_bcache() error : device already registered
[ 249.369415] bcache: register_bcache() error : device already registered
[ 249.370308] bcache: register_bcache() error : device already registered
[ 249.370517] bcache: register_bcache() error : device already registered
[ 249.371315] bcache: register_bcache() error : device already registered
[ 359.459929] nvme2n1:
[ 359.473124] nvme2n1: p1
[ 359.618056] bcache: prio_read() bad csum reading priorities
[ 359.624878] bcache: bch_cache_set_error() error on f774c122-6c02-469b-b798-ca53c10efa76: IO error reading priorities, disabling caching
[ 359.638311] bcache: register_cache() error nvme2n1p1: failed to run cache set
[ 359.646709] bcache: register_bcache() error : failed to register device
[ 359.658968] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000200
[ 359.669077] Mem abort info:
[ 359.672871] ESR = 0x96000044
[ 359.676929] EC = 0x25: DABT (current EL), IL = 32 bits
[ 359.683221] SET = 0, FnV = 0
[ 359.687253] EA = 0, S1PTW = 0
[ 359.691368] Data abort info:
[ 359.695212] ISV = 0, ISS = 0x00000044
[ 359.700003] CM = 0, WnR = 1
[ 359.703909] user pgtable: 4k pages, 48-bit VAs, pgdp=00002040022e2000
[ 359.711284] [0000000000000200] pgd=0000000000000000, p4d=0000000000000000
[ 359.719262] Internal error: Oops: 0000000096000044 [#1] SMP
[ 359.725760] Modules linked in: xt_set ipt_rpfilter xt_multiport iptable_raw ip_set_hash_ip ip_set_hash_net ip_set ipip tunnel4 ip_tunnel veth xt_statistic xt_nat xt_addrtype ip6table_nat ip6_tables iptable_mangle xt_physdev xt_conntrack xt_comment xt_mark iptable_filter nf_conntrack_netlink nfnetlink sch_ingress iptable_nat xt_MASQUERADE ip_tables rbd ceph libceph dns_resolver overlay openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c 8021q garp mrp bonding vfat fat dm_multipath rpcrdma sunrpc rdma_ucm ib_srpt ib_isert iscsi_target_mod target_core_mod ib_iser rdma_cm iw_cm ib_cm libiscsi scsi_transport_iscsi hns_roce_hw_v2 ib_uverbs ib_core bcache dm_mod crc64 ipmi_ssif ses enclosure aes_ce_blk aes_ce_cipher realtek acpi_ipmi hisi_sas_v3_hw hibmc_drm ghash_ce hclge sha1_ce hisi_sas_main nvme drm_vram_helper hns3 ipmi_si drm_ttm_helper nvme_core libsas hnae3 ipmi_devintf ttm host_edma_drv sg scsi_transport_sas i2c_designware_platform
[ 359.725845] nfit
[ 359.730936] bcache: register_bcache() error : device already registered
[ 359.815384] ipmi_msghandler i2c_designware_core hisi_uncore_ddrc_pmu hisi_uncore_hha_pmu hisi_uncore_l3c_pmu libnvdimm hisi_uncore_pmu sch_fq_codel br_netfilter bridge stp llc fuse ext4 mbcache jbd2 sd_mod t10_pi ahci libahci sha2_ce sha256_arm64 sbsa_gwdt libata megaraid_sas(OE) aes_neon_bs aes_neon_blk crypto_simd cryptd
[ 359.833119] bcache: register_bcache() error : device already registered
[ 359.856792] CPU: 57 PID: 7773 Comm: kworker/57:2 Kdump: loaded Tainted: G OE 5.10.0-202.0.0.115.ile2312sp1.aarch64 #1
[ 359.856793] Hardware name: Enginetech EG920A-G20/BC82AMDDRA, BIOS 6.67 11/15/2023
[ 359.856819] Workqueue: events cache_set_flush [bcache]
[ 359.894922] pstate: 00400009 (nzcv daif +PAN -UAO -TCO BTYPE=--)
[ 359.901919] pc : cache_set_flush+0x94/0x190 [bcache]
[ 359.907876] lr : cache_set_flush+0x88/0x190 [bcache]
[ 359.913815] sp : ffff800046373d50
[ 359.918104] x29: ffff800046373d50 x28: 0000000000000000
[ 359.924380] x27: ffff800012213c48 x26: ffffbe503baba218
[ 359.930651] x25: ffff49cc48ca0808 x24: ffff49cc06674000
[ 359.936916] x23: ffff49cc48ca0808 x22: ffff49cc48ca0000
[ 359.943172] x21: ffff49cc48ca04a8 x20: 0000000000000000
[ 359.949419] x19: 0000000000000200 x18: 0000000000000000
[ 359.955662] x17: 0000000000000000 x16: ffffbe503a531760
[ 359.961896] x15: 0000000000000004 x14: ffff49cc00004990
[ 359.968123] x13: 0000000000000000 x12: ffff49cc3dd02a40
[ 359.974342] x11: ffff49cc3dd02910 x10: ffff2a0c0040b6c2
[ 359.980556] x9 : ffffbe503a591d88 x8 : ffff49cc3dd02938
[ 359.986770] x7 : ffff49cc07f03a18 x6 : 0000000000000000
[ 359.992977] x5 : ffff29cc59c16218 x4 : ffff49cc48ca0808
[ 359.999182] x3 : 0000000000000000 x2 : ffff49cc48ca0808
[ 360.004565] bcache: bch_journal_replay() journal replay done, 11 keys in 6 entries, seq 1096092
[ 360.005380] x1 : ffff49cc48ca0808 x0 : 0000000000000001
[ 360.016207] bcache: register_cache() registered cache device nvme2n1p3
[ 360.022922] Call trace:
[ 360.022934] cache_set_flush+0x94/0x190 [bcache]
[ 360.022946] process_one_work+0x1d8/0x4e0
[ 360.045082] bcache: register_bcache() error : device already registered
[ 360.045966] worker_thread+0x154/0x420
[ 360.045970] kthread+0x108/0x150
[ 360.046495] bcache: register_bcache() error : device already registered
[ 360.066044] bcache: register_bcache() error : device already registered
[ 360.066162] bcache: register_bcache() error : device already registered
[ 360.070249] ret_from_fork+0x10/0x18
[ 360.070254] Code: 940043e2 72001c1f 54000700 f90006f3 (f9010297)
[ 360.090288] bcache: register_bcache() error : device already registered
[ 360.091355] bcache: register_bcache() error : device already registered
[ 360.097327] SMP: stopping secondary CPUs
[ 360.119238] Starting crashdump kernel...
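Because several CPUs were printing at the same time, bcache registration messages are interleaved with the oops output above. A minimal sketch for reading the call trace contiguously is shown below; the function name extract_call_trace is hypothetical, and the filter assumes that interleaved messages always carry the "bcache: " prefix, as they do in this log.

```shell
#!/usr/bin/env bash
# Sketch: print the lines from "Call trace:" through the "Code:" line,
# dropping interleaved messages that start with the "bcache: " prefix.
# Stack frames such as "cache_set_flush+0x94/0x190 [bcache]" are kept,
# because they contain "[bcache]" but not "bcache: ".

extract_call_trace() {
    awk '
        /Call trace:/          { on = 1 }   # start of the section
        on && !/bcache: /      { print }    # keep everything except bcache messages
        /Code: /               { on = 0 }   # the Code: line ends the section
    ' "$1"
}

# Run only when a log file is passed on the command line.
if [ "$#" -ge 1 ]; then
    extract_call_trace "$1"
fi
```

Saved as a script, it would be called with vmcore-dmesg.txt as its only argument.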
Log analysis results
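As a cross-check of the oops, the ESR value and the faulting instruction word can be decoded by hand. The sketch below is illustrative only: the helper names are hypothetical, the field layout follows the AArch64 ESR_EL1 format for data aborts, and the second helper handles only the 64-bit STR (unsigned offset) encoding.

```shell
#!/usr/bin/env bash
# Sketch: decode two values taken from the oops log.
# Helper names are hypothetical; they are not part of crash or the kernel.

decode_esr() {
    local esr=$(( $1 ))
    local ec=$((  (esr >> 26) & 0x3f ))   # exception class, bits [31:26]
    local iss=$(( esr & 0x1ffffff ))      # instruction-specific syndrome, bits [24:0]
    local wnr=$(( (iss >> 6) & 1 ))       # 1 = the abort was caused by a write
    local dfsc=$(( iss & 0x3f ))          # data fault status code, bits [5:0]
    printf 'EC=0x%02x ISS=0x%08x WnR=%d DFSC=0x%02x\n' "$ec" "$iss" "$wnr" "$dfsc"
}

decode_str64_uimm() {
    local insn=$(( $1 ))
    # Only "STR Xt, [Xn, #imm]" (64-bit, unsigned offset) is handled here.
    if (( (insn & 0xffc00000) != 0xf9000000 )); then
        echo "not a 64-bit STR (unsigned offset)"
        return 1
    fi
    local off=$(( ((insn >> 10) & 0xfff) * 8 ))   # imm12 is scaled by 8 bytes
    local rn=$((  (insn >> 5) & 0x1f ))           # base register (31 would mean sp)
    local rt=$((  insn & 0x1f ))                  # source register (31 would mean xzr)
    printf 'str x%d, [x%d, #0x%x]\n' "$rt" "$rn" "$off"
}

decode_esr 0x96000044          # ESR from the "Mem abort info" section
decode_str64_uimm 0xf9010297   # instruction word in parentheses on the Code: line
```

The first call prints EC=0x25 ISS=0x00000044 WnR=1 DFSC=0x04: a data abort at the current exception level, caused by a write, with a level 0 translation fault, which agrees with pgd=0000000000000000 in the log. The second prints str x23, [x20, #0x200]; with x20 equal to 0 in the register dump, the store targets address 0x200, the faulting virtual address. That pattern fits a write through a NULL structure pointer at member offset 0x200.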
Forward code analysis
The call stack of the faulting function can be derived from the log.