Inspur Yunqi Operating System (InLinux): Analysis of a bcache Crash

Preface

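This article takes a real kernel crash as its starting point and, through a hands-on case, shows in detail how to use the crash tool to analyze and debug a kernel dump captured by kdump. By reading the crash log, tracing the function call stack, locating key addresses, and checking the code logic, it lays out a systematic approach and practical techniques for analyzing kernel problems. The guide is based on InLinux2312-LTS-SP1 and aims to help readers quickly learn how to troubleshoot kernel kdump problems and handle faults more efficiently.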

Inspur Yunqi Operating System (InLinux) Version

All steps below were performed, and the problem was analyzed, on InLinux2312-LTS-SP1.

Problem Analysis Process

Symptoms:

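The test environment consists of three servers. Each server has 2 × 6.4T NVMe drives and 10 × 12T SATA disks, with bcache providing cache acceleration: each NVMe drive is split into 5 partitions, and each NVMe partition serves as the cache device for one 12T SATA disk.
To raise the storage density of each server, the 12T SATA disks were to be replaced with 16T SATA disks.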

The on-site steps were as follows:

1. Create the bcache devices.

make-bcache -C /dev/nvme2n1p1 -B /dev/sda --writeback --force --wipe-bcache

/dev/sda is a 12T SATA disk.
/dev/nvme2n1p1 is the first partition of the NVMe drive. Its size is 1024G.
The partition was created with: parted -s --align optimal /dev/nvme2n1 mkpart primary 2048s 1024GiB

There are 10 SATA disks and 2 NVMe drives. Each NVMe drive is split into 5 partitions, so 10 bcache devices are created in total.

2. Run the fio test on bcache0

cat /home/script/run-fio-randrw.sh
#!/bin/bash
# Run a 4k random read/write fio test against the given bcache device.
bcache_name=$1
if [ -z "${bcache_name}" ]; then
    echo "bcache_name is empty" >&2
    exit 1   # exit status must be in the range 0-255
fi

fio --filename="/dev/${bcache_name}" --ioengine=libaio --rw=randrw --bs=4k --size=100% --iodepth=128 --numjobs=4 --direct=1 --name=randrw --group_reporting --runtime=30 --ramp_time=5 --lockmem=1G | tee -a ./randrw-iops_k1.log

Run bash run-fio-randrw.sh bcache0 several times.
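3. Shut down

poweroff

No bcache data cleanup was performed before shutting down.

4. Replace the 12T SATA disk with a 16T SATA disk

After shutdown, pull the 12T disk and install the 16T disk in its place.

5. Resize the nvme2n1 partition to 1536G
A kernel panic was triggered as soon as the partitioning command completed.

parted -s --align optimal /dev/nvme2n1 mkpart primary 2048s 1536GiB
6. Reboot the system. It could not boot normally and kept restarting.
7. Boot into rescue mode from the installation disc and wipe the superblock information on nvme2n1p1. After the next reboot, the system came up normally.
wipefs -af /dev/nvme2n1p1
8. Repartition. The kernel panic was triggered again.
parted -s --align optimal /dev/nvme2n1 mkpart primary 2048s 1536GiB
The same procedure on the other two servers did not trigger a panic.
On the affected server, after a NULL check on the root member of struct cache_set was added, the system booted normally.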
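
The shutdown step above notes that no bcache cleanup was performed before the disk swap, which left the existing bcache superblock on nvme2n1p1 in place until wipefs removed it. As a hedged sketch, not the procedure used on site, an orderly teardown before replacing a backing disk could look like the following. It assumes the sysfs files described in the kernel's bcache documentation (detach, state, stop, unregister); the helper names run and bcache_teardown are hypothetical, and the cache set UUID is the one the crash log reports for nvme2n1p1. The script defaults to a dry run that only prints each step.

```shell
#!/usr/bin/env bash
# Sketch only: orderly bcache teardown before replacing a backing disk.
# Defaults to a dry run that prints each step; set DRY_RUN=0 to execute.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        eval "$*"
    fi
}

# Hypothetical helper: tear down one bcache device and its cache set.
#   $1 = bcache device name, $2 = cache set UUID, $3 = cache partition
bcache_teardown() {
    local bdev="$1" cset_uuid="$2" cache_part="$3"
    local dev_dir="/sys/block/${bdev}/bcache"

    # 1. Detach the backing device; dirty data is flushed first.
    run "echo 1 > ${dev_dir}/detach"
    # 2. Wait until the state file reports that no cache is attached.
    run "until grep -q 'no cache' ${dev_dir}/state; do sleep 1; done"
    # 3. Stop the bcache device, then unregister the cache set.
    run "echo 1 > ${dev_dir}/stop"
    run "echo 1 > /sys/fs/bcache/${cset_uuid}/unregister"
    # 4. Erase the stale bcache superblock on the cache partition.
    run "wipefs -af ${cache_part}"
}

bcache_teardown bcache0 f774c122-6c02-469b-b798-ca53c10efa76 /dev/nvme2n1p1
```

Keeping the dry run as the default is deliberate: every step here is destructive, so the printed commands can be reviewed before anything is written to sysfs or to the disk.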

Log Analysis

Error log
[root@storage-aqkp-002 127.0.0.1-2024-11-10-11:47:37]# cat   vmcore-dmesg.txt  |grep bcache
[   21.365228] bcache: bch_journal_replay() journal replay done, 9 keys in 5 entries, seq 987460
[   21.382581] bcache: register_cache() registered cache device nvme3n1p4
[   21.524130] bcache: bch_journal_replay() journal replay done, 9 keys in 5 entries, seq 1019863
[   21.535174] bcache: register_cache() registered cache device nvme3n1p2
[   21.698388] bcache: bch_journal_replay() journal replay done, 9 keys in 5 entries, seq 1109121
[   21.708619] bcache: register_cache() registered cache device nvme3n1p3
[   21.868881] bcache: bch_journal_replay() journal replay done, 0 keys in 1 entries, seq 1127759
[   21.879083] bcache: register_cache() registered cache device nvme3n1p5
[   22.054332] bcache: bch_journal_replay() journal replay done, 9 keys in 5 entries, seq 1102627
[   22.064518] bcache: register_cache() registered cache device nvme3n1p1
[  249.369289] bcache: register_bcache() error : device already registered
[  249.369415] bcache: register_bcache() error : device already registered
[  249.370308] bcache: register_bcache() error : device already registered
[  249.370517] bcache: register_bcache() error : device already registered
[  249.371315] bcache: register_bcache() error : device already registered
[  359.459929]  nvme2n1:
[  359.473124]  nvme2n1: p1
[  359.618056] bcache: prio_read() bad csum reading priorities
[  359.624878] bcache: bch_cache_set_error() error on f774c122-6c02-469b-b798-ca53c10efa76: IO error reading priorities, disabling caching
[  359.638311] bcache: register_cache() error nvme2n1p1: failed to run cache set
[  359.646709] bcache: register_bcache() error : failed to register device
[  359.658968] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000200
[  359.669077] Mem abort info:
[  359.672871]   ESR = 0x96000044
[  359.676929]   EC = 0x25: DABT (current EL), IL = 32 bits
[  359.683221]   SET = 0, FnV = 0
[  359.687253]   EA = 0, S1PTW = 0
[  359.691368] Data abort info:
[  359.695212]   ISV = 0, ISS = 0x00000044
[  359.700003]   CM = 0, WnR = 1
[  359.703909] user pgtable: 4k pages, 48-bit VAs, pgdp=00002040022e2000
[  359.711284] [0000000000000200] pgd=0000000000000000, p4d=0000000000000000
[  359.719262] Internal error: Oops: 0000000096000044 [#1] SMP
[  359.725760] Modules linked in: xt_set ipt_rpfilter xt_multiport iptable_raw ip_set_hash_ip ip_set_hash_net ip_set ipip tunnel4 ip_tunnel veth xt_statistic xt_nat xt_addrtype ip6table_nat ip6_tables iptable_mangle xt_physdev xt_conntrack xt_comment xt_mark iptable_filter nf_conntrack_netlink nfnetlink sch_ingress iptable_nat xt_MASQUERADE ip_tables rbd ceph libceph dns_resolver overlay openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c 8021q garp mrp bonding vfat fat dm_multipath rpcrdma sunrpc rdma_ucm ib_srpt ib_isert iscsi_target_mod target_core_mod ib_iser rdma_cm iw_cm ib_cm libiscsi scsi_transport_iscsi hns_roce_hw_v2 ib_uverbs ib_core bcache dm_mod crc64 ipmi_ssif ses enclosure aes_ce_blk aes_ce_cipher realtek acpi_ipmi hisi_sas_v3_hw hibmc_drm ghash_ce hclge sha1_ce hisi_sas_main nvme drm_vram_helper hns3 ipmi_si drm_ttm_helper nvme_core libsas hnae3 ipmi_devintf ttm host_edma_drv sg scsi_transport_sas i2c_designware_platform
[  359.725845]  nfit
[  359.730936] bcache: register_bcache() error : device already registered
[  359.815384]  ipmi_msghandler i2c_designware_core hisi_uncore_ddrc_pmu hisi_uncore_hha_pmu hisi_uncore_l3c_pmu libnvdimm hisi_uncore_pmu sch_fq_codel br_netfilter bridge stp llc fuse ext4 mbcache jbd2 sd_mod t10_pi ahci libahci sha2_ce sha256_arm64 sbsa_gwdt libata megaraid_sas(OE) aes_neon_bs aes_neon_blk crypto_simd cryptd
[  359.833119] bcache: register_bcache() error : device already registered
[  359.856792] CPU: 57 PID: 7773 Comm: kworker/57:2 Kdump: loaded Tainted: G           OE     5.10.0-202.0.0.115.ile2312sp1.aarch64 #1
[  359.856793] Hardware name: Enginetech EG920A-G20/BC82AMDDRA, BIOS 6.67 11/15/2023
[  359.856819] Workqueue: events cache_set_flush [bcache]
[  359.894922] pstate: 00400009 (nzcv daif +PAN -UAO -TCO BTYPE=--)
[  359.901919] pc : cache_set_flush+0x94/0x190 [bcache]
[  359.907876] lr : cache_set_flush+0x88/0x190 [bcache]
[  359.913815] sp : ffff800046373d50
[  359.918104] x29: ffff800046373d50 x28: 0000000000000000
[  359.924380] x27: ffff800012213c48 x26: ffffbe503baba218
[  359.930651] x25: ffff49cc48ca0808 x24: ffff49cc06674000
[  359.936916] x23: ffff49cc48ca0808 x22: ffff49cc48ca0000
[  359.943172] x21: ffff49cc48ca04a8 x20: 0000000000000000
[  359.949419] x19: 0000000000000200 x18: 0000000000000000
[  359.955662] x17: 0000000000000000 x16: ffffbe503a531760
[  359.961896] x15: 0000000000000004 x14: ffff49cc00004990
[  359.968123] x13: 0000000000000000 x12: ffff49cc3dd02a40
[  359.974342] x11: ffff49cc3dd02910 x10: ffff2a0c0040b6c2
[  359.980556] x9 : ffffbe503a591d88 x8 : ffff49cc3dd02938
[  359.986770] x7 : ffff49cc07f03a18 x6 : 0000000000000000
[  359.992977] x5 : ffff29cc59c16218 x4 : ffff49cc48ca0808
[  359.999182] x3 : 0000000000000000 x2 : ffff49cc48ca0808
[  360.004565] bcache: bch_journal_replay() journal replay done, 11 keys in 6 entries, seq 1096092
[  360.005380] x1 : ffff49cc48ca0808 x0 : 0000000000000001
[  360.016207] bcache: register_cache() registered cache device nvme2n1p3
[  360.022922] Call trace:
[  360.022934]  cache_set_flush+0x94/0x190 [bcache]
[  360.022946]  process_one_work+0x1d8/0x4e0
[  360.045082] bcache: register_bcache() error : device already registered
[  360.045966]  worker_thread+0x154/0x420
[  360.045970]  kthread+0x108/0x150
[  360.046495] bcache: register_bcache() error : device already registered
[  360.066044] bcache: register_bcache() error : device already registered
[  360.066162] bcache: register_bcache() error : device already registered
[  360.070249]  ret_from_fork+0x10/0x18
[  360.070254] Code: 940043e2 72001c1f 54000700 f90006f3 (f9010297)
[  360.090288] bcache: register_bcache() error : device already registered
[  360.091355] bcache: register_bcache() error : device already registered
[  360.097327] SMP: stopping secondary CPUs
[  360.119238] Starting crashdump kernel...
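
The "Mem abort info" lines in the log are the kernel's own decoding of the ESR value 0x96000044. As a minimal sketch, the same fields can be extracted by hand with shell arithmetic. The bit positions follow the Arm architecture's ESR_ELx layout for data aborts (EC in bits 31:26, IL in bit 25, ISS in bits 24:0, WnR in ISS bit 6, DFSC in ISS bits 5:0); the function names decode_esr and dfsc_name are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: decode an AArch64 ESR_ELx value reported for a data abort.

# Hypothetical helper: print the main ESR fields on one line.
decode_esr() {
    local esr=$(( $1 ))
    local ec=$(( (esr >> 26) & 0x3f ))     # exception class
    local il=$(( (esr >> 25) & 0x1 ))      # 1 = 32-bit instruction
    local iss=$(( esr & 0x1ffffff ))       # instruction specific syndrome
    local wnr=$(( (iss >> 6) & 0x1 ))      # 1 = write, 0 = read
    local dfsc=$(( iss & 0x3f ))           # data fault status code
    printf 'EC=0x%02x IL=%d ISS=0x%08x WnR=%d DFSC=0x%02x\n' \
        "$ec" "$il" "$iss" "$wnr" "$dfsc"
}

# Hypothetical helper: name a few common DFSC encodings.
dfsc_name() {
    local code=$(( $1 ))
    case "$code" in
        4|5|6|7)  echo "translation fault, level $(( code - 4 ))" ;;
        9|10|11)  echo "access flag fault, level $(( code - 8 ))" ;;
        13|14|15) echo "permission fault, level $(( code - 12 ))" ;;
        *)        echo "other (see the Arm Architecture Reference Manual)" ;;
    esac
}

decode_esr 0x96000044
dfsc_name 0x04
```

For this crash the decoded values agree with the log: EC 0x25 is a data abort taken at the current exception level, WnR = 1 marks the access as a write, and DFSC 0x04 is a level 0 translation fault, which matches the all-zero pgd entry printed for address 0x200.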
Log Analysis Results
Forward Code Analysis
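
The "Code:" line of the oops gives the raw instruction words around the faulting pc, with the faulting word in parentheses (f9010297). As a hedged sketch, the word can be decoded by hand. The decoder below covers only the A64 64-bit "STR (immediate, unsigned offset)" encoding, and the function name decode_str_uimm64 is hypothetical. In practice, disassembling the function in crash gives the same information with full context.

```shell
#!/usr/bin/env bash
# Sketch: decode an A64 "STR Xt, [Xn, #imm]" (64-bit, unsigned offset) word.

# Hypothetical helper: print the store as assembly, or fail for other opcodes.
decode_str_uimm64() {
    local insn=$(( $1 ))
    # Bits [31:22] must be 0b1111100100 for this encoding.
    if (( (insn >> 22) != 0x3e4 )); then
        echo "not a 64-bit STR (unsigned offset)" >&2
        return 1
    fi
    local imm12=$(( (insn >> 10) & 0xfff ))  # offset, scaled by 8 bytes
    local rn=$(( (insn >> 5) & 0x1f ))       # base register (31 would be sp)
    local rt=$(( insn & 0x1f ))              # source register (31 would be xzr)
    printf 'str x%d, [x%d, #0x%x]\n' "$rt" "$rn" $(( imm12 * 8 ))
}

decode_str_uimm64 0xf90006f3   # word just before the faulting one
decode_str_uimm64 0xf9010297   # faulting word
```

The faulting word decodes to a store of x23 to the address x20 + 0x200. The register dump shows x20 = 0, so the store targets 0x200, which is the faulting address in the log. This is consistent with a write through a NULL structure pointer held in x20, and with the observation that a NULL check on the root member of struct cache_set avoids the crash.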

From the log, the call stack of the faulting function can be worked out:

run_cache_set
register_cache_set