Development environment:
Processor: AM335x
SDK: 06_03_00_106
I. Problem log
[ 31.256321] UBIFS error (ubi0:0 pid 684): ubifs_iget: failed to read inode 11731, error -2
[ 31.265311] UBIFS error (ubi0:0 pid 684): ubifs_lookup: dead directory entry 'core.test.0.d8ab38090e5b49faa8f42f345372ca25.610.1587262394000000.xz', error -2
[ 31.281906] UBIFS warning (ubi0:0 pid 684): ubifs_ro_mode.part.0: switched to read-only mode, error -2
[ 31.292427] CPU: 0 PID: 684 Comm: systemd-coredum Not tainted 4.19.94-00002-gce0d638862-dirty #510
[ 31.302503] Hardware name: Generic AM33XX (Flattened Device Tree)
[ 31.309567] Backtrace:
[ 31.312218] [<c010cb64>] (dump_backtrace) from [<c010ced4>] (show_stack+0x18/0x1c)
[ 31.321217] r7:ceab6528 r6:cec80330 r5:c0d03048 r4:00000040
[ 31.327735] [<c010cebc>] (show_stack) from [<c086ef4c>] (dump_stack+0x24/0x28)
[ 31.335352] [<c086ef28>] (dump_stack) from [<c039fd0c>] (ubifs_ro_mode.part.0+0x48/0x4c)
[ 31.344856] [<c039fcc4>] (ubifs_ro_mode.part.0) from [<c039fe0c>] (ubifs_ro_mode+0x1c/0x20)
[ 31.354252] r5:c0d03048 r4:c9e6be00
[ 31.358543] [<c039fdf0>] (ubifs_ro_mode) from [<c0399538>] (ubifs_lookup+0x278/0x308)
[ 31.367307] [<c03992c0>] (ubifs_lookup) from [<c025a3e8>] (__lookup_slow+0x90/0x184)
[ 31.375451] r10:ce397dd0 r9:00000000 r8:ceada5d8 r7:ceab6528 r6:ce397cdc r5:c0d03048
[ 31.384688] r4:cec80330
[ 31.387907] [<c025a358>] (__lookup_slow) from [<c025a514>] (lookup_slow+0x38/0x4c)
[ 31.395883] r10:00000147 r9:ce397ebc r8:00000040 r7:00000000 r6:ce397dd0 r5:ceada5d8
[ 31.405103] r4:ceab65a4
[ 31.408333] [<c025a4dc>] (lookup_slow) from [<c025ad2c>] (walk_component+0x1dc/0x308)
[ 31.417203] r7:00000000 r6:00000000 r5:c0d03048 r4:ce397dc8
[ 31.423182] [<c025ab50>] (walk_component) from [<c025d51c>] (path_lookupat+0x78/0x204)
[ 31.432423] r10:00000147 r9:ce397ebc r8:00000040 r7:ce397ebc r6:c0d03048 r5:00000000
[ 31.441178] r4:ce397dc8
[ 31.443861] [<c025d4a4>] (path_lookupat) from [<c025f4f8>] (filename_lookup.part.13+0x9c/0x10c)
[ 31.453914] r8:00000000 r7:cf084000 r6:ce397dc8 r5:00000006 r4:c0d03048
[ 31.461550] [<c025f45c>] (filename_lookup.part.13) from [<c025f67c>] (user_path_at_empty+0x54/0x5c)
[ 31.471582] r9:ce397f00 r8:00000000 r7:00440503 r6:ce397ebc r5:00000006 r4:00000000
[ 31.480239] [<c025f628>] (user_path_at_empty) from [<c0253e28>] (vfs_statx+0x74/0xe0)
[ 31.489296] r6:00000006 r5:00000900 r4:00000000
[ 31.494184] [<c0253db4>] (vfs_statx) from [<c02547f0>] (sys_fstatat64+0x3c/0x6c)
[ 31.502891] r10:00000147 r9:ce396000 r8:c0101204 r7:00000147 r6:00440503 r5:be9b9900
[ 31.511691] r4:c0d03048
[ 31.514373] [<c02547b4>] (sys_fstatat64) from [<c0101000>] (ret_fast_syscall+0x0/0x54)
[ 31.523635] Exception stack(0xce397fa8 to 0xce397ff0)
[ 31.529473] 7fa0: 00000000 0005a39b 00000006 00440503 be9b9900 00000900
[ 31.538816] 7fc0: 00000000 0005a39b 00440503 00000147 0043d400 b6f6b1f4 0043d400 00440440
[ 31.548103] 7fe0: 00439fa0 be9b984c 00427467 b6cff88c
[ 31.553461] r5:0005a39b r4:00000000
[ 32.053698] UBIFS error (ubi0:0 pid 690): ubifs_iget: failed to read inode 11731, error -2
[ 32.062694] UBIFS error (ubi0:0 pid 690): ubifs_lookup: dead directory entry 'core.test.0.d8ab38090e5b49faa8f42f345372ca25.610.1587262394000000.xz', error -2
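When the same error repeats on every lookup, it can help to reduce the log to the list of affected file names. The helper below is a minimal sketch: the function name ubifs_dead_entries is hypothetical, and it assumes the kernel message format shown in the log above.

```shell
# Print the unique names of dead directory entries reported by UBIFS.
# Reads kernel log text on stdin, writes one entry name per line.
ubifs_dead_entries() {
    sed -n "s/.*ubifs_lookup: dead directory entry '\([^']*\)'.*/\1/p" | sort -u
}
```

On the target it could be used as: dmesg | ubifs_dead_entries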
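II. Cause of the problem
The root cause is that the filesystem has the systemd-coredump feature enabled. A core dump file ( core.test.0.d8ab38090e5b49faa8f42f345372ca25.610.1587262394000000.xz ) is written to the /var/lib/systemd/coredump directory; if power is lost before one of the file's inode nodes has been committed to the TNC subsystem, that inode is left recorded as an orphan. On the next power-up, UBIFS checks the recorded orphans and deletes them, so the directory entry for the core file points to an inode that no longer exists. The lookup then fails with error -2 (ENOENT) and UBIFS switches to read-only mode, as the log above shows.
III. Solutions
1. Disable the coredump feature in the filesystem (treats the symptom, not the cause)
Edit /etc/systemd/coredump.conf and add the field Storage=none to the [Coredump] section.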
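The configuration change can also be scripted, for example when preparing a root filesystem image. The following is a minimal sketch: the function name set_coredump_storage is hypothetical, and it assumes the file contains only the [Coredump] section, so a line appended at the end still belongs to that section.

```shell
# Set Storage=<mode> in a systemd coredump.conf file.
# $1: path to the config file, $2: storage mode (for example: none)
set_coredump_storage() {
    conf=$1
    mode=$2
    # Make sure the [Coredump] section header exists.
    grep -q '^\[Coredump\]' "$conf" 2>/dev/null || printf '[Coredump]\n' >> "$conf"
    if grep -q '^[#[:space:]]*Storage=' "$conf"; then
        # Rewrite an existing (possibly commented-out) Storage= line.
        sed "s/^[#[:space:]]*Storage=.*/Storage=$mode/" "$conf" > "$conf.tmp" &&
            mv "$conf.tmp" "$conf"
    else
        # No Storage= line yet: append one at the end of the file.
        printf 'Storage=%s\n' "$mode" >> "$conf"
    fi
}
```

For example: set_coredump_storage /etc/systemd/coredump.conf none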
2. Modify the kernel filesystem code
(1) Code patch
@@ -691,8 +691,21 @@ static int do_kill_orphans(struct ubifs_info *c, struct ubifs_scan_leb *sleb,
 		n = (le32_to_cpu(orph->ch.len) - UBIFS_ORPH_NODE_SZ) >> 3;
 		for (i = 0; i < n; i++) {
 			union ubifs_key key1, key2;
+			struct ubifs_ino_node *ino;
+
+			ino = kmalloc(UBIFS_MAX_INO_NODE_SZ, GFP_NOFS);
+			if (!ino)
+				return -ENOMEM;
 			inum = le64_to_cpu(orph->inos[i]);
+			ino_key_init(c, &key1, inum);
+			err = ubifs_tnc_lookup(c, &key1, ino);
+			if (!err && ino->nlink) {
+				kfree(ino);
+				continue;
+			}
+			kfree(ino);
+
 			dbg_rcvry("deleting orphaned inode %lu",
 				  (unsigned long)inum);
Reference links:
1. https://www.spinics.net/lists/linux-mtd/msg06319.html
2. https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/fs/ubifs/orphan.c?id=ee1438ce5dc4d67dd8dd1ff51583122a61f5bd9e
(2) Call flow
ubifs_mount_orphans-->
kill_orphans-->
do_kill_orphans
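int ubifs_mount_orphans(struct ubifs_info *c, int unclean, int read_only):
This function is called at mount time to erase the orphans left by the previous session. If UBIFS was not unmounted cleanly,
the inodes recorded as orphans are deleted.
int kill_orphans(struct ubifs_info *c):
If recovery is required, the orphan inodes recorded during the previous session (which ended with an unclean unmount) must be deleted from the index.
This is done by updating the TNC, but because the index is not updated until the next commit,
the LEBs that hold the orphan information are not erased until the next commit either.
int do_kill_orphans(struct ubifs_info *c, struct ubifs_scan_leb *sleb,
unsigned long long *last_cmt_no, int *outofdate, int *last_flagged):
It walks every orphan node in the LEB and, for each inode number recorded there, deletes all keys of that inode from the TNC.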
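IV. Commands related to mounting UBIFS
1. Mount the UBIFS filesystem
(1) ubiattach /dev/ubi_ctrl -m 10 -O 2048
If no UBI control device is given, /dev/ubi_ctrl is used by default.
ubiattach: attaches an MTD device (which describes the raw flash) to UBI and creates the corresponding UBI device
-m: number of the MTD device to attach (an alternative method, e.g. when the character device node does not exist)
-O: VID header offset (do not specify this unless you really know what you are doing; the default should be optimal)
(2) mount -t ubifs ubi0_0 /mnt/
Mounts the UBIFS filesystem on the /mnt directory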
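The two steps can be wrapped in one helper so they always run in the same order. This is a sketch under stated assumptions: the names run, ubifs_attach_and_mount and DRY_RUN are hypothetical, the attached device is assumed to become ubi0, and volume 0 is assumed to hold the filesystem.

```shell
# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Attach an MTD device to UBI and mount volume 0 as UBIFS.
# $1: MTD device number, $2: mount point, $3: optional VID header offset
ubifs_attach_and_mount() {
    mtd_num=$1
    mount_point=$2
    vid_off=$3
    # Skip the attach step if /dev/ubi0 already exists.
    [ -e /dev/ubi0 ] ||
        run ubiattach /dev/ubi_ctrl -m "$mtd_num" ${vid_off:+-O "$vid_off"} ||
        return 1
    run mount -t ubifs ubi0_0 "$mount_point"
}
```

With the values used above, the call would be: ubifs_attach_and_mount 10 /mnt 2048. Setting DRY_RUN=1 first prints the commands without touching the flash.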
2. Enable the filesystem check when mounting UBIFS
echo 1 > /sys/kernel/debug/ubifs/chk_fs
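The chk_fs file only exists when debugfs is mounted and the kernel provides the UBIFS debugfs entries, so a small guard avoids a confusing shell error. The function name ubifs_enable_chk_fs below is hypothetical, and /sys/kernel/debug is assumed to be the debugfs mount point unless another path is passed.

```shell
# Turn on the UBIFS filesystem check via its debugfs knob.
# $1: debugfs mount point (defaults to /sys/kernel/debug)
ubifs_enable_chk_fs() {
    dbg=${1:-/sys/kernel/debug}
    knob=$dbg/ubifs/chk_fs
    if [ ! -e "$knob" ]; then
        echo "missing $knob: is debugfs mounted and UBIFS available?" >&2
        return 1
    fi
    echo 1 > "$knob"
}
```

If debugfs is not mounted yet, it can usually be mounted with: mount -t debugfs none /sys/kernel/debug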
For more about UBI and UBIFS features, see: http://www.linux-mtd.infradead.org/faq/ubifs.html