[BIBM 2023]DILM-ICD: A Deep Iterative Learning Model for Automatic ICD Coding

最新推荐文章于 2025-09-13 23:06:41 发布

夏莉莉iy

最新推荐文章于 2025-09-13 23:06:41 发布

阅读量914

点赞数 12

CC 4.0 BY-SA版权

分类专栏：论文精读文章标签：人工智能深度学习机器学习神经网络计算机视觉 ICD 多分类

本文链接：https://blog.csdn.net/Sherlily/article/details/151365369

论文精读专栏收录该内容

208 篇文章

订阅专栏

论文网址：DILM-ICD: A Deep Iterative Learning Model for Automatic ICD Coding | IEEE Conference Publication | IEEE Xplore

英文是纯手打的！论文原文的summarizing and paraphrasing。可能会出现难以避免的拼写错误和语法错误，若有发现欢迎评论指正！文章偏向于笔记，谨慎食用

2.4.1. Problem Formulation

2.4.2. DILM-ICD Architecture

2.4.3. EMR Processing Module

2.4.4. Attention Module

2.4.5. Iteration Module

2.5. Experiments

2.5.1. Datasets and Evaluation Metrics

2.5.2. Baselines

2.5.3. Implementation and Hyper-parameter Tuning

2.6. Experimental Results

2.6.1. Main Results

2.6.2. Effectiveness of the Iteration Module

2.7. Conclutions

1. 心得

（1）ICD类别真的太多了

2. 论文逐段精读

2.1. Abstract

①Challenge of ICD classification: large label space

②Thus, they proposed Deep Iterative Learning Model (DILM-ICD) to solve this problem

2.2. Introduction

①Diagnosis code: about 13,500 in ICD-9-CM and 70,000 in ICD-10-CM.

②Challenges: label imbalance and large categories

2.3. Related Work

①Introduced traditional machine learning methods and deep learning methods

2.4. Methodology

2.4.1. Problem Formulation

①Clinical word sequence: $w=\{w_{1},w_{2},...,w_{n}\}$

②ICD diagnosis: $\{y_1^{\prime},y_2^{\prime},\ldots,y_L^{\prime}\}$ where $L$ is the label space

③BCE loss:

$Loss\left(y^{\prime},y\right)=\sum_{i=1}^Ly_i\log y_i^{\prime}+(1-y_i)\log\left(1-y_i^{\prime}\right)$

where $y_{i}\in\{0,1\}$ denotes real label

2.4.2. DILM-ICD Architecture

①The architecture of DILM-ICD:

2.4.3. EMR Processing Module

①Pretraining and initialize the embedding by the continuous bag-of-words (CBOW) model to get $E=\{e_{1},e_{2},...,e_{n}\}$

②Employ layer norm on $E$

③Capturing context by BiLSTM to get the representation $H\in\mathbb{R}^{n\times d_h}$

2.4.4. Attention Module

①“为了更好地学习标签的表示，我们不是随机初始化标签权重矩阵，而是使用ICD代码的标题信息来计算每个ICD代码的标签向量。具体来说，我们将 ICD 代码本身与其相应的 ICD 标题相结合，形成全面的 ICD 描述。此描述利用了 ICD 代码中存在的分层信息以及 ICD 标题中包含的疾病描述。”（这标题长啥样啊）

②Pretrained ICD description by skip-gram and the encoded it as a normalized way:

$t_i=\frac{1}{l_d}\sum_{j=1}^{l_d}e_{ij}$

where $l_d$ is the length of ICD description, $e_{ij}$ is the word vector of ICD description

③ICD description matrix: $T=\{t_1,t_2,\ldots,t_L\}\in R^{L\times d_h}$

④Cross attention between clinical records and ICD description:

$\begin{gathered} Q_i=TW_i^Q,K_i=HW_i^K=V_i \\ A_i=softmax\left(\frac{Q_i(K_i)^T}{\sqrt{d_k}}\right)V_i \\ A=concat\left(A_1,A_2,\ldots,A_m\right) \end{gathered}$