[BIBM 2023]DILM-ICD: A Deep Iterative Learning Model for Automatic ICD Coding

Paper link: DILM-ICD: A Deep Iterative Learning Model for Automatic ICD Coding | IEEE Conference Publication | IEEE Xplore

The English is typed entirely by hand, summarizing and paraphrasing the original paper. Some spelling and grammar mistakes are hard to avoid; if you spot any, corrections in the comments are welcome! This post is written as notes, so read it with that in mind.

目录

1. 心得

2. 论文逐段精读

2.1. Abstract

2.2. Introduction

2.3. Related Work

2.4. Methodology

2.4.1. Problem Formulation

2.4.2. DILM-ICD Architecture

2.4.3. EMR Processing Module

2.4.4. Attention Module

2.4.5. Iteration Module

2.5. Experiments

2.5.1. Datasets and Evaluation Metrics

2.5.2. Baselines

2.5.3. Implementation and Hyper-parameter Tuning

2.6. Experimental Results

2.6.1. Main Results

2.6.2. Effectiveness of the Iteration Module

2.7. Conclusions

1. 心得

(1) There really are far too many ICD categories

2. 论文逐段精读

2.1. Abstract

        ①Challenge of ICD classification: large label space

        ②Thus, the authors propose the Deep Iterative Learning Model (DILM-ICD) to address this problem

2.2. Introduction

        ①Diagnosis code: about 13,500 in ICD-9-CM and 70,000 in ICD-10-CM.

        ②Challenges: label imbalance and the very large number of categories

2.3. Related Work

        ①Introduced traditional machine learning methods and deep learning methods

2.4. Methodology

2.4.1. Problem Formulation

        ①Clinical word sequence: w=\{w_{1},w_{2},...,w_{n}\}

        ②ICD diagnosis: \{y_1^{\prime},y_2^{\prime},\ldots,y_L^{\prime}\} where L is the size of the label space

        ③BCE loss:

Loss\left(y^{\prime},y\right)=-\sum_{i=1}^L\left[y_i\log y_i^{\prime}+(1-y_i)\log\left(1-y_i^{\prime}\right)\right]

where y_{i}\in\{0,1\} denotes the true label
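The loss above is straightforward to sketch; a minimal numpy version of the multi-label BCE objective (written with the conventional minus sign so the loss is minimized):

```python
import numpy as np

def bce_loss(y_true, y_pred, eps=1e-12):
    """Multi-label binary cross-entropy summed over the L labels."""
    y_pred = np.clip(y_pred, eps, 1 - eps)  # keep the logs finite
    return -np.sum(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
```

For example, with targets (1, 0) and predictions (0.9, 0.1), the loss is -2·log 0.9 ≈ 0.211.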

2.4.2. DILM-ICD Architecture

        ①The architecture of DILM-ICD:

2.4.3. EMR Processing Module

        ①Pretrain word embeddings with the continuous bag-of-words (CBOW) model and use them to initialize the embedding layer, obtaining E=\{e_{1},e_{2},...,e_{n}\}

        ②Apply layer normalization to E

        ③Capture context with a BiLSTM to obtain the representation H\in\mathbb{R}^{n\times d_h}
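Steps ① and ② can be sketched as follows; the embedding table, sizes, and token indices are hypothetical stand-ins (a real CBOW model and the BiLSTM of step ③ are omitted):

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # Normalize each token's embedding over the feature dimension
    mu = x.mean(axis=-1, keepdims=True)
    sd = x.std(axis=-1, keepdims=True)
    return (x - mu) / (sd + eps)

# Hypothetical stand-in for CBOW-pretrained embeddings (vocab 10, d_e = 4)
rng = np.random.default_rng(0)
emb_table = rng.normal(size=(10, 4))

tokens = np.array([1, 3, 3, 7])   # clinical word indices w_1..w_n
E = emb_table[tokens]             # step ①: look up E = {e_1, ..., e_n}
E = layer_norm(E)                 # step ②: LayerNorm(E)
# step ③ would feed E into a BiLSTM to produce H ∈ R^{n × d_h}
```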

2.4.4. Attention Module

        ①"To better learn label representations, instead of randomly initializing the label weight matrix, we use the title information of the ICD codes to compute a label vector for each ICD code. Specifically, we combine the ICD code itself with its corresponding ICD title to form a comprehensive ICD description. This description exploits the hierarchical information carried by the ICD codes as well as the disease descriptions contained in the ICD titles." (I wonder what these titles actually look like)

        ②Pretrain the ICD descriptions with skip-gram and encode each one by length-normalized averaging:

t_i=\frac{1}{l_d}\sum_{j=1}^{l_d}e_{ij}

where l_d is the length of the ICD description and e_{ij} is the word vector of its j-th word

        ③ICD description matrix: T=\{t_1,t_2,\ldots,t_L\}\in\mathbb{R}^{L\times d_h}
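A minimal sketch of building the label vectors t_i and stacking them into T; the skip-gram word vectors and the sizes here are made-up placeholders:

```python
import numpy as np

def icd_label_vector(desc_vecs):
    # t_i = (1 / l_d) * sum_j e_ij : mean-pool the description's word vectors
    return desc_vecs.mean(axis=0)

rng = np.random.default_rng(0)
d_h = 6
# Hypothetical skip-gram vectors for two ICD descriptions of lengths 3 and 5
descs = [rng.normal(size=(3, d_h)), rng.normal(size=(5, d_h))]
T = np.stack([icd_label_vector(d) for d in descs])   # shape (L, d_h)
```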

        ④Cross attention between clinical records and ICD description:

\begin{gathered} Q_i=TW_i^Q,\quad K_i=HW_i^K=V_i \\ A_i=\mathrm{softmax}\left(\frac{Q_i\left(K_i\right)^T}{\sqrt{d_k}}\right)V_i \\ A=\mathrm{concat}\left(A_1,A_2,\ldots,A_m\right) \end{gathered}

        ⑤A linear layer and residual blocks produce the output:

Z=LayerNorm\left(T+AW^A\right)\\Z=LayerNorm\left(Z+drop\left(Z\right)\right)
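A single-head numpy sketch of the label-to-text cross attention plus the residual output, assuming V_i = K_i as in the equations and treating dropout as identity (evaluation mode); all weight matrices and sizes are random stand-ins:

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def layer_norm(x, eps=1e-5):
    mu = x.mean(axis=-1, keepdims=True)
    sd = x.std(axis=-1, keepdims=True)
    return (x - mu) / (sd + eps)

def label_cross_attention(T, H, Wq, Wk, Wa):
    """Labels (T) attend over text tokens (H); V = K per the paper."""
    Q, K = T @ Wq, H @ Wk
    A = softmax(Q @ K.T / np.sqrt(Q.shape[-1])) @ K   # attention output
    Z = layer_norm(T + A @ Wa)                        # linear + residual
    return layer_norm(Z + Z)  # drop(Z) = Z at evaluation time

rng = np.random.default_rng(1)
L, n, d_h, d_k = 3, 7, 8, 4
Z = label_cross_attention(rng.normal(size=(L, d_h)), rng.normal(size=(n, d_h)),
                          rng.normal(size=(d_h, d_k)), rng.normal(size=(d_h, d_k)),
                          rng.normal(size=(d_k, d_h)))
```

A multi-head version would concatenate the per-head outputs A_1..A_m before applying W^A.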

2.4.5. Iteration Module

        ①Iteration framework:

s_j=F(H,Z,s_{j-1})

where s_j denotes the score at the j-th iteration and F\left ( \cdot \right ) denotes the model

        ②The initial score s_{1}=FFNN\left(Z\right)

        ③Concatenate the pooled text feature with the previous iteration's scores:

c_j=concat\left(h,s_{j-1}\right)

where h=GMP\left ( H \right ) is the global max pooling of H and h\in\mathbb{R}^{1\times d_{h}}

        ④The text feature vector:

p_j=f\left(\left(c_jW^{fc1}\right)W^{fc2}\right)

where f\left ( \cdot \right ) denotes the ReLU activation

        ⑤The iteration:

Z_j=LayerNorm\left(Z_{j-1}\,p_j\right)

s_j=FFNN\left(Z_j\right)+s_{j-1}
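The whole iteration loop can be sketched as below, assuming the product of Z_{j-1} and p_j is elementwise (broadcast over labels) and that the FFNN scores each label row with a shared vector; all weights are random stand-ins for learned parameters:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def layer_norm(x, eps=1e-5):
    mu = x.mean(axis=-1, keepdims=True)
    sd = x.std(axis=-1, keepdims=True)
    return (x - mu) / (sd + eps)

def iterate_scores(H, Z, n_iter=3, seed=0):
    # H: (n, d_h) text representation; Z: (L, d_h) label-aware features
    rng = np.random.default_rng(seed)
    n, d_h = H.shape
    L = Z.shape[0]
    W_fc1 = rng.normal(size=(d_h + L, d_h)) * 0.1
    W_fc2 = rng.normal(size=(d_h, d_h)) * 0.1
    w_out = rng.normal(size=(d_h, 1)) * 0.1          # per-label scoring FFNN

    h = H.max(axis=0, keepdims=True)                 # h = GMP(H), (1, d_h)
    s = (Z @ w_out).T                                # s_1 = FFNN(Z), (1, L)
    for _ in range(n_iter - 1):
        c = np.concatenate([h, s], axis=1)           # c_j = concat(h, s_{j-1})
        p = relu((c @ W_fc1) @ W_fc2)                # p_j, (1, d_h)
        Z = layer_norm(Z * p)                        # Z_j = LayerNorm(Z_{j-1} ⊙ p_j)
        s = (Z @ w_out).T + s                        # s_j = FFNN(Z_j) + s_{j-1}
    return s

scores = iterate_scores(np.random.default_rng(2).normal(size=(12, 8)),
                        np.random.default_rng(3).normal(size=(5, 8)))
```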

2.5. Experiments

2.5.1. Datasets and Evaluation Metrics

        ①Dataset: MIMIC-III

        ②Samples: 52,722

        ③Subsets of MIMIC-III: MIMIC-III-full with 8,929 unique ICD codes and MIMIC-III-50 with the 50 most common codes

2.5.2. Baselines

        ①Not listed individually here; see the comparison tables

2.5.3. Implementation and Hyper-parameter Tuning

        ①Learning rate: 0.001 with AdamW optimizer

        ②Batch size: 8

        ③Max text length: 4,000

        ④The dimension of embedding d_e=100

        ⑤The hidden dim of BiLSTM: 512 for MIMIC-III-full and 256 for MIMIC-III-50

        ⑥Attention heads: 1 for MIMIC-III-full and 4 for MIMIC-III-50

2.6. Experimental Results

2.6.1. Main Results

        ①Comparison table on MIMIC-III-FULL:

        ②Comparison table on MIMIC-III-50:

2.6.2. Effectiveness of the Iteration Module

        ①Performance with iteration:

2.7. Conclusions

        ~
