Competition page: Medical Consultation Dialogue Intent Recognition (医疗诊疗对话意图识别)
V1:
Approach: BERT
Pretrained model: bert-base-chinese
Score: 0.8043
Val P: 0.7621, Val R: 0.7677, Val F1: 0.7619, Val Acc: 0.8152
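The validation metrics above appear to be macro-averaged over the intent classes. As a self-contained illustration (the label names here are made up, not the competition's actual intent set), macro precision/recall/F1 and accuracy can be computed like this:

```python
def macro_prf(y_true, y_pred, labels):
    """Macro-averaged precision, recall, F1, plus overall accuracy."""
    ps, rs, fs = [], [], []
    for lab in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == lab and p == lab)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != lab and p == lab)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == lab and p != lab)
        p = tp / (tp + fp) if tp + fp else 0.0
        r = tp / (tp + fn) if tp + fn else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        ps.append(p); rs.append(r); fs.append(f)
    acc = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    n = len(labels)
    return sum(ps) / n, sum(rs) / n, sum(fs) / n, acc

# Toy example with invented intent labels
y_true = ["ask", "ask", "diag", "other"]
y_pred = ["ask", "diag", "diag", "other"]
p, r, f1, acc = macro_prf(y_true, y_pred, ["ask", "diag", "other"])
print(round(p, 4), round(r, 4), round(f1, 4), round(acc, 4))
# 0.8333 0.8333 0.7778 0.75
```

In practice `sklearn.metrics.precision_recall_fscore_support(..., average='macro')` gives the same numbers; the hand-rolled version just makes the averaging explicit.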
V2:
Score: 0.8158
Rank: 193 (this submission) / 1561 on the long-term leaderboard
Approach: BERT+CNN
Pretrained model: bert-base-chinese
Training results:
Val P: 0.7580, Val R: 0.7667, Val F1: 0.7595, Val Acc: 0.8171
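For reference, the CNN half of a BERT+CNN classifier is typically a convolutional head over BERT's per-token outputs. This is a minimal sketch, not the exact competition model: the hidden size 768 matches bert-base-chinese, but the filter sizes, filter count, and number of classes are assumptions, and a random tensor stands in for real BERT output so the snippet runs offline:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CNNHead(nn.Module):
    """Convolutional classification head over BERT token outputs.

    filter_sizes / num_filters / num_classes are illustrative guesses,
    not values taken from the competition code.
    """
    def __init__(self, hidden_size=768, num_filters=128,
                 filter_sizes=(2, 3, 4), num_classes=16):
        super().__init__()
        self.convs = nn.ModuleList(
            nn.Conv1d(hidden_size, num_filters, k) for k in filter_sizes)
        self.fc = nn.Linear(num_filters * len(filter_sizes), num_classes)

    def forward(self, seq_out):                 # [batch, seq_len, hidden]
        x = seq_out.transpose(1, 2)             # Conv1d wants [batch, hidden, seq_len]
        # n-gram convolutions + max-over-time pooling, one vector per filter size
        pooled = [F.relu(conv(x)).max(dim=2).values for conv in self.convs]
        return self.fc(torch.cat(pooled, dim=1))  # [batch, num_classes]

head = CNNHead()
dummy = torch.randn(2, 32, 768)  # stands in for BertModel's last_hidden_state
logits = head(dummy)
print(logits.shape)              # torch.Size([2, 16])
```

In the real model, `dummy` would be replaced by `BertModel.from_pretrained('bert-base-chinese')(input_ids, attention_mask=...).last_hidden_state`.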
Ideas for further optimization:
1. Fine-tune a medical-domain pretrained model
2. Try BERT+BiLSTM+CRF
3. Add a separate speaker-role encoding at the embedding stage
4. Multi-model ensembling
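Idea 3 above (encoding role information at the embedding stage) could be prototyped as a learned speaker-role embedding summed onto the token embeddings, analogous to BERT's segment embeddings. All names and sizes below are hypothetical:

```python
import torch
import torch.nn as nn

class RoleEmbedding(nn.Module):
    """Adds a learned speaker-role embedding (e.g. 0 = patient, 1 = doctor)
    to token embeddings, similar in spirit to BERT's segment embeddings."""
    def __init__(self, hidden_size=768, num_roles=2):
        super().__init__()
        self.role_emb = nn.Embedding(num_roles, hidden_size)

    def forward(self, token_embeds, role_ids):
        # token_embeds: [batch, seq_len, hidden]; role_ids: [batch, seq_len]
        return token_embeds + self.role_emb(role_ids)

embeds = torch.randn(2, 8, 768)          # stand-in for BERT input embeddings
roles = torch.zeros(2, 8, dtype=torch.long)
roles[:, 4:] = 1                         # second half of each dialogue: doctor
out = RoleEmbedding()(embeds, roles)
print(out.shape)                         # torch.Size([2, 8, 768])
```

Because the sum keeps the hidden size unchanged, the rest of the encoder needs no modification; the role embedding table is simply trained along with the classifier.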
Run script:
python run_bert.py
Source code download:
医疗诊疗对话意图识别挑战赛BERT/BERT+CNN资源 - CSDN文库 (challenge BERT/BERT+CNN resources, hosted on CSDN)
Model: BERT configuration
import torch
import torch.nn as nn
# from pytorch_pretrained_bert import BertModel, BertTokenizer
from transformers import BertModel, BertTokenizer, BertConfig
from transformers import AutoModel, AutoTokenizer, AutoConfig
import os
class Config(object):
    """Configuration parameters"""

    def __init__(self, dataset, local=False):
        self.local = local
        self.date = '20241106'
        self.model_name = 'bert_' + self.date
        self.train_path = dataset + '/data/train.txt'  # training set
        self.dev_path = dataset + '/data/dev.txt'      # dev set
        self.test_path = dataset +