Chinese Medical Named Entity Recognition Based on Gated Attention Unit

    Abstract: Medical named entity recognition aims to automatically identify and classify medical entities in electronic medical records, and it plays an important role in downstream tasks such as information retrieval and knowledge graphs. Existing methods usually ignore the dependencies between entities. To address this, this paper proposes a model based on the gated attention unit for Chinese medical named entity recognition. First, the model uses the pre-trained model MC-BERT to capture contextual information. Then, it uses cross-attention and a gated attention unit to strengthen the interaction between entity queries and contextual semantics, and to extract the dependencies and correlations between entities. Finally, it uses a bipartite graph matching algorithm to compute the training loss. Experiments were conducted on three datasets: CMeEE, CMQNN, and MSRA. The proposed model achieves F1 scores of 70.74%, 96.92%, and 95.53% on the three datasets, respectively, outperforming other related models and demonstrating its effectiveness on the Chinese medical named entity recognition task.
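    The abstract does not spell out how the bipartite matching step works, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes a cost matrix `cost[q][g]` (hypothetical; for example, a classification-plus-boundary cost) between each entity query `q` and each gold entity `g`, with at least as many queries as gold entities. It finds the cheapest one-to-one assignment by brute force, which is not necessarily the algorithm the authors use; the function name `min_cost_matching` is illustrative.

```python
from itertools import permutations


def min_cost_matching(cost):
    """Find the cheapest one-to-one assignment of gold entities to queries.

    cost[q][g] is the (assumed) cost of letting entity query q predict
    gold entity g. Requires len(cost) >= number of gold entities.
    Brute force over all assignments: suitable for tiny examples only.
    Returns (total_cost, sorted list of (query, gold) pairs).
    """
    n_queries = len(cost)
    n_gold = len(cost[0]) if cost else 0
    best_total, best_pairs = float("inf"), []
    # Each candidate picks a distinct query for every gold entity.
    for queries in permutations(range(n_queries), n_gold):
        total = sum(cost[q][g] for g, q in enumerate(queries))
        if total < best_total:
            best_total = total
            best_pairs = [(q, g) for g, q in enumerate(queries)]
    return best_total, sorted(best_pairs)


# Three entity queries, two gold entities (illustrative numbers).
cost = [
    [0.75, 0.25],
    [0.125, 0.875],
    [0.5, 0.5],
]
total, pairs = min_cost_matching(cost)
matched = {q for q, _ in pairs}
# Queries left unmatched would be trained toward a "no entity" label.
unmatched = [q for q in range(len(cost)) if q not in matched]
print(total, pairs, unmatched)
```

    In a set-prediction setup of this kind, the training loss is typically summed over the matched pairs, and unmatched queries are pushed toward a "no entity" class. For realistic sizes, a polynomial-time solver such as `scipy.optimize.linear_sum_assignment` would replace the brute-force search.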

       
