Journal of Civil Aviation University of China ›› 2023, Vol. 41 ›› Issue (2): 47-53.

• Civil Aviation •

Joint extraction method of entity relations based on BERT-CNN encoding feature fusion

DING Jianli, SU Wei

  1. (School of Computer Science and Technology, Civil Aviation University of China, Tianjin 300300, China)
  • Received: 2021-04-08  Revised: 2021-05-13  Online: 2023-10-28  Published: 2023-10-28
  • About the author: DING Jianli (b. 1963), male, from Luoyang, Henan Province; professor, Ph.D.; research interests: intelligent bionic algorithms and their applications in civil aviation.
  • Funding:
    Civil Aviation Joint Research Fund of the National Natural Science Foundation of China (U1833114)




Abstract: To address the complex structure and limited extraction performance of existing entity-relation extraction models, a joint entity-relation extraction method is proposed that fuses the encoding features of a pre-trained BERT (bidirectional encoder representations from transformers) model and a CNN (convolutional neural network). First, the head and tail positions of the subject are predicted from the sentence vectors produced by the BERT-CNN encoder. Second, the feature vectors at the predicted head and tail positions are taken as the subject's head and tail vectors, and these two vectors are fused by element-wise product to obtain the subject vector. The subject vector is then fused with the sentence vectors, again by product, to form a new sentence encoding that guides the prediction of the object's head and tail positions under each relation, yielding entity-relation triples. To verify the model, it is compared with similar models on the public NYT and WebNLG datasets: its precision and recall both exceed those of the compared models, and its F1 scores reach 92.75% and 93.19%, respectively.
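The span-tagging and product-fusion pipeline described in the abstract can be sketched with toy tensors. This is an illustrative NumPy mock, not the authors' implementation: the random weights, toy sizes, sigmoid binary taggers, and greedy span decoding are all assumptions standing in for the trained BERT-CNN encoder and its learned classifiers.

```python
import numpy as np

rng = np.random.default_rng(0)

seq_len, hidden = 8, 16       # toy sizes; the real encoder outputs BERT-sized vectors
num_relations = 3

# Stand-in for the BERT-CNN encoder output: one feature vector per token.
H = rng.standard_normal((seq_len, hidden))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Binary taggers scoring each token as the subject's head or tail position.
W_head, W_tail = rng.standard_normal((2, hidden))
p_head = sigmoid(H @ W_head)                 # (seq_len,)
p_tail = sigmoid(H @ W_tail)                 # (seq_len,)

# Greedy decoding of one subject span: best head, then best tail at or after it
# (an assumed decoding rule; a score threshold is omitted for brevity).
head = int(p_head.argmax())
tail = head + int(p_tail[head:].argmax())

# Product fusion of the predicted head and tail vectors -> subject vector.
v_subj = H[head] * H[tail]                   # (hidden,)

# Fuse the subject vector into every token vector, again by product,
# giving the subject-conditioned sentence encoding.
H_fused = H * v_subj                         # broadcast to (seq_len, hidden)

# Relation-specific object taggers applied to the fused encoding.
W_obj_head = rng.standard_normal((num_relations, hidden))
W_obj_tail = rng.standard_normal((num_relations, hidden))
obj_head_scores = sigmoid(H_fused @ W_obj_head.T)   # (seq_len, num_relations)
obj_tail_scores = sigmoid(H_fused @ W_obj_tail.T)   # (seq_len, num_relations)

print(H_fused.shape, obj_head_scores.shape)
```

Both fusion steps are plain element-wise products, so the subject vector conditions every token's representation before the per-relation object taggers run, which is what lets one pass extract a full (subject, relation, object) triple.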

Key words: bidirectional encoder representations from transformers (BERT), convolutional neural network (CNN), feature fusion, binary classification, joint extraction of entity relations, entity-relation triple

CLC number: