Journal of Civil Aviation University of China ›› 2023, Vol. 41 ›› Issue (2): 47-53.

• Civil Aviation •

Joint extraction method of entity relations based on BERT-CNN encoding feature fusion

DING Jianli, SU Wei

  1. (School of Computer Science and Technology, Civil Aviation University of China, Tianjin 300300, China)
  • Received: 2021-04-08  Revised: 2021-05-13  Online: 2023-10-28  Published: 2023-10-28
  • About the author: DING Jianli (b. 1963), male, from Luoyang, Henan Province; professor, Ph.D.; research interests: intelligent bionic algorithms and their applications in civil aviation.
  • Funding:
    Civil Aviation Joint Research Fund of the National Natural Science Foundation of China (U1833114)




Abstract: To address the complex structure and limited extraction performance of existing entity-relation extraction models, a joint entity-relation extraction method is proposed that fuses the encoding features of a pre-trained BERT (bidirectional encoder representations from transformers) model and a CNN (convolutional neural network). First, the head and tail positions of the subject are predicted from the sentence vectors produced by the BERT-CNN encoder. Second, the feature vectors at the predicted head and tail positions are taken as the subject's head and tail vectors, and these two vectors are fused by element-wise product to obtain the subject vector. The subject vector is then fused with the sentence vectors, again by product, to form a new sentence encoding that guides the prediction of the object's head and tail positions under each relation, yielding entity-relation triples. To verify the model, it is compared with similar models on the public NYT and WebNLG datasets: its precision and recall both exceed those of the compared models, and its F1 scores reach 92.75% and 93.19%, respectively.
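The span-tagging and product-fusion pipeline described in the abstract can be sketched with toy tensors. This is an illustrative NumPy mock, not the authors' implementation: the random weights, toy sizes, sigmoid binary taggers, and greedy span decoding are all assumptions standing in for the trained BERT-CNN encoder and its learned classifiers.

```python
import numpy as np

rng = np.random.default_rng(0)

seq_len, hidden = 8, 16       # toy sizes; the real encoder outputs BERT-sized vectors
num_relations = 3

# Stand-in for the BERT-CNN encoder output: one feature vector per token.
H = rng.standard_normal((seq_len, hidden))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Binary taggers scoring each token as the subject's head or tail position.
W_head, W_tail = rng.standard_normal((2, hidden))
p_head = sigmoid(H @ W_head)                 # (seq_len,)
p_tail = sigmoid(H @ W_tail)                 # (seq_len,)

# Greedy decoding of one subject span: best head, then best tail at or after it
# (an assumed decoding rule; a score threshold is omitted for brevity).
head = int(p_head.argmax())
tail = head + int(p_tail[head:].argmax())

# Product fusion of the predicted head and tail vectors -> subject vector.
v_subj = H[head] * H[tail]                   # (hidden,)

# Fuse the subject vector into every token vector, again by product,
# giving the subject-conditioned sentence encoding.
H_fused = H * v_subj                         # broadcast to (seq_len, hidden)

# Relation-specific object taggers applied to the fused encoding.
W_obj_head = rng.standard_normal((num_relations, hidden))
W_obj_tail = rng.standard_normal((num_relations, hidden))
obj_head_scores = sigmoid(H_fused @ W_obj_head.T)   # (seq_len, num_relations)
obj_tail_scores = sigmoid(H_fused @ W_obj_tail.T)   # (seq_len, num_relations)

print(H_fused.shape, obj_head_scores.shape)
```

Both fusion steps are plain element-wise products, so the subject vector conditions every token's representation before the per-relation object taggers run, which is what lets one pass extract a full (subject, relation, object) triple.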

Key words: bidirectional encoder representations from transformers (BERT), convolutional neural network (CNN), feature fusion, binary classification, joint extraction of entity relations, entity-relation triple

CLC number: