| With the rapid development and wide application of information technology, large volumes of unstructured text are scattered across the Internet. Extracting useful information from this text quickly and accurately has become both a research hotspot and a difficult problem in natural language processing. This paper therefore focuses on entity relation extraction, an information extraction task that aims to identify the semantic relations between entities in unstructured text. The main work and contributions of the paper are as follows: 1. To address the problem that existing entity relation extraction models do not make full use of entity information and relation information before extracting entities, the paper proposes HGAT-RE, an entity relation extraction model based on heterogeneous graph attention. First, BERT is used as the word embedding layer so that the model better captures the semantic information of sentences; second, to obtain feature representations suited to the relation extraction task, the paper models entity attributes, relation categories, and words as nodes of a graph and uses an attention mechanism to fuse the node features; finally, triples are extracted by cascaded pointer tagging. Experimental results show that HGAT-RE improves the F1 score over the baseline model by 2.7% and 1.2% on the NYT and WebNLG datasets, respectively, and that it also performs better than other improved models. 2. To address the exposure bias problem in pipeline models of entity relation extraction, the paper proposes GP-RE, a joint entity relation extraction model based on global pointer tagging. First, BERT is used as the word embedding layer to obtain better contextual semantic information; then, a joint encoding layer is proposed as the feature extraction layer of the model, which learns deep features of the text and strengthens the information interaction between subtasks; finally, triples are extracted with a nested global pointer tagging scheme, which handles both the simultaneous extraction of multiple relations and nested entities. Experimental results show that GP-RE improves the F1 score over the baseline model by 2.4% and 3.1% on the NYT and WebNLG datasets, respectively, and that it performs strongly compared with mainstream models. 3. The above methods are applied to a dataset of Chinese judicial judgment documents on theft cases to extract the entity relations in these documents and to build a knowledge graph of them, which can promote the practical application of the methods in related fields. |
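
The decoding step behind global pointer tagging can be illustrated with a minimal sketch. It is not the paper's implementation: the score layout (one span-score matrix for entities plus, for each relation, a head-to-head and a tail-to-tail matrix), the zero threshold, and the names `decode_spans`, `decode_triples`, and `empty_scores` are all assumptions made for illustration. Because every (start, end) pair is scored independently, nested spans such as "New York" and "New York City" can both be recovered.

```python
from itertools import product


def empty_scores(n, fill=-1.0):
    """Hypothetical helper: an n x n score matrix filled with a negative score."""
    return [[fill] * n for _ in range(n)]


def decode_spans(scores, threshold=0.0):
    """Return every (start, end) span with start <= end whose score exceeds the threshold."""
    n = len(scores)
    return [(i, j) for i in range(n) for j in range(i, n) if scores[i][j] > threshold]


def decode_triples(entity_scores, head_scores, tail_scores, threshold=0.0):
    """Pair entity spans into (subject, relation, object) triples.

    Assumed layout: head_scores[rel][i][j] scores "a subject starting at i is
    linked by rel to an object starting at j"; tail_scores[rel][i][j] does the
    same for the end positions of the two spans.
    """
    spans = decode_spans(entity_scores, threshold)
    triples = set()
    for rel in head_scores:
        heads, tails = head_scores[rel], tail_scores[rel]
        for subj, obj in product(spans, repeat=2):
            if subj == obj:
                continue  # a span is not related to itself in this sketch
            if heads[subj[0]][obj[0]] > threshold and tails[subj[1]][obj[1]] > threshold:
                triples.add((subj, rel, obj))
    return sorted(triples)


# Toy input with hand-written scores (not model output).
tokens = ["New", "York", "City", "is", "in", "the", "USA"]
n = len(tokens)

entity_scores = empty_scores(n)
entity_scores[0][1] = 2.0  # "New York"
entity_scores[0][2] = 3.0  # "New York City" (nested with the span above)
entity_scores[6][6] = 1.5  # "USA"

head_scores = {"located_in": empty_scores(n)}
tail_scores = {"located_in": empty_scores(n)}
head_scores["located_in"][0][6] = 2.0  # subject starts at "New", object starts at "USA"
tail_scores["located_in"][2][6] = 2.0  # subject ends at "City", object ends at "USA"

spans = decode_spans(entity_scores)
triples = decode_triples(entity_scores, head_scores, tail_scores)
```

In this toy input both nested spans are decoded as entities, but only "New York City" is linked to "USA", because the tail-to-tail score is positive only for the span ending at "City". A trained model would produce these matrices from the encoder output; here they are filled in by hand.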