Relation extraction has long been an important research direction in natural language processing. Applications such as conversational robots, text search, and knowledge base question answering often involve long, complex text passages, and sentence-level relation extraction models cannot effectively capture the entity relations found in large-scale corpora in practical scenarios. Consequently, with the rapid growth of natural language processing applications, document-level relation extraction has become a focus of academic research. Document-level relation extraction aims to extract the relation between two given entities across multiple sentences, typically by constructing a graph structure and using graph convolutional networks to model the complex relations in a document. This paper addresses several problems that arise when heterogeneous graph rules are modeled by graph convolution: missing adjacency dependency information, long-distance dependencies when constructing document-level dependency trees, the inability of constructed inference paths to capture complex logical relations in documents, and the lack of Chinese document-level datasets. The research is as follows:

(1) Existing models generally rely on heterogeneous graph modeling and neglect both the integration of word-level features such as entities and long-distance dependencies. To address this, a two-channel graph convolutional network method is proposed. First, dependency features and semantic information are learned from document-level dependency graphs and heterogeneous graphs using graph convolutional networks. Then, location attention is used to apply weighted updates to the dependency features of nodes at all positions. Finally, the two kinds of features are dynamically fused to generate multiple interactive entity feature representations.

(2) The inference path relations constructed from the entities in a document cannot effectively represent the complex
relationships between entities in the document, which leads to a lack of interaction between the feature information of different entities. To address this, a multi-head enhanced path graph convolution attention method is proposed. First, three paths are built using path labels, dependency trees, and heuristic path rules. Then, the path labels are embedded into the three inference paths modeled by the graph convolutional model under the heterogeneous graph rules, and an inference path is obtained after inference analysis. Finally, according to the importance of the different kinds of syntactic information, the inference path, the shortest dependency path, and the heuristic path are assigned weights and used as inputs to the attention heads.

(3) To address the lack of Chinese datasets for document-level relation extraction, 1,500 Chinese documents were collected from Weibo, Zhihu, Baidu, and other platforms by web crawling and then manually annotated. The dataset comprises 500 training documents, 500 test documents, and 500 development documents, and covers two relation types.

Experimental results on public datasets and the constructed Chinese dataset show that the proposed two-channel graph convolutional network method effectively integrates dependency features and alleviates the long-distance dependency problem, and that the proposed multi-head enhanced path graph convolution attention method effectively mitigates the lack of complex semantic information in inference paths.
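
The two-channel idea in contribution (1) can be illustrated with a minimal sketch: one graph convolution is applied to a dependency graph and one to a heterogeneous graph, and the two node representations are then fused per node. The abstract does not specify the fusion rule, so the sigmoid gate used here is an assumption, as are all function names (`normalize_adjacency`, `gcn_layer`, `gated_fusion`) and the toy graphs. The location attention step is omitted. This is not the paper's implementation, only a small pure-Python illustration of the general technique.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalize_adjacency(adj):
    """Add self-loops, then apply symmetric normalization D^-1/2 (A + I) D^-1/2."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def gcn_layer(adj, feats, weight):
    """One graph convolution: ReLU(A_norm @ H @ W)."""
    out = matmul(matmul(normalize_adjacency(adj), feats), weight)
    return [[max(0.0, v) for v in row] for row in out]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(h_dep, h_het, gate_w):
    """Fuse two channels per node (assumed rule, not taken from the paper):
    g = sigmoid(gate_w . [h_dep; h_het]); fused = g * h_dep + (1 - g) * h_het.
    """
    fused = []
    for d, h in zip(h_dep, h_het):
        g = sigmoid(sum(w * x for w, x in zip(gate_w, d + h)))
        fused.append([g * a + (1.0 - g) * b for a, b in zip(d, h)])
    return fused


if __name__ == "__main__":
    # Toy document with three nodes (hypothetical data).
    dep_adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]  # dependency graph: chain 0-1-2
    het_adj = [[0, 0, 1], [0, 0, 0], [1, 0, 0]]  # heterogeneous graph: edge 0-2
    feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # initial node features
    weight = [[0.5, -0.2], [0.3, 0.8]]            # layer weights (same for both channels here)
    gate_w = [0.1, 0.1, -0.1, -0.1]               # gate weights over the concatenated channels

    h_dep = gcn_layer(dep_adj, feats, weight)
    h_het = gcn_layer(het_adj, feats, weight)
    for node, vec in enumerate(gated_fusion(h_dep, h_het, gate_w)):
        print(node, vec)
```

Because the gate produces a value between 0 and 1, each fused feature is a convex combination of the two channels, so neither the dependency channel nor the heterogeneous channel can be ignored entirely unless the gate saturates.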