
Research On Relation Extraction Methods For Complex Text Scenes

Posted on: 2022-01-19
Degree: Master
Type: Thesis
Country: China
Candidate: H Liu
Full Text: PDF
GTID: 2518306740982789
Subject: Software engineering
Abstract/Summary:
Relation extraction is an important step in knowledge extraction. It aims to discover the semantic relations between entities in text corpora or multimodal data, providing the knowledge triples from which knowledge graphs are built, and it directly affects the quality and practical usefulness of the resulting knowledge graph. In recent years, great progress has been made in relation extraction, but most existing work focuses on intra-sentence relations and simple entity-pair relations, while progress in document-level relation extraction and overlapping relation extraction has been relatively slow. These two settings face two main challenges: (1) in document-level relation extraction, how to accurately unify the feature representations of entity information and the semantic information of multiple sentences still needs to be explored; (2) in overlapping relation extraction, multiple triples overlap in complex ways, and accurately identifying the different relation triples from semantic information remains difficult. Cross-sentence and overlapping relations are very common in practical application scenarios, so extracting relational facts from these complex scenarios is particularly important. To address these problems, the thesis studies document-level relation extraction and overlapping relation extraction, respectively. The main contents are as follows:

1. A multi-granularity relation extraction model, MGRE, is proposed. The model integrates semantic information at the entity level, sentence level and document level, so as to better characterize the semantic interaction between entities and the various sentences in the document. First, in constructing entity-level semantic information, the traditional dependency path method only takes the entity pair information as part of the path sequence for feature extraction and cannot distinguish the semantic differences between entities. The thesis draws on the translation idea of the TransE model and uses a translation strategy to fuse the head entity and tail entity representations obtained through the dependency path, yielding the association information between the entity pair. In constructing sentence-level semantic information, the thesis uses a CNN to extract semantic features from each sentence. The multiple sentence-level feature vectors produced by the sentence-level network layer are fused into document-level semantic features through an attention mechanism and then further fused with the entity-level semantic information, unifying document information and entity-pair information. The experimental results on the public dataset show that the proposed method achieves better extraction performance on the document-level relation extraction task.

2. A three-stage relation extraction model, TSRE, based on pointer annotation is proposed. The model consists of three stages: relation classification, head entity labeling and tail entity labeling. First, to handle the complex case of overlapping triples, the relation classification stage naturally divides the triples in the text into multiple small, simple sets according to relation category, which reduces the complexity of the subsequent entity identification. Then, in the head entity labeling and tail entity labeling stages, the thesis adopts a pointer labeling strategy, which can extract entities of any span through head and tail pointers. Moreover, the three stages of the model are connected and build on one another: the triple elements extracted in each stage are fed into the network of the next stage as preconditions, which integrates the interaction information between entities and relations and improves the performance of overlapping relation extraction. The experimental results show that the proposed method achieves the best extraction performance on both the Du RED and ICRED datasets, and outperforms other existing models on the overlapping relation extraction task.
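
The pointer labeling strategy mentioned in the abstract can be illustrated with a small decoding sketch. This is not the TSRE implementation: the function name `decode_spans`, the 0.5 threshold, the "nearest end at or after each start" pairing rule, and the example sentence with its probabilities are all assumptions made for illustration. It only shows how per-token start and end pointer scores could be turned into entity spans of arbitrary length.

```python
from typing import List, Tuple


def decode_spans(
    start_probs: List[float],
    end_probs: List[float],
    threshold: float = 0.5,  # assumed cut-off, not taken from the thesis
) -> List[Tuple[int, int]]:
    """Pair each predicted start pointer with the nearest end pointer
    at or after it, returning inclusive (start, end) token indices."""
    starts = [i for i, p in enumerate(start_probs) if p >= threshold]
    ends = [i for i, p in enumerate(end_probs) if p >= threshold]
    spans = []
    for s in starts:
        # Nearest end pointer that does not precede the start pointer.
        e = next((j for j in ends if j >= s), None)
        if e is not None:
            spans.append((s, e))
    return spans


# Made-up tokens and pointer scores, for illustration only.
tokens = ["Marie", "Curie", "was", "born", "in", "Warsaw"]
start_probs = [0.9, 0.1, 0.0, 0.0, 0.1, 0.8]
end_probs = [0.1, 0.95, 0.0, 0.0, 0.0, 0.7]

spans = decode_spans(start_probs, end_probs)
entities = [" ".join(tokens[s : e + 1]) for s, e in spans]
print(entities)  # ['Marie Curie', 'Warsaw']
```

In a staged setup like the one described, a decoder of this kind would presumably be run once per predicted relation for head entities and again, conditioned on each head entity, for tail entities; that conditioning is not shown here.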
Keywords/Search Tags:Knowledge graph, Relation extraction, Syntactic dependency analysis, Pointer annotation