Research On Key Techniques Of Relation Extraction For Text Data

Posted on: 2022-11-19
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H L Wang
Full Text: PDF
GTID: 1488306764960089
Subject: Computer Science and Technology
Abstract/Summary:
Relation extraction is the task of automatically detecting and identifying predefined semantic relationships between identified entities in text. As a core technology for knowledge acquisition in knowledge engineering, it gives artificial intelligence a strong capacity for knowledge understanding. Massive text data, the carrier of human knowledge, is rapidly being submerged by the explosive growth of information. Mining the knowledge hidden in these texts is both a theoretical demand of natural language processing and a practical demand of preserving human civilization. Natural language processing based on deep learning has made great progress in relation extraction, effectively promoting knowledge discovery in texts of various granularity. However, several problems remain in practice. First, the directional semantics of entity relations are missing, so relation classification underuses the semantic features contained in the text. Second, document-level evidence for an entity relation is scattered across the text, so the semantics supporting the relation are hard to perceive. Third, the crucial semantics between an entity pair in a long text cannot be mined without constructing a long-range dependency across the text. This dissertation therefore focuses on missing directional semantics, hidden evidence, and long-range semantic dependency in sentence-level and document-level relation extraction, from the perspective of syntactic and discourse linguistic knowledge. The main contributions are as follows:

(1) To alleviate missing directional semantics in entity relations, this dissertation proposes direction-sensitive relation extraction based on sentence-level syntactic structure. Conventional sentence-level relation extraction ignores the directional semantics of entity pairs in the text, which makes it difficult to further improve classification performance on relations with an obvious physical direction. This dissertation constructs a bidirectional shortest dependency path structure, with directional differences, from the structural information of the dependency syntactic tree. Based on the directional shortest dependency path, a parallel attention mechanism is built from the character-level features of the text and of the dependency path to capture semantic words and directional words in the text. To alleviate the over-trimming of dependency paths, a new text trimming strategy is also proposed, which both reduces the text input and improves model performance. Experimental results on two data sets show that the proposed directional semantic perception outperforms other methods.

(2) To address the hidden evidence of entity relations, this dissertation proposes document-level relation extraction based on the discourse relations between text fragments. Faced with more complex texts and more diverse semantics, document-level relation extraction requires a model with stronger abilities to screen hidden evidence and to reason over it, whereas traditional sequence-based methods cannot gather discrete evidence scattered through long texts. This dissertation uses the discourse relations between fragments of a document to construct a document graph, through which semantic associations are created between entity pairs. These discourse relations are also used to screen for appropriate hidden evidence. Experimental results show that the model performs well and provides a multi-layer evidence screening mechanism together with a clear evidence reasoning process.

(3) To construct long-range semantic dependency between entity pairs, this dissertation proposes document-level relation extraction that combines syntactic relations and discourse relations. Conventional sentence-level relation extraction obtains the shortest semantic dependence in a sentence by finding the shortest dependency path in the dependency syntactic tree, while document-level extraction struggles to find an effective semantic dependence between entities because of the complexity of the text. This dissertation combines syntactic structure and discourse relations to construct a word-level document graph, and uses the Steiner tree algorithm to extract the minimum spanning tree from the document graph to form a keyword path, so as to obtain the semantic dependencies most relevant to the entity pair. Two layers of attention weights are constructed, at the text level and at the graph level, to strengthen the semantic features of keywords. A back-deployment method is then used to improve model performance during training. Experimental results show that this work is competitive with current open-source and graph-based models and constructs effective semantic dependency paths in document graphs.
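
The keyword-path idea in contribution (3) can be illustrated with a small sketch. A Steiner tree is a minimum-cost tree that connects a chosen set of terminal nodes in a graph; here the terminals stand for entity mentions and the other nodes for words. The code below is not the dissertation's implementation: it assumes an unweighted, undirected word graph, uses a simple greedy shortest-path heuristic rather than an exact Steiner tree solver, and the example sentence, its edges, and the function names (`build_graph`, `approx_steiner_tree`) are all hypothetical.

```python
from collections import deque


def build_graph(edges):
    """Build an undirected adjacency map from (word, word) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def _path_from_tree(graph, tree_nodes, targets):
    """Multi-source BFS: shortest path from the current tree to any target."""
    parent = {node: None for node in tree_nodes}
    queue = deque(sorted(tree_nodes))
    while queue:
        node = queue.popleft()
        if node in targets:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]  # runs from a tree node to the target
        for nxt in sorted(graph.get(node, ())):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None


def approx_steiner_tree(graph, terminals):
    """Greedy heuristic: repeatedly attach the nearest unconnected terminal."""
    terminals = list(terminals)
    tree_nodes = {terminals[0]}
    tree_edges = set()
    remaining = set(terminals) - tree_nodes
    while remaining:
        path = _path_from_tree(graph, tree_nodes, remaining)
        if path is None:
            raise ValueError("terminals are not connected in the graph")
        tree_edges.update(frozenset(pair) for pair in zip(path, path[1:]))
        tree_nodes.update(path)
        remaining -= tree_nodes
    return tree_nodes, tree_edges


# Hypothetical document: "Curie was born in Warsaw. She later moved to Paris."
# Edges mix assumed syntactic links with one assumed discourse link
# between the two sentence roots ("born" and "moved").
edges = [
    ("Curie", "born"), ("born", "Warsaw"),
    ("born", "moved"),
    ("moved", "She"), ("moved", "Paris"), ("moved", "later"),
]
graph = build_graph(edges)
nodes, tree = approx_steiner_tree(graph, ["Curie", "She", "Paris"])
print(sorted(nodes))  # ['Curie', 'Paris', 'She', 'born', 'moved']
```

In this toy graph the resulting tree keeps only the words that link the entity mentions ("born", "moved") and drops words such as "Warsaw" and "later", which is the sense in which such a tree can act as a keyword path.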
Keywords/Search Tags: Deep Neural Networks, Natural Language Processing, Sentence-level Relation Extraction, Document-level Relation Extraction