Research On Textual Relation Extraction In Complex Environment

Posted on: 2022-06-19
Degree: Master
Type: Thesis
Country: China
Candidate: X G Ding
Full Text: PDF
GTID: 2518306563972949
Subject: Computer Science and Technology
Abstract/Summary:
With the explosive growth of text data on the Internet, automatically extracting rich structured knowledge from massive text has become increasingly important. Relation extraction is one of the key technologies of information extraction and plays an important role in many downstream natural language processing tasks. Current research focuses mainly on relation extraction in an ideal environment, but real-world environments pose many complex problems. First, most laboratory settings provide entity information and extract relations for the given entities; however, because entity annotation is complex, entity information is often missing. Second, current research mostly addresses sentence-level relation extraction, whereas document-level text is more common in practice and contains a large number of implicit cross-sentence relations. Third, these complex environments do not occur in isolation: entity information can also be missing at the document level, which is more complex and difficult still. To address these three complex environments, this thesis makes the following contributions.

(1) For missing entity information at the sentence level, a sentence-level joint entity and relation extraction method based on a translation mechanism is proposed. Previous pipeline models suffer from error propagation and ignore the interactive features between subtasks. Our model treats the text as the source language and the entity pairs under different relations as the target language, and translates between the two with an attention-based Encoder-Decoder framework. Unifying the two subtasks in one framework addresses the error propagation problem and fully accounts for the interaction between subtasks. Interactive features also exist between different relations; in this thesis, different relations share model parameters and underlying features, so the model learns the interactive information shared among them. Comparison with contemporaneous models on the public NYT dataset demonstrates the effectiveness of the model.

(2) For the document-level environment, a document-level multi-level relation extraction method based on a heterogeneous graph network is proposed. Most existing relation extraction methods are designed for sentence-level environments and cannot be applied at the document level. Our method constructs a graph network from prior knowledge of the co-occurrence, coreference, and semantic dependency relations among entities, which reduces noise in the document-level text, shortens the distance between entities, and provides a reasoning structure for the model. Distinguishing the word, mention, and entity levels brings finer-grained information to the model: entity mentions aggregate fine-grained word information together with context, and entities focus on their important mentions through the attention mechanism. Comparison with contemporaneous models on the DocRED dataset demonstrates the effectiveness of the model.

(3) For the document-level environment with missing entity information, a document-level joint entity and relation extraction method based on multi-task learning is proposed. Because complex environments do not occur in isolation, we combine the two environments above and pose the problem of document-level extraction when entity information is missing. The subtasks this problem involves (entity recognition, entity mention clustering, and relation extraction) are unified in a single framework through multi-task learning, so that document-level joint entity and relation extraction is performed end to end. The proposed mention clustering module, based on cosine similarity, can effectively group different surface forms of the same entity into one cluster. The experimental results show that our model is effective in this new complex environment.
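
The abstract does not describe how the cosine-similarity mention clustering module works internally, so the following is only a minimal sketch of the general idea and not the thesis's method. It assumes each entity mention already has a vector representation, and it uses a greedy single-pass grouping with a fixed similarity threshold. The function names, the threshold value, and the toy mentions and vectors are all hypothetical choices made for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_u * norm_v)


def cluster_mentions(embeddings, threshold=0.8):
    """Greedy single-pass clustering (an assumption, not the thesis's algorithm).

    Each mention joins the existing cluster containing its most similar
    member, provided that similarity reaches `threshold`; otherwise it
    starts a new cluster. Returns a list of lists of mention indices.
    """
    clusters = []
    for i, vec in enumerate(embeddings):
        best_cluster, best_sim = None, threshold
        for cluster in clusters:
            sim = max(cosine_similarity(vec, embeddings[j]) for j in cluster)
            if sim >= best_sim:
                best_cluster, best_sim = cluster, sim
        if best_cluster is None:
            clusters.append([i])
        else:
            best_cluster.append(i)
    return clusters


# Hypothetical mentions with hand-made 3-dimensional vectors.
mentions = ["Barack Obama", "Obama", "Paris", "the French capital"]
embeddings = [
    [1.0, 0.1, 0.0],
    [0.9, 0.2, 0.0],
    [0.0, 0.1, 1.0],
    [0.1, 0.0, 0.9],
]

clusters = cluster_mentions(embeddings, threshold=0.8)
for cluster in clusters:
    print([mentions[i] for i in cluster])
```

In a real system the vectors would come from the shared encoder trained under the multi-task objective, and the threshold would need tuning on held-out data; both are left open here because the abstract does not specify them.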
Keywords/Search Tags:Relation extraction, Joint extraction, Document-level, Translation mechanism, Heterogeneous graph network, Multi-Task