
Research on End-to-End Joint Entity and Relation Extraction

Posted on: 2022-12-19
Degree: Master
Type: Thesis
Country: China
Candidate: D Cheng
GTID: 2518306779464054
Subject: Journalism and Media
Abstract/Summary:
As a structured representation of knowledge, the knowledge graph has been widely used in natural language processing applications such as search engines and question answering systems. To keep adding the ever-growing body of world knowledge to knowledge graphs, researchers in natural language processing have sought ways to acquire that knowledge efficiently and automatically, that is, relation extraction technology. Relation extraction is a classic task and has been well studied over the past 20 years. The traditional approach is the pipeline method, which first extracts named entities and then classifies the relation type between each candidate entity pair. The pipeline method has two problems: it ignores the close connection and interaction between the named entity recognition task and the relation classification task, and errors made in entity recognition are passed on to the downstream relation classification task. End-to-end joint entity and relation extraction models were proposed to solve these two problems. In an end-to-end model, the named entity recognition task and the relation classification task share the same encoder and parameters, which connects the two tasks and eases error propagation to a certain extent.

Recently, end-to-end entity and relation extraction models have focused on solving the overlapping relation problem, and their performance has nearly reached the ceiling in the field. However, these models still have some problems: 1) the mainstream text encoder BERT limits the maximum length of the input text to 512 tokens; 2) the complexity of the self-attention mechanism and of relation classification is proportional to the square of the maximum sentence length; 3) there is almost no research on Chinese relation extraction data sets.

To address these problems, this thesis makes three contributions. 1) It proposes an improved end-to-end entity and relation extraction model that focuses on improving the embedding representation and reducing the complexity of the model. The performance of the model is verified on the public English data set NYT. 2) It proposes an entity and relation extraction model suitable for long texts. A long text is split into several shorter sub-sentences, and a cross-sentence relation extraction algorithm then extracts relation triples whose subject and object come from different sub-sentences. This addresses both the failure to extract long-distance relation triples and the high complexity of the model on long texts. Its effectiveness and performance are verified on the Chinese data set DuIE2.0. 3) It proposes an extraction model for complex relations, which require extracting more information than the triple form can hold. By decomposing a complex relation extraction problem into multiple simple relation triples, the model can extract more information in a single extraction pass. Its effectiveness is also verified on the complex relation data set of DuIE2.0.
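The abstract does not describe how the long-text model splits its input, so the following is only a minimal sketch of the general idea, under stated assumptions: text is split after sentence punctuation, the pieces are packed greedily into chunks no longer than a fixed budget, and character counts stand in for BERT tokens. The names `split_into_chunks` and `pairwise_cost` are hypothetical and are not taken from the thesis. The sketch also shows why splitting helps when cost grows with the square of the length; it does not attempt the cross-sentence extraction step.

```python
import re


def split_into_chunks(text, max_len):
    """Greedily pack punctuation-delimited pieces into chunks of at most max_len characters.

    Assumption: characters stand in for tokens; a real system would count
    tokens produced by the encoder's tokenizer instead.
    """
    # Split right after common Chinese/English sentence punctuation,
    # keeping each delimiter attached to the piece it ends.
    pieces = [p for p in re.split(r"(?<=[。！？；.!?;])", text) if p]
    chunks, current = [], ""
    for piece in pieces:
        # A single piece longer than the budget is hard-wrapped.
        while len(piece) > max_len:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(piece[:max_len])
            piece = piece[max_len:]
        # Start a new chunk when the next piece would overflow the budget.
        if len(current) + len(piece) > max_len:
            chunks.append(current)
            current = ""
        current += piece
    if current:
        chunks.append(current)
    return chunks


def pairwise_cost(lengths):
    """Sum of squared lengths: a proxy for quadratic attention / pair-classification cost."""
    return sum(n * n for n in lengths)


# Usage: an 801-character text made of three sentences.
text = "A" * 300 + "。" + "B" * 299 + "。" + "C" * 199 + "。"
chunks = split_into_chunks(text, max_len=512)
print([len(c) for c in chunks])                 # [301, 500]
print(pairwise_cost([len(text)]))               # 641601 if processed as one sequence
print(pairwise_cost([len(c) for c in chunks]))  # 340601 when processed chunk by chunk
```

Splitting alone loses any triple whose subject and object land in different chunks, which is the gap the cross-sentence relation extraction algorithm mentioned in the abstract is meant to close.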
Keywords/Search Tags: end-to-end entity and relation extraction, entity and relation extraction for long text, complex relation extraction