
Research Of Joint Extraction Of Entities And Relations Based On Pre-trained Model

Posted on: 2022-09-10    Degree: Master    Type: Thesis
Country: China    Candidate: D F Zhang    Full Text: PDF
GTID: 2518306572477804    Subject: Information and Communication Engineering
Abstract/Summary:
The development of Internet technology has gradually changed the way people obtain information, and extracting key information from unstructured text to construct knowledge graphs faces great challenges. As fundamental tasks of knowledge graph construction, named entity recognition and relation extraction aim to extract entities, and the relations between them, from unstructured text, providing important support for downstream tasks such as semantic retrieval, knowledge-base question answering, and logical reasoning. A study of the joint entity and relation extraction task shows that existing joint extraction models suffer from error propagation and information redundancy, which makes it difficult to effectively extract all the triples in a sentence, especially overlapping triples.

To address these problems, this thesis approaches joint extraction from the perspective of relations and summarizes a novel extraction paradigm: relation-based entity recognition. First, based on the assumption that relations should have different representations in different sentence contexts, this thesis proposes a novel and efficient model input form that learns relation representations from the sentence context. Then, based on the assumption that the representations of relations involved in overlapping triples within a sentence are more similar to each other than to the other relations in the same sentence, this thesis proposes a relation contrastive pre-trained model (Relation Contrastive BERT, RCBERT), which uses contrastive learning to train and further separate the relation representations. Finally, using RCBERT as the encoder, this thesis proposes a multi-level attention joint extraction model (Multi-Level Attention Model, MLA) based on two further assumptions: that the relations involved in overlapping triples in a sentence are correlated with one another, and that the words in a sentence should have different representations under different relations. MLA extracts overlapping triples effectively by attending to the correlations between sentences and relations, between relations and relations, and between relations and words.

To evaluate the effectiveness of the proposed models, this thesis measures the similarity of the relation representations learned by RCBERT, showing that the learned representations are mutually distinct. For the MLA model, this thesis conducts comparative and extended experiments on the NYT and WebNLG datasets, showing that it extracts overlapping triples effectively. Finally, ablation experiments on the MLA model verify the contribution of each component.
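The abstract does not give RCBERT's training objective in detail; as an illustration only, the idea of "using contrastive learning to separate relation representations" can be sketched with a generic InfoNCE-style loss over hypothetical relation vectors (the function and vector names below are assumptions, not the thesis's actual implementation):

```python
import math

def cosine(u, v):
    """Cosine similarity between two plain-list vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style contrastive loss: low when the anchor relation
    representation is close to its positive and far from negatives."""
    sims = [cosine(anchor, positive)] + [cosine(anchor, n) for n in negatives]
    exps = [math.exp(s / temperature) for s in sims]
    return -math.log(exps[0] / sum(exps))

# Toy relation vectors: the loss drops as the positive pair grows similar.
anchor = [1.0, 0.0]
negatives = [[0.0, 1.0], [-1.0, 0.0]]
loss_close = info_nce(anchor, [0.9, 0.1], negatives)  # similar positive
loss_far = info_nce(anchor, [0.0, 1.0], negatives)    # dissimilar positive
```

Minimizing such a loss pushes the representations of different relations apart, which is the property the thesis verifies by measuring pairwise similarity of RCBERT's learned relation representations.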
Keywords/Search Tags: Named Entity Recognition, Relation Extraction, Pre-trained Model, Attention