Distant Supervision Relation Extraction Based On Multi-Head Self-Attention And Entity Feature

Posted on: 2021-04-26    Degree: Master    Type: Thesis
Country: China    Candidate: Q Zhu    Full Text: PDF
GTID: 2428330614971301    Subject: Computer technology
Abstract/Summary:
Relation extraction is an important and fundamental task in natural language processing. Its purpose is to determine the semantic relation between an entity pair in a text sentence, and it plays an important role in applications such as knowledge graphs and intelligent question answering. Traditional supervised relation extraction relies entirely on manual annotation to obtain training corpora, which is time-consuming and labor-intensive. Distant supervision relation extraction, which obtains large amounts of training data through automatic annotation, has therefore gradually become a hot topic in relation extraction.

The premise of distant supervision is that if a relation holds between an entity pair in the knowledge base, then every text sentence containing this entity pair expresses that relation. Under this assumption a large training corpus can be constructed in a short time, but the incorrect labeling that distant supervision introduces leaves a large amount of noise in the dataset. For example, sentences contain many noise words that are unrelated to the relation, which degrades the semantic sentence representation obtained by a neural network model. In addition, existing relation extraction models that use sentence-level attention mechanisms rely directly on entity relation labels that themselves contain much noise, making it difficult to reasonably allocate the contribution of different sentences to the final relation prediction.

Facing these challenges, this paper proposes corresponding solutions. The main innovations and contributions are as follows:

(1) We propose a relation extraction model based on a multi-head self-attention mechanism. After extracting text features through a convolution operation, multi-head self-attention is used to reduce the negative effect of irrelevant noise words on the sentence representation. This not only yields a better semantic representation of the sentence, but also avoids the influence of relation-label noise on the word-level attention distribution.

(2) Building on the above model, we further exploit entity feature information and propose a relation extraction model that integrates entity features. Following the idea of the TransE algorithm, when selecting different sentences to form the entity-bag feature, the implicit relation representation produced by a bilinear transformation of the head and tail entities in each sentence serves as the basis, and a scaled dot-product attention mechanism dynamically allocates the contribution of different sentences to the final relation prediction. In this way a large amount of entity relation label noise is filtered out, further alleviating the adverse effect of noisy data in distant supervision. In addition, at the input layer of the relation extraction model, besides the commonly used word embeddings and position embeddings, we add named entity embeddings and core entity word embeddings, which further enrich the input representation of sentences and help the model obtain more effective features.
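A minimal PyTorch sketch of the sentence encoder described in contribution (1): a 1-D convolution extracts local n-gram features from the embedded sentence, and multi-head self-attention then re-weights those features so that noise words contribute less to the pooled sentence representation. The class name, layer sizes, kernel width, and head count below are illustrative assumptions; the abstract does not specify them.

```python
import torch
import torch.nn as nn

class ConvSelfAttentionEncoder(nn.Module):
    """Sketch of contribution (1): convolution followed by multi-head self-attention.
    All dimensions here are assumptions, not values taken from the thesis."""
    def __init__(self, input_dim=60, hidden_dim=230, num_heads=5):
        super().__init__()
        # 1-D convolution over the token sequence extracts local n-gram features.
        self.conv = nn.Conv1d(input_dim, hidden_dim, kernel_size=3, padding=1)
        # Multi-head self-attention re-weights the convolved features so that
        # words irrelevant to the relation contribute less to the sentence vector.
        self.self_attn = nn.MultiheadAttention(hidden_dim, num_heads, batch_first=True)

    def forward(self, x, padding_mask=None):
        # x: (batch, seq_len, input_dim) -- concatenation of word, position,
        # named entity, and core entity word embeddings, as listed in the abstract.
        h = torch.relu(self.conv(x.transpose(1, 2))).transpose(1, 2)
        attn_out, _ = self.self_attn(h, h, h, key_padding_mask=padding_mask)
        # Max-pool over time to obtain a fixed-size sentence representation.
        return attn_out.max(dim=1).values  # (batch, hidden_dim)
```

Because the self-attention weights are computed from the sentence itself rather than from the bag's relation label, word-level denoising does not depend on the possibly incorrect distant-supervision label.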
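For contribution (2), a hedged sketch of the bag-level attention: an implicit relation query is derived from a bilinear transformation of the head and tail entity representations (in the spirit of TransE), and scaled dot-product attention against this query decides how much each sentence in the bag contributes to the prediction. The bilinear parameterisation, dimensions, and relation count below are assumptions for illustration.

```python
import math
import torch
import torch.nn as nn

class EntityAwareBagAttention(nn.Module):
    """Sketch of contribution (2): sentences in a bag are weighted by how well
    they match an implicit relation vector derived from the entity pair."""
    def __init__(self, hidden_dim=230, num_relations=53):
        super().__init__()
        # Bilinear map from (head, tail) entity representations to an implicit
        # relation query, loosely following the TransE intuition.
        self.bilinear = nn.Bilinear(hidden_dim, hidden_dim, hidden_dim)
        self.classifier = nn.Linear(hidden_dim, num_relations)

    def forward(self, sent_reps, head_reps, tail_reps):
        # sent_reps: (bag_size, hidden_dim) sentence vectors from the encoder
        # head_reps / tail_reps: (bag_size, hidden_dim) entity representations
        query = self.bilinear(head_reps, tail_reps)               # (bag_size, hidden_dim)
        # Scaled dot-product attention between each sentence and its entity query.
        scores = (sent_reps * query).sum(-1) / math.sqrt(sent_reps.size(-1))
        weights = torch.softmax(scores, dim=0)                    # per-sentence contribution
        bag_rep = (weights.unsqueeze(-1) * sent_reps).sum(dim=0)  # (hidden_dim,)
        return self.classifier(bag_rep)                           # relation logits
```

Because the attention weights depend on the entity pair rather than on the noisy relation label, unreliable sentences in a bag receive low weight, which is how the abstract argues that label noise is filtered at the bag level.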
This paper focuses on the task of distant supervision relation extraction, proposes the above methods for reducing the effect of distant supervision noise, and conducts detailed and comprehensive experiments. The experimental results on the NYT-Freebase dataset show that the proposed method has advantages over the baseline models and further improves relation extraction performance. In terms of AUC, the method reaches 38.2%, an improvement of 4.1% over the PCNNs+ATT model and of 2.2% over the BGWA model, which verifies the effectiveness of the proposed method.
Keywords/Search Tags:Relation Extraction, Distant Supervision, Multi-head Self-attention, Entity Feature