
Distant Supervision Relation Extraction Model Based On Bag Reconstruction

Posted on: 2022-12-16
Degree: Master
Type: Thesis
Country: China
Candidate: S J Li
GTID: 2518306758491634
Subject: Automation Technology
Abstract/Summary:
In recent years, the Internet has grown at a rapid pace, generating a large amount of text data. How to make effective use of the knowledge contained in this text has become a challenge. Knowledge Extraction extracts information from text drawn from different sources on the Internet and turns it into structured data that can be stored in a Knowledge Graph. A Knowledge Graph represents entities in the objective world and the relations between them as a graph, so that people can use them efficiently. As a subtask of Knowledge Extraction, Relation Extraction identifies the relations between entity pairs in order to obtain triplet information from text. Distant Supervision Relation Extraction makes it possible to build Relation Extraction datasets automatically, which greatly reduces cost but introduces a large amount of noisy data. This noise is a core problem of Distant Supervision Relation Extraction.

To reduce the impact of noisy sentences, traditional Distant Supervision Relation Extraction methods group sentences that share the same relation label into a sentence bag and then apply a sentence-level Attention Mechanism. However, in the NYT dataset a large number of sentence bags contain only negative samples. We therefore propose a Distant Supervision Relation Extraction method based on Bag Reconstruction. The basic idea is as follows: we use a Full-Label based method to obtain the sentence bag representation and predict the relation of the sentence bag; based on the prediction results, the sentence bags are then regrouped to reduce the impact of bag-level noisy samples.

First, when processing the input data, our method incorporates entity embedding information to obtain a higher-quality sentence representation. Second, in the Full-Label based Pre-training stage, we take into account the intrinsic relationship between sentences and relation labels: a negative sample in one sentence bag may be a positive sample for another relation label. To capture this, we introduce a relation embedding matrix and apply an Attention Mechanism, guided by that matrix, to obtain a representation of each sentence bag for every relation label. A classifier then predicts the relation of each sentence bag. Third, to address the problem that many sentence bags contain only negative samples, the Regrouping-based Training stage places sentence bags with the same predicted relation into the same group. Because the representations of sentence bags that share a relation label are relatively close to each other, we use a Multi-Head Self-Attention Mechanism to model the relationships among all sentence bags within a group and generate a group-based representation. Finally, these representations are fed into the classifier to predict their relations.

We evaluate the proposed Bag Reconstruction method on the NYT dataset. Compared with the baselines, our method performs better. Moreover, when the number of sentences per sentence bag is reduced, our model still obtains better classification results, which indicates that its performance is stable and reliable across test sets of different scales. Overall, the experimental results show that, compared with traditional methods, the proposed Bag-Reconstruction based method fully considers the relevance between sentences and all relation labels, and further reduces the impact of noisy data by predicting relations and regrouping sentence bags.
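The two central steps described in the abstract, attention over a bag's sentences for every relation label followed by regrouping bags by predicted relation, can be illustrated with a minimal Python sketch. This is not the thesis implementation. The function names, the dot-product attention score, the dot-product classifier, and the toy vectors are all assumptions made for illustration, and the group-level Multi-Head Self-Attention step is left out.

```python
import math
from collections import defaultdict


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def full_label_bag_representations(sentences, relation_embeddings):
    """Return one bag vector per relation label.

    For each relation embedding, the attention weights over the bag's
    sentences are a softmax of sentence-relation dot products
    (an assumed scoring rule, not taken from the thesis).
    """
    dim = len(sentences[0])
    bag_reps = []
    for rel in relation_embeddings:
        weights = softmax([dot(s, rel) for s in sentences])
        rep = [sum(w * s[i] for w, s in zip(weights, sentences))
               for i in range(dim)]
        bag_reps.append(rep)
    return bag_reps


def predict_relation(sentences, relation_embeddings):
    """Assumed classifier: score relation r by dot(bag_rep_r, embedding_r)."""
    bag_reps = full_label_bag_representations(sentences, relation_embeddings)
    scores = [dot(rep, rel) for rep, rel in zip(bag_reps, relation_embeddings)]
    return max(range(len(scores)), key=scores.__getitem__)


def regroup_bags(bags, relation_embeddings):
    """Group bag ids by the relation index predicted for each bag."""
    groups = defaultdict(list)
    for bag_id, sentences in bags.items():
        predicted = predict_relation(sentences, relation_embeddings)
        groups[predicted].append(bag_id)
    return dict(groups)


# Toy data: 2-d sentence vectors and two relation embeddings (all made up).
relation_embeddings = [[1.0, 0.0], [0.0, 1.0]]
bags = {
    "bag_a": [[2.0, 0.1], [1.5, 0.0]],
    "bag_b": [[0.0, 2.0], [0.2, 1.0]],
    "bag_c": [[3.0, 0.5]],
}
groups = regroup_bags(bags, relation_embeddings)
print(groups)
```

In this toy setup, bags whose sentences point along the same relation embedding end up in the same group. In a trained model the sentence vectors, relation embeddings, and classifier would be learned rather than fixed by hand.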
Keywords/Search Tags: Distant Supervision Relation Extraction, Bag Reconstruction, Full Label, Multi-Head Self-Attention Mechanism