Research On Relation Extraction Based On Distant Supervision

Posted on: 2021-07-29
Degree: Master
Type: Thesis
Country: China
Candidate: H Y Yuan
GTID: 2518306107989759
Subject: Computer Science and Technology
Abstract/Summary:
Relation extraction is an important research topic in information extraction and one of the core tasks of natural language processing. To avoid the labor and time cost of labeling data at scale, distant supervision was proposed as a way to extract factual relations from very large corpora. Distantly supervised relation extraction assumes that if a knowledge base records a relation between two entities, then every sentence containing both entities expresses that relation. This assumption inevitably produces wrongly labeled examples, which reduces extraction performance.

Current mainstream distantly supervised methods usually use deep learning to extract features. Building on this, prior work improves the model's ability to recognize noise through various attention mechanisms, model denoising, the fusion of external information, and attention to overlapping relations. Other work starts from the sentence structure itself and improves overall model performance by reducing internal noise during feature extraction. However, existing methods still do not fully consider how entities influence sentence structure during feature extraction, which limits the semantic and structural information available to the model. In addition, because distant supervision introduces label noise, the robustness of the model must also be considered.

To address the underuse of entity-related sentence structure information, this thesis proposes a distantly supervised relation extraction model based on sentence segmentation features. The model segments the sequence features, obtains the global information of the segmented sentence by mean pooling, and fuses it into the segment to which it belongs. This fusion takes into account the structural information of entity pairs in sentences, strengthens both the local representation of words and the contextual feature representation after segmentation, and enriches the structural and other latent information carried by the sequence features.

To address the effect of noisy distantly supervised data on model performance, this thesis also proposes a distantly supervised relation extraction model that introduces transfer learning. Because random parameter initialization affects the efficiency and effectiveness of training, the model uses transfer learning to obtain prior knowledge of coarse-grained noise classification from a source task. The target task initializes its shared parameters from the source task, which improves its fine-grained classification of relation types and makes relation classification more robust on low-quality corpora.

Experiments are conducted on NYT, a widely used dataset. The results show that both models outperform the baseline models on three commonly used evaluation metrics.
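The segment-level fusion described in the abstract can be illustrated with a small sketch. The abstract does not give the segmentation rule or the fusion operator, so the code below makes two assumptions: the sentence is cut into three segments at the two entity positions, and "fusion" means concatenating each segment's mean vector onto every token vector in that segment. The names `mean_pool` and `fuse_segment_means` are hypothetical and do not come from the thesis.

```python
from typing import List, Sequence

Vector = List[float]


def mean_pool(vectors: Sequence[Vector]) -> Vector:
    """Average a non-empty list of equal-length vectors, dimension by dimension."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def fuse_segment_means(features: List[Vector], head_idx: int, tail_idx: int) -> List[Vector]:
    """Append each segment's mean vector to every token vector in that segment.

    Assumption (not stated in the abstract): segments are split at the two
    entity positions, giving [0, first], (first, second], (second, end).
    """
    first, second = sorted((head_idx, tail_idx))
    bounds = [(0, first + 1), (first + 1, second + 1), (second + 1, len(features))]

    fused: List[Vector] = []
    for start, end in bounds:
        segment = features[start:end]
        if not segment:
            # A segment can be empty, e.g. when an entity is the last token.
            continue
        summary = mean_pool(segment)
        # Assumed fusion operator: concatenation of token and segment mean.
        fused.extend(token + summary for token in segment)
    return fused


# Five tokens with 2-dimensional features; entities at positions 1 and 3.
token_features = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0], [7.0, 6.0], [9.0, 8.0]]
fused_features = fuse_segment_means(token_features, head_idx=1, tail_idx=3)
```

Each output vector is twice as wide as its input: the first half is the original token feature and the second half is the summary of the segment that token falls in. In a real model the token features would come from an encoder, and the fusion could equally be an addition or a gating step.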
Keywords/Search Tags: Distant supervision, Relation extraction, Sentence segmentation feature, Transfer learning