
Research On Training Example Selection In Distant Supervision For Relation Extraction

Posted on: 2021-04-27
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y C Gui
Full Text: PDF
GTID: 1488306557985109
Subject: Software engineering
Abstract/Summary:
Distant supervision for relation extraction produces large-scale labeled data automatically under the supervision of existing knowledge bases, thus reducing the reliance on manual annotation. However, the noise in these automatically labeled data may hurt the performance of relation extraction if the data are used for training directly. Training example selection is an important task for addressing noisy data in distant supervision for relation extraction: it selects training examples with correct labels from the training set to reduce the impact of noise on relation extraction performance.

There are two types of training example selection approaches, i.e., implicit and explicit. The implicit approaches are mainly based on Probabilistic Graphical Models (PGM) and Deep Neural Networks (DNN). The former estimates confidence scores of training examples via hidden variables, and examples with high confidence scores are used for training; however, the remaining correct training examples cannot be fully utilized. The latter uses the attention mechanism to adjust the weights of training examples and thereby reduce the impact of noise on the relation extraction model; however, the noise cannot be removed from the training set directly. The explicit approaches are mainly based on domain knowledge and Reinforcement Learning (RL). The former utilizes a single type of domain knowledge and cannot comprehensively exploit multiple types of domain knowledge. The latter mainly employs on-policy reinforcement learning algorithms, and off-policy reinforcement learning algorithms have not been studied systematically for this task. To address these problems, we propose the following solutions.

(1) With respect to the problems of implicit training example selection approaches, an explicit approach based on Explanation-Based Learning (EBL) is proposed for the first time. It employs the Answer Set Programming (ASP) language to represent domain knowledge and training example selection rules, and the EBL algorithm is further improved to learn ASP rule sets from imperfect domain knowledge. This approach can make full use of the correct training examples and remove noise from the training set. Experimental results show that it learns ASP rules for training example selection effectively and achieves a 30% improvement in recall over a PGM-based baseline.
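The abstract does not reproduce the learned ASP rule sets, so the following is only a hypothetical illustration of the explicit, rule-based selection step: an ASP-style rule (in the comment) and a plain Python filter applying the same test to distantly labeled sentences. The relation, keyword lexicon, and features are invented for illustration.

```python
# Hypothetical illustration of explicit training example selection.
# The dissertation's actual ASP rules are not given in the abstract;
# the rule and features below are invented for illustration only.
#
# An ASP-style selection rule might read:
#   select(S) :- labeled(S, "place_of_birth"),
#                mentions_birth_keyword(S).
# i.e., keep a sentence S distantly labeled place_of_birth only if it
# contains explicit birth-related evidence.

BIRTH_KEYWORDS = {"born", "birthplace", "native of"}  # assumed lexicon

def mentions_birth_keyword(sentence: str) -> bool:
    return any(k in sentence.lower() for k in BIRTH_KEYWORDS)

def select(example: dict) -> bool:
    """Keep an automatically labeled example only if a rule fires."""
    if example["label"] == "place_of_birth":
        return mentions_birth_keyword(example["sentence"])
    return True  # rules for other relations would go here

training_set = [
    {"sentence": "Obama was born in Honolulu.", "label": "place_of_birth"},
    {"sentence": "Obama visited Honolulu.", "label": "place_of_birth"},
]
selected = [ex for ex in training_set if select(ex)]  # drops the 2nd example
```

Unlike attention-based down-weighting, a rule that does not fire removes the noisy example from the training set entirely, which is the property the dissertation emphasizes.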
(2) With respect to conflicts between multiple types of domain knowledge, an explicit approach based on Markov Logic Networks (MLN) is proposed. It includes a novel MLN model that captures the relationships between different types of domain knowledge for training example selection (the generic MLN formulation it builds on is sketched after point (3) below). Experimental results show that this approach selects effective domain knowledge for different relations and improves average F1 by 22% on the New York Times (NYT) data set and by 27% on the Wikipedia data set over a baseline using a single type of domain knowledge.

(3) With respect to the lack of systematic study of off-policy reinforcement learning algorithms for training example selection, an explicit approach based on Deep Q-Networks (DQN) is proposed, and the performance of off-policy reinforcement learning algorithms on this task is studied systematically. Among the off-policy RL algorithms, a Top-k behavior policy is used for the first time to generate more effective experiences. Experimental results show that this approach effectively learns training example selection policies from trial-and-error experiences, without domain knowledge or manual annotation. In addition, the off-policy RL algorithms converge 6 times faster than on-policy RL algorithms without degrading training example selection performance.
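For reference on point (2): the abstract does not give the dissertation's specific MLN model, but any MLN defines a log-linear distribution over possible worlds, and combining conflicting knowledge types amounts to weighting their formulas. The standard formulation is

```latex
P(X = x) \;=\; \frac{1}{Z}\exp\!\Big(\sum_{i} w_i\, n_i(x)\Big),
\qquad
Z \;=\; \sum_{x'} \exp\!\Big(\sum_{i} w_i\, n_i(x')\Big)
```

where \(n_i(x)\) counts the true groundings of first-order formula \(F_i\) in world \(x\) and \(w_i\) is its learned weight. Conflicting knowledge sources can coexist as soft formulas, and the learned weights determine which source dominates for a given relation.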
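A minimal sketch of the Top-k behavior policy idea from point (3), assuming a bag of candidate sentences scored by a Q-network. The dissertation's state features, reward, and network architecture are not given in the abstract, so q_values() and all sizes below are placeholders.

```python
import numpy as np

# Minimal sketch of a Top-k behavior policy for off-policy selection.
# q_values() stands in for the DQN; its features and weights here are
# random placeholders, not the dissertation's model.

rng = np.random.default_rng(0)

def q_values(states: np.ndarray) -> np.ndarray:
    """Stand-in for Q(s, select) scores; a real agent would run its
    Q-network over the candidate-sentence states here."""
    return states @ rng.normal(size=states.shape[1])

def top_k_behavior_policy(states: np.ndarray, k: int) -> np.ndarray:
    """Select the k candidates with the highest Q-values instead of a
    single epsilon-greedy action, so each rollout yields k transitions
    for the replay buffer."""
    q = q_values(states)
    return np.argsort(q)[-k:]  # indices of the top-k examples

# One "bag" of 32 candidate sentences with 8 features each (assumed sizes).
bag = rng.normal(size=(32, 8))
chosen = top_k_behavior_policy(bag, k=5)
# The resulting (state, action, reward, next_state) tuples would be
# stored in a replay buffer and reused for off-policy Q-learning,
# which is what allows faster convergence than on-policy methods.
```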
Keywords/Search Tags:Distant Supervision for Relation Extraction, Training Example Selection, Explanation-based Learning, Markov Logic Network, Reinforcement Learning