Research On Posterior Regularization For Relation Extraction With Distant Supervision

Posted on: 2021-05-13
Degree: Master
Type: Thesis
Country: China
Candidate: Q Zhang
GTID: 2428330623469122
Subject: Computer Science and Technology
Abstract/Summary:
Relation extraction is an important task in natural language processing. It supports many downstream tasks, such as question answering and knowledge base population. Because building a relation extraction model requires a large amount of training data, which is expensive to obtain, distant supervision for relation extraction has become a research hotspot. Two problems remain: the noisy data produced by distant supervision, and the diversity inhibition problem caused by the current mainstream relation extraction models based on selective attention. We use the posterior regularization framework to integrate expert knowledge of relation categories, and propose two corresponding solutions to alleviate these problems.

(1) We use the posterior regularization framework to integrate the knowledge of human experts in relation extraction into the instance selection policy, and propose a rule-based instance selection policy. This improves the training efficiency of the instance selection policy, reduces meaningless exploration during policy gradient training, and improves the performance of the relation extraction model trained on the selected data. This method achieved state-of-the-art results in the field. Because the rule-based selection strategy dynamically determines which instances are retained in each bag, the quality of a bag can be evaluated by the number of instances it retains.

(2) We analyze the characteristics and limitations of previous work on distant supervision for relation extraction, and identify the diversity inhibition problem that this work causes. To solve this problem, we propose a new algorithmic framework that uses a clustering algorithm to construct bags dynamically, and we introduce two reliability factors that evaluate the reliability of a clustered bag by incorporating expert knowledge of relation categories. At the same time, we use the posterior regularization framework to constrain the posterior probability and the loss function, which both enriches the diversity of the bags and reduces the noisy data problem.

In the experiments, we construct a new evaluation dataset for sentence-level relation extraction. We design a verification experiment to demonstrate that the diversity inhibition problem is widespread, and compare the main performance metrics of our methods with those of influential previous work on sentence-level relation extraction. On both the new sentence-level evaluation dataset and an existing public, authoritative dataset, our method achieves state-of-the-art performance. The paper describing the rule-based instance selection policy proposed in this thesis was accepted at NAACL 2019, a top conference in natural language processing. The relation extraction system based on this algorithm won first place in the National Institute of Standards and Technology (NIST) TAC-DDI 2018 competition.
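The abstract does not state the exact formulation used in the thesis. As a rough illustration of the general posterior regularization idea only, the sketch below assumes the commonly used form in which a model posterior p(y|x) is reweighted by rule penalties, q(y) ∝ p(y|x) · exp(-strength · penalty(y)). The function name, the penalty values, and the three-label example are all hypothetical and are not taken from the thesis.

```python
import math


def regularize_posterior(probs, penalties, strength=1.0):
    """Reweight a model posterior by rule penalties (illustrative sketch).

    Computes q(y) proportional to p(y|x) * exp(-strength * penalty(y)),
    then renormalizes so that q sums to 1. This is an assumed, generic
    form of posterior regularization, not the thesis's own method.
    """
    if len(probs) != len(penalties):
        raise ValueError("probs and penalties must have the same length")
    weights = [p * math.exp(-strength * c) for p, c in zip(probs, penalties)]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all labels have zero weight")
    return [w / total for w in weights]


# Hypothetical example: three relation labels, where an expert rule
# penalizes label 0 for this instance and leaves the others untouched.
model_posterior = [0.5, 0.3, 0.2]
rule_penalty = [1.0, 0.0, 0.0]
q = regularize_posterior(model_posterior, rule_penalty, strength=2.0)
```

In this sketch, a larger `strength` pushes more probability mass away from labels that violate a rule, while `strength=0` leaves the model posterior unchanged.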
Keywords/Search Tags: distant supervision, relation extraction, posterior regularization, reinforcement learning, clustering