
Research On Hierarchical Relation Extraction Under Distant Supervision

Posted on: 2022-11-13
Degree: Master
Type: Thesis
Country: China
Candidate: E X Yu
GTID: 2518306758980289
Subject: Applied Statistics
Abstract/Summary:
Relation extraction aims to extract the relations between entities from unstructured text and is an important research direction in natural language processing. In the era of deep learning, supervised relation extraction can achieve high accuracy, but it requires a large amount of labeled training data. In the real world, however, relations are numerous and text data is complex, so acquiring labeled training data at scale is labor intensive. Distant supervision was proposed to address this shortage. It relies on existing knowledge bases and makes a strong assumption: if two entities are linked by a relation in a knowledge base, then every sentence containing both entities expresses that relation. This assumption makes it possible to label a large amount of unstructured text as training data for relation extraction, but it also introduces two problems: mislabeling and long-tail relations. To address both problems, this thesis uses the hierarchical information of relations to extract relations under distant supervision, and proposes a hierarchical bag representation strategy and a top-down relation classification strategy. The specific research contents are as follows:

(1) Sentences containing the same two entities are grouped into a bag and labeled with the same relation; the goal is to predict the relation expressed by each bag. This thesis uses hierarchical relation information for hierarchical classification, but the semantics captured at different relation levels differ. A hierarchical bag representation is proposed to handle this. First, an entity-aware embedding method is used to enhance the entity information in the text representation. Second, a piecewise convolutional neural network (PCNN) is adopted to extract the features of the sentences in each bag. Finally, a hierarchical attention mechanism is proposed that learns separate weights at each relation level. This yields a bag representation for each relation level, which lays the foundation for hierarchical relation extraction.

(2) To avoid training a large number of classifiers for hierarchical relation extraction, a top-down relation extraction strategy is proposed. Because the number of relations in relation extraction is large, hierarchical relation extraction would otherwise require many local relation classifiers. With the proposed strategy, classifiers at different relation levels share parameters. The strategy consists of two parts: a bag representation model fused with local label information, and a local relation extraction method based on a label matching model. Sharing the same parameters across relation levels greatly reduces the number of relation classifiers.

(3) Extensive experiments are conducted on NYT, the classic dataset for distant supervision. The proposed method is 4% higher in accuracy than the previous best method, and 27.6% better than the previous best method on long-tail relation extraction. The experimental results show that hierarchical relation extraction under distant supervision can effectively alleviate the problems of mislabeling and long-tail relations and improve the accuracy of relation extraction.
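
The distant supervision assumption and the bag construction described above can be illustrated with a small Python sketch. It is not the thesis's implementation: the toy knowledge base, the example sentences, and the helper names `build_bags` and `relation_levels` are hypothetical, entity matching is done by naive substring search, and the relation hierarchy is assumed to follow the path-like relation names used in the NYT dataset.

```python
from collections import defaultdict

# Hypothetical toy knowledge base: (head entity, tail entity) -> relation.
KB = {
    ("Barack Obama", "Honolulu"): "/people/person/place_of_birth",
    ("France", "Paris"): "/location/country/capital",
}

# Hypothetical corpus. The second sentence mentions both entities but does
# not express the birth-place relation, so it becomes a mislabeled instance.
SENTENCES = [
    "Barack Obama was born in Honolulu.",
    "Barack Obama visited Honolulu last week.",
    "Paris is the capital of France.",
]


def build_bags(sentences, kb):
    """Group sentences by entity pair and label each bag with the KB relation.

    Every sentence that contains both entities of a KB fact is assumed to
    express that fact's relation (the distant supervision assumption).
    """
    bags = defaultdict(list)
    for text in sentences:
        for (head, tail), relation in kb.items():
            if head in text and tail in text:  # naive substring matching
                bags[(head, tail, relation)].append(text)
    return dict(bags)


def relation_levels(relation):
    """Split a path-like relation name into labels from coarse to fine."""
    parts = relation.strip("/").split("/")
    return ["/" + "/".join(parts[:i]) for i in range(1, len(parts) + 1)]


bags = build_bags(SENTENCES, KB)
for (head, tail, relation), texts in bags.items():
    print(head, "|", tail, "|", relation_levels(relation), "|", len(texts))
```

The bag for Barack Obama and Honolulu contains one sentence that does not express the labeled relation, which is the mislabeling problem the abstract refers to. The per-level labels returned by `relation_levels` are the kind of targets a top-down classifier would predict from coarse to fine.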
Keywords/Search Tags: Natural Language Processing, Knowledge Graph, Distant Supervision, Hierarchical Relation Extraction