Research On Entity Relationship Extraction In Monolingual And Crosslingual Situations

Posted on: 2021-07-20
Degree: Master
Type: Thesis
Country: China
Candidate: M Y Wang
GTID: 2518306245981909
Subject: Applied Statistics
Abstract:
With the development of modern information technology, the era of "big data" has arrived, and the amount of data generated by human society keeps growing. How to extract useful information quickly and efficiently from unstructured text is an important open problem. Entity relation extraction (ERE), a core task of text mining and information extraction, models text to automatically extract the semantic relations between entity pairs in sentences, yielding usable semantic knowledge. It has been widely used in tasks such as machine translation, automatic question answering, text summarization, semantic web annotation, and knowledge graph construction.

Most previous relation extraction models focused on monolingual data, mainly English, which has rich annotation resources, and various neural network models built on English data have achieved good results. For languages with relatively scarce annotated corpora (such as Japanese and French), building an effective relation extraction model is harder: manually annotated datasets are expensive and time-consuming to obtain, and annotation noise is difficult to remove from distantly supervised datasets.

Two problems follow from this: monolingual entity relation extraction models come in many structural variants, and such models are difficult to build for languages with scarce corpora. This thesis therefore addresses two research questions. In the monolingual setting, we empirically study how encoders and selectors with different network structures affect relation extraction, and we analyze the influence of the model's internal mechanisms. In the cross-lingual setting, where the source language has a rich annotated corpus and the target language a relatively scarce one, we use cross-lingual knowledge to bridge the source and target languages, obtain a labeled corpus for the target language, and use it to build a cross-lingual entity relation extraction model.

Specifically, the main work of this thesis consists of the following two parts:

(1) We propose a deep-learning-based entity relation extraction model for the monolingual setting and explore the performance of distantly supervised relation extraction in two languages, English and Chinese. The monolingual model consists mainly of an encoder and a selector. Four encoder network structures are considered: Convolutional Neural Network (CNN), Piecewise Convolutional Neural Network (PCNN), Long Short-Term Memory network (LSTM), and Gated Recurrent Unit (GRU). Three selection methods are considered for the selector: attention mechanism, maximum selection, and average selection. On the English distantly supervised relation extraction dataset NYT and the Chinese distantly supervised person relation extraction dataset CCKS2019-IPRE, we explore how the different encoders and selectors affect model performance.

(2) We propose a Cross-Lingual Adversarial Relation Extraction (CLARE) framework, which decomposes cross-lingual relation extraction into parallel corpus acquisition and adversarial adaptation relation extraction. The source-language relation extraction dataset is first converted into a target-language dataset through dictionary expansion or self-learning methods. On this basis, the feature representation of the source language is transferred to the target language using adversarial feature adaptation, and the resulting trained target-language relation extraction network is used to classify target-language instances. The method is applied to the English-Chinese and Chinese-English cross-lingual relation extraction tasks based on the ACE2005 multilingual dataset. The F1 values of the best models on the two tasks are 0.8801 and 0.8422 respectively, indicating that the proposed CLARE framework can significantly improve entity relation extraction for low-resource languages.

The research results are of great significance for improving relation extraction models in the cross-lingual setting and for promoting the application of relation extraction research in the field of information science.
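The three selector types named in part (1) can be illustrated with a minimal sketch. It assumes that each entity pair comes with a "bag" of sentence vectors already produced by an encoder, and that a selector reduces the bag to a single vector using a relation query vector. The function names, the dot-product scoring, and the reading of "maximum selection" as picking the single highest-scoring sentence are assumptions made for illustration; the abstract does not specify these details.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def average_select(bag):
    """Average selection: element-wise mean of all sentence vectors in the bag."""
    dim = len(bag[0])
    return [sum(vec[i] for vec in bag) / len(bag) for i in range(dim)]


def max_select(bag, query):
    """Maximum selection (assumed reading): keep only the sentence vector
    that scores highest against the relation query vector."""
    return list(max(bag, key=lambda vec: dot(vec, query)))


def attention_weights(bag, query):
    """Softmax over the dot-product scores of each sentence against the query."""
    scores = [dot(vec, query) for vec in bag]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_select(bag, query):
    """Attention selection: weighted sum of sentence vectors,
    with weights given by attention_weights."""
    weights = attention_weights(bag, query)
    dim = len(bag[0])
    return [sum(w * vec[i] for w, vec in zip(weights, bag)) for i in range(dim)]


if __name__ == "__main__":
    # Hypothetical bag of two 2-dimensional sentence vectors and a query.
    bag = [[1.0, 0.0], [0.0, 1.0]]
    query = [1.0, 0.0]
    print(average_select(bag))
    print(max_select(bag, query))
    print(attention_select(bag, query))
```

In this toy bag, average selection treats both sentences equally, maximum selection keeps only the first sentence, and attention selection falls between the two by weighting the first sentence more heavily without discarding the second.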
Keywords: entity relation extraction, single-language, cross-language, GAN, deep learning