
Geographic Domain Concept Relation Extraction Of Primary Education Based On Transfer Learning

Posted on: 2018-12-28
Degree: Master
Type: Thesis
Country: China
Candidate: N Wang
GTID: 2428330596954784
Subject: Computer Science and Technology
Abstract/Summary:
This thesis takes primary-education geography as its research object and studies concept relation extraction in the geographic domain. At present, however, the geographic domain lacks a corpus large enough to support such research. The conventional solution is to expand the corpus manually, but the labor cost is too high. As an alternative, several transfer learning methods have been proposed that transfer knowledge from a source domain to a target domain. This thesis therefore studies how transfer learning can compensate for the insufficient geographic corpus and thereby improve the accuracy of geographic concept relation extraction. The main work is as follows:

1) Given the sequential nature of primary-education geographic text, an LSTM neural network is used to build a concept relation extraction model based on word features and sentence features. Geographic concept relation extraction is defined as classifying the semantic relation that a sentence expresses between a pair of concepts. The word feature is the basic feature of a concept pair and is extracted with Word Embedding. The LSTM is well suited to sequence data because it makes effective use of long-distance dependencies; since the sentence feature carries the semantic information of the whole sentence sequence, the LSTM is used to extract the sentence feature of each sentence containing a concept pair. The accuracy of this model drops, however, when the geographic concept relation corpus is insufficient.

2) To address the lack of geographic concept relation corpus, an LSTM-based transfer learning method is proposed that transfers knowledge from the open domain to the geographic domain in order to improve the accuracy of concept relation extraction there. The method has two parts. First, because the Sogou word vectors are trained on a large amount of data, their feature representation is more accurate than that of word vectors trained on a small amount of geographic text. A word vector transfer learning method based on Word Embedding is therefore proposed, which transfers the Sogou word vectors to the geographic domain for comparative experiments. Because the feature space of the word vectors is inconsistent after the transfer, the accuracy improvement of this method is not obvious. Second, a new transfer learning method based on network weights is proposed, which first transfers the LSTM network weights trained on open-domain text to the geographic domain. The transferred weights are then handled in two ways for comparative experiments: kept frozen, or fine-tuned by retraining. The experimental results show that the accuracy of geographic concept relation extraction improves significantly when the transferred network weights are fine-tuned by retraining on the geographic text. However, this method cannot solve the domain adaptation problem caused by inconsistent data distributions during knowledge transfer.

3) To address the domain adaptation problem that arises when knowledge is transferred from the open domain to the geographic domain, a transfer learning model for the geographic domain based on multiple latent feature space layers is constructed. First, the concepts shared by the open domain and the geographic domain are used to generate one common latent feature space. Then, the concepts that differ are used to generate three specific latent feature spaces for the open domain and the geographic domain respectively. Finally, the three specific latent feature spaces are combined with the common latent feature space into three latent feature space layers, which are used to learn the data distributions simultaneously. NMTF is used to solve the resulting optimization problem. This method alleviates the domain adaptation problem between the open domain and the geographic domain only to a limited extent.
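The comparison between frozen and fine-tuned transferred weights described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: a tiny feed-forward network stands in for the LSTM, and the names `pretrained`, `target_data`, `forward`, and `train`, as well as all numeric values, are hypothetical and chosen only for illustration. The "encoder" plays the role of the weights transferred from the open domain, and the "head" plays the role of the relation classifier trained on the target domain.

```python
import copy
import math


def forward(params, x):
    """Tiny stand-in network: a tanh encoder layer followed by a sigmoid head."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)))
              for row in params["encoder"]]
    logit = sum(v * h for v, h in zip(params["head"], hidden)) + params["bias"]
    return hidden, 1.0 / (1.0 + math.exp(-logit))


def train(params, data, freeze_encoder, epochs=200, lr=0.1):
    """SGD on binary cross-entropy; the encoder is updated only if not frozen."""
    params = copy.deepcopy(params)  # leave the transferred weights untouched
    for _ in range(epochs):
        for x, y in data:
            hidden, prob = forward(params, x)
            err = prob - y  # dLoss/dlogit for sigmoid + cross-entropy
            if not freeze_encoder:
                for i, row in enumerate(params["encoder"]):
                    # Backpropagate through the head weight and the tanh.
                    grad_pre = err * params["head"][i] * (1.0 - hidden[i] ** 2)
                    for j in range(len(row)):
                        row[j] -= lr * grad_pre * x[j]
            # The head (classifier) is always trained on the target domain.
            for i in range(len(params["head"])):
                params["head"][i] -= lr * err * hidden[i]
            params["bias"] -= lr * err
    return params


# Hypothetical weights standing in for a model trained on open-domain text.
pretrained = {
    "encoder": [[0.5, -0.3], [0.2, 0.8]],
    "head": [0.1, -0.1],
    "bias": 0.0,
}
# Hypothetical toy target-domain data: (feature vector, binary relation label).
target_data = [([1.0, 0.0], 1), ([0.0, 1.0], 0), ([1.0, 1.0], 1), ([0.0, 0.0], 0)]

frozen = train(pretrained, target_data, freeze_encoder=True)
fine_tuned = train(pretrained, target_data, freeze_encoder=False)
print("frozen encoder unchanged:", frozen["encoder"] == pretrained["encoder"])
print("fine-tuned encoder changed:", fine_tuned["encoder"] != pretrained["encoder"])
```

In the frozen setting only the classifier adapts to the target data, while in the fine-tuned setting the transferred layer is also retrained. The sketch shows the mechanics of the two settings only and makes no claim about which one yields higher accuracy on real data.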
Keywords/Search Tags: Geographic Domain, LSTM, Concept Relation Extraction, Transfer Learning