Research On Relation Extraction In Specific Domain Based On Prior Knowledge

Posted on: 2021-02-04
Degree: Master
Type: Thesis
Country: China
Candidate: K An
Full Text: PDF
GTID: 2518306461970309
Subject: Computer Science and Technology
Abstract/Summary:
With the growth of knowledge graph research in specific fields, the construction of domain knowledge graphs has gradually become a research hotspot. However, because domain expertise is complex and some fields have a narrow range of application, domain relation extraction lacks both relation extraction models suited to the field and domain relation extraction datasets. Moreover, the prior knowledge that industry has accumulated in a specific field currently plays only a limited role in knowledge graph construction and relation extraction. We take the vocabulary information and triple knowledge obtained from domain corpus text and existing knowledge bases as domain prior knowledge, and propose a relation extraction model that incorporates domain prior vocabulary. Taking the field of metal materials as an example, we also construct a relation extraction dataset based on domain triple knowledge and the distant supervision data annotation method. We have done the following research work.

1) We propose a relation extraction model that incorporates domain prior vocabulary. In relation extraction, a given relation category is often signaled in a sentence by a set of relation description words, so the category can be determined from those words. We therefore take the relation description vocabulary that is highly correlated with each relation category in the corpus text as the domain relation prior vocabulary, and use a deep learning method to fuse this prior vocabulary into a convolutional neural network model to assist relation classification. The evaluation results show that domain prior vocabulary knowledge improves the relation extraction performance of the model, indicating that the model can be applied to relation extraction in a specific domain.

2) We design a method for constructing a domain relation extraction dataset based on distant supervision. To address the problem of labeling relational data in a specific domain, we designed domain text and triple extraction algorithms to obtain a large amount of domain text and domain triple knowledge, and used open information extraction tools to extract triple knowledge from the domain corpus. The distant supervision data annotation method is then used to label the domain text, yielding a relation extraction dataset for the specific domain. In the experiments, we took the field of metal materials as an example and constructed a metal materials relation extraction dataset containing more than 20,000 relation-labeled texts.

3) We designed different noise reduction methods for the different types of noise in the domain relation extraction dataset. When the domain dataset is constructed, the acquired domain corpus contains non-domain text and the distant supervision method may generate wrong annotations, so the dataset contains two kinds of noisy data: domain-type noise and supervised-type noise. For domain-type noise, a text classification method is used to reduce its impact; for supervised-type noise, a reinforcement learning model is used to identify false labels. The noise reduction experiments show that both the quality of the dataset and the performance of the relation extraction model improve after the two noise processing methods are applied.
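The distant supervision labeling step described in 2) can be illustrated with a minimal sketch. This is not the thesis's implementation: the `Triple` and `LabeledSentence` types, the `distant_label` function, the relation names, and the example sentences are all hypothetical, and entity matching is reduced to case-insensitive substring search. The sketch assumes the common distant supervision heuristic that a sentence mentioning both entities of a known triple expresses that triple's relation.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    """A (head, relation, tail) fact from a domain knowledge base (hypothetical schema)."""
    head: str
    relation: str
    tail: str


class LabeledSentence(NamedTuple):
    """A sentence annotated with an entity pair and a relation label."""
    text: str
    head: str
    tail: str
    relation: str


def distant_label(sentences: List[str], triples: List[Triple]) -> List[LabeledSentence]:
    """Label each sentence with every triple whose head and tail both occur in it.

    Assumption: co-occurrence of both entities implies the relation holds.
    This is exactly the heuristic that can produce wrong labels.
    """
    labeled = []
    for text in sentences:
        lowered = text.lower()
        for t in triples:
            # Naive substring matching stands in for real entity recognition.
            if t.head.lower() in lowered and t.tail.lower() in lowered:
                labeled.append(LabeledSentence(text, t.head, t.tail, t.relation))
    return labeled


# Hypothetical metal-materials triples and sentences, for illustration only.
triples = [
    Triple("stainless steel", "contains", "chromium"),
    Triple("aluminium", "has_property", "low density"),
]
sentences = [
    "Stainless steel contains chromium.",
    "Aluminium is valued for its low density.",
    "The chromium plating line sits next to the stainless steel warehouse.",
    "Copper conducts electricity well.",
]

for item in distant_label(sentences, triples):
    print(item.relation, "|", item.text)
```

In this sketch the third sentence is labeled `contains` even though it does not express that relation, which is the kind of wrong annotation the abstract calls supervised-type noise. The fourth sentence matches no triple and is left unlabeled.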
Keywords/Search Tags:Specific field, Relation extraction, Prior knowledge, Convolutional neural network, Distant supervision