
Research On Automatic Construction Of Domain-specific Knowledge Graph

Posted on: 2019-04-04
Degree: Master
Type: Thesis
Country: China
Candidate: Y X Li
GTID: 2428330566498640
Subject: Computer Science and Technology
Abstract/Summary:
In recent years, with the continuous improvement of computer performance and the rapid growth of Web information, there has been a trend toward structuring Web information into knowledge, that is, building a semantic network of concepts, entities, and relations known as a knowledge graph. According to the scope of the knowledge they cover, knowledge graphs can be divided into open knowledge graphs and domain knowledge graphs. Since Google applied knowledge graphs to its search engine in 2012, open knowledge graphs have made great progress in both industry and academia. Domain knowledge graphs, by comparison, have not been fully researched. Because each domain has its own structural requirements, building a domain knowledge graph typically faces problems such as a lack of labeled corpora and high labor costs. To address these problems, we take the construction of a financial knowledge graph as an example and study the automatic construction of knowledge graphs. The main contributions of this thesis are as follows.

First, using the financial knowledge graph as a case study, we explore a general method for constructing domain knowledge graphs. We establish a complete construction pipeline that includes a distributed crawler, a crowdsourcing annotation platform, a named entity recognition algorithm, and a relation extraction algorithm.

Second, to address named entity recognition in the financial domain, we study methods for improving domain named entity recognition based on CRF and BiLSTM. To cope with the lack of labeled data, we combine active learning with CRF; with only a small amount of manual tagging, the F-score of the CRF rises to 91.46%. We also use pre-trained domain word vectors as the input to a BiLSTM+CRF model, which reaches an F-score of 91.76%.

Third, to address relation extraction in the financial domain, we extract five relations: merger, acquisition, holding, transfer, and investment. By constructing word features, position features, and grammatical features, we cast the problem as a machine learning classification problem. We compare several traditional classification algorithms and use the Deep Forest model to mine deep feature combinations. Experiments show that the F-score improves on all classification tasks.
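The abstract does not say which query strategy was paired with the CRF, so the following is only a minimal sketch of one common choice, entropy-based uncertainty sampling. It assumes a trained CRF can supply per-token marginal tag distributions for each unlabeled sentence. The names `sentence_uncertainty`, `select_for_annotation`, and the toy `pool` are hypothetical and do not come from the thesis.

```python
import math
from typing import Dict, List, Sequence

# Assumed input: for each unlabeled sentence, the per-token marginal
# distribution over tags that a trained CRF would produce.
TokenMarginals = Sequence[Dict[str, float]]


def sentence_uncertainty(marginals: TokenMarginals) -> float:
    """Mean token entropy (in nats); higher means the model is less sure."""
    if not marginals:
        return 0.0
    total = 0.0
    for dist in marginals:
        total += -sum(p * math.log(p) for p in dist.values() if p > 0.0)
    return total / len(marginals)


def select_for_annotation(pool: Dict[str, TokenMarginals], k: int) -> List[str]:
    """Return the ids of the k most uncertain sentences in the pool."""
    ranked = sorted(
        pool,
        key=lambda sid: sentence_uncertainty(pool[sid]),
        reverse=True,
    )
    return ranked[:k]


# Toy pool with made-up marginals (illustrative values, not thesis data).
pool = {
    "s1": [{"B-ORG": 0.98, "O": 0.02}, {"O": 0.99, "B-ORG": 0.01}],
    "s2": [{"B-ORG": 0.55, "O": 0.45}, {"O": 0.60, "I-ORG": 0.40}],
    "s3": [{"O": 1.0}],
}
picked = select_for_annotation(pool, k=1)
print(picked)
```

In a loop of this kind, the selected sentences would be sent for manual tagging, added to the training set, and the CRF retrained before the next round. The batch size `k` and the stopping rule are design choices the abstract leaves unspecified.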
Keywords/Search Tags:knowledge graph, named entity recognition, relation extraction