Chinese Knowledge Graph Construction Method Based On Multiple Data Sources

Posted on: 2016-07-17
Degree: Doctor
Type: Dissertation
Country: China
Candidate: F H Hu
Full Text: PDF
GTID: 1228330467476656
Subject: Computer application technology
Abstract/Summary:
Since the Semantic Web was proposed, more and more linked open data and user-generated content have been published on the Web. The Web is now changing from a web of documents to a web of data, which contains abundant entities and relations. The knowledge graph was first proposed by Google and focuses on describing real-world entities and concepts and the relations among them. A knowledge graph can be seen as a new form of ontology that extends an ontology at the entity level: an ontology usually focuses on concepts and their relations, which specify the schema of the knowledge graph, while the knowledge graph adds a large number of entities to that ontology. Knowledge graphs are widely used in semantic search, intelligent question answering, knowledge engineering, data mining, and digital libraries.

This thesis studies knowledge graph construction from multiple data sources, building on current research on knowledge graph and ontology construction. The main work and contributions of this thesis are as follows:

1. This thesis exploits structured and semi-structured data on the Web for knowledge graph construction, such as linked open data, online encyclopedias, and domain websites, which have high coverage and are updated quickly. It explains how to extract and learn knowledge from these sources and how to ensure the quality of the constructed knowledge graph.
2. This thesis studies how to construct a knowledge graph from multiple data sources and combine their advantages, including the high precision of data in relational databases, the high coverage of linked open data and public knowledge bases on the Web, and the depth of domain-oriented data. It proposes a knowledge graph construction method over these multiple resources and uses the redundancy among different resources to ensure the precision of the learned knowledge graph.
3. This thesis also explains how to extract knowledge from large-scale web text and proposes an open relation extraction method based on self-supervised learning. The extracted relations include synonymy, hyponymy, and attribute relations among concepts and entities. The main advantage of the method is that it labels training samples automatically, using knowledge extracted from structured or semi-structured data together with some general heuristic rules. To obtain text automatically, this thesis also proposes a heuristic rule-based web information extraction algorithm that extracts the main content of web pages.
4. For domain knowledge graph construction, this thesis focuses on how to use structured domain data and designs a mapping language that specifies how to map data in relational databases to knowledge in the knowledge graph. It also studies how to automatically discover domain data sources on the Web, such as open domain knowledge bases and websites, and proposes a corresponding algorithm.
5. This thesis also develops an online collaborative knowledge graph editing platform that leverages the wisdom of the crowd for knowledge graph generation. The main advantages of the platform are that it supports concurrent editing and that it can be combined with the automatic learning algorithms.

Finally, we construct a general knowledge graph with 7,392,384 entities and 60,842,064 facts based on the proposed algorithms. Comparing the constructed knowledge graph with other knowledge bases and data sets, we find that it has good coverage, while the average precision of its knowledge is above 95%. Moreover, we construct a domain knowledge graph about fishes that contains more than 32 thousand fish species; its good coverage comes from using the most complete existing data sources.
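The automatic labeling idea in contribution 3 can be illustrated with a small sketch. This is not the thesis's algorithm, only a minimal example of the general technique (often called distant supervision): a sentence is labeled as a training sample for a relation when it mentions both arguments of a fact already known from structured data. The seed facts, relation names, and the function `label_sentences` are hypothetical, and simple substring matching stands in for the heuristic rules the abstract mentions.

```python
# Hypothetical seed facts, standing in for knowledge already extracted
# from structured or semi-structured sources (assumption, not thesis data).
SEED_FACTS = {
    ("salmon", "hyponym_of", "fish"),
    ("Beijing", "synonym_of", "Peking"),
}


def label_sentences(sentences, seed_facts):
    """Return (sentence, head, relation, tail) tuples for every sentence
    that mentions both arguments of a seed fact (case-insensitive)."""
    labeled = []
    for sentence in sentences:
        lowered = sentence.lower()
        # Sort the facts so the output order is deterministic.
        for head, relation, tail in sorted(seed_facts):
            if head.lower() in lowered and tail.lower() in lowered:
                labeled.append((sentence, head, relation, tail))
    return labeled


if __name__ == "__main__":
    sentences = [
        "The salmon is a fish found in the North Atlantic.",
        "Beijing, formerly romanized as Peking, is the capital of China.",
        "Trout live in cold streams.",
    ]
    for sample in label_sentences(sentences, SEED_FACTS):
        print(sample)
```

Samples labeled this way are noisy, since a sentence can mention both arguments without expressing the relation, which is presumably why the abstract pairs the seed knowledge with additional heuristic rules.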
Keywords/Search Tags:Knowledge Graph Construction, Ontology Learning, Linked Open Data, Entity, Self-Supervised Learning