
Knowledge Graph Enrichment And Error Detection Techniques

Posted on: 2020-12-22
Degree: Doctor
Type: Dissertation
Country: China
GTID: 1368330626964426
Subject: Computer Science and Technology
Abstract/Summary:
The significance of knowledge graphs is self-evident: they have become one of the most influential data sources and knowledge representation tools. As data evolves rapidly in the big data era, knowledge graph enrichment and error detection have attracted strong interest from many researchers. The core task of this research can be viewed as understanding knowledge from two different aspects, i.e., external and internal data sources. However, research on knowledge inference faces many difficulties and challenges. To summarize, this dissertation makes the following contributions:

1. Web Table Understanding by Collective Inference: Traditional set similarity matching methods cannot match two sets that have no overlapping elements. Because of both the incompleteness of the knowledge graph and the diversity of content on the World Wide Web, many columns in web tables and types in the knowledge graph cannot be matched with each other, even when they are semantically similar. In addition, a small number of overlapping elements between columns and types can also lead to low-quality matching results. Therefore, this research proposes a collective inference approach (CIA), which can not only infer the semantic types of unknown columns but also greatly improve the top-k quality of column-type detection, especially the top-1 results. At the same time, this work designs an effective semantic column matching model for column pairs, including feature extraction from the columns and an automatic method for generating training data. To scale to large data sets, three reasoning strategies are also proposed to improve overall inference efficiency. Finally, this chapter uses crowdsourcing to add entities to the knowledge graph and to verify the quality of the extracted entities.

2. Web Table Schema Discovery and Knowledge Enrichment by Human-Machine Collaboration: To better understand web tables, this research proposes a human-machine collaboration framework (HuMaC) to discover the semantic patterns of web tables. Because crowdsourcing is costly in both money and time, we design an automatic machine method that generates high-quality top-k pattern candidates and thereby reduces the total number of crowdsourcing questions; it leverages the rank join algorithm to terminate the pattern score calculation early for nonsensical patterns. Moreover, to improve the utility of crowdsourcing, this work designs a problem selection algorithm and introduces an error tolerance strategy through the design of the form of the crowdsourcing questions. Finally, we extract a large amount of new knowledge from web tables, based on their semantic patterns validated by human-machine collaboration, to augment the knowledge graph.

3. Knowledge Base Error Detection with Relation Sensitive Embedding: Considering the inconsistency and incompleteness of knowledge graphs, this chapter proposes a relation sensitive embedding approach (RSEA) to conduct knowledge graph inference for the purposes of knowledge completion and error detection. In this work, we first design two correlation functions to measure the relatedness between two relations. Then, a dynamic clustering algorithm is presented to aggregate highly correlated relations into the same clusters. In this way, the deviation caused by inconsistencies in the knowledge graph can be corrected by leveraging the correlation of relations during the knowledge graph embedding process. At the same time, this method can be easily combined with traditional models to achieve better reasoning and prediction performance.
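The limitation of set similarity matching noted in the first contribution can be illustrated with a minimal sketch. It uses Jaccard similarity as one representative set similarity measure; the abstract does not say which measures the dissertation considers, and the column values and type members below are hypothetical data chosen only for illustration.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity |a & b| / |a | b|; defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical web table column: universities that are missing from the knowledge graph.
column = {"Tsinghua University", "Peking University", "Fudan University"}

# Hypothetical knowledge graph types and their known member entities.
kg_types = {
    "University": {"MIT", "Stanford University", "University of Oxford"},
    "City": {"Beijing", "Shanghai", "Boston"},
}

# Score the column against every type by element overlap alone.
scores = {name: jaccard(column, members) for name, members in kg_types.items()}

for name, score in scores.items():
    print(f"{name}: {score:.2f}")
```

Because the column shares no element with either type, both scores are zero, so overlap alone cannot rank the semantically correct type "University" above "City". This is the kind of case in which evidence beyond direct element overlap is needed.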
Keywords/Search Tags: Knowledge Graph Enrichment, Knowledge Graph Error Detection, Knowledge Graph Embedding, Crowdsourcing, Collective Inference