Data Denoising And Visualization For Genealogy Knowledge Graphs

Posted on: 2021-01-05
Degree: Master
Type: Thesis
Country: China
Candidate: S J Sheng
GTID: 2428330614460379
Subject: Computer application technology
Abstract/Summary:
Knowledge graphs (KGs) have powerful capabilities in semantic processing and open interconnection, and have been widely used in Web retrieval, recommendation systems, and knowledge-based question answering. Automatic construction of knowledge graphs remains a great challenge, and the process may introduce a large amount of noise. In addition, for a noisy knowledge base, how to customize the display of information according to user needs also merits further research. Genealogical data are massive, multi-source, heterogeneous, and autonomous, and contain rich structural and semantic information. Effective methods are needed to build genealogical data into knowledge graphs of kinship relationships, in order to support big knowledge mining and inference services for cross-surname genealogy and to analyze the associations between surnames as well as the origins and changes of surnames. This thesis focuses on data denoising and visualization for big kinship knowledge graphs (big genealogical data). The main research contributions of this thesis are as follows.

(1) This thesis proposes PSKM (Prior-Knowledge and Subgraph-Matching for Knowledge Graph Refinement), a data denoising method based on prior knowledge and exact subgraph matching. The method first uses prior knowledge to construct a knowledge base for the genealogical domain and converts it into noisy pattern subgraphs. It then uses an optimized exact subgraph matching technique to detect noisy data. Experimental results on kinship data sets show that PSKM can effectively improve the denoising accuracy of genealogical knowledge graphs and reduce computing time.

(2) This thesis proposes CEPV (Customized Information Extracting, Processing and Visualization), a tool for quickly extracting and processing data and for displaying customized information from noisy knowledge graphs according to user-specified visualization needs. First, CEPV uses batch data extraction rules to extract user-specified data from massive, complex, heterogeneous, and fragmented graph data and stores them according to the specified rules. Then, fault tolerance mechanisms and attribute judgment rules are added to the data processing stage to ensure the correctness of data processing. Finally, CEPV uses data visualization tools to display the processed data to users. The experiments show that CEPV provides a new and effective tool for visualizing large-scale kinship knowledge graphs.
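The denoising idea in contribution (1), turning prior knowledge into noisy pattern subgraphs and then finding exact matches of those patterns in the graph, can be illustrated with a minimal Python sketch. It is not the thesis's optimized algorithm: the plain backtracking matcher, the `father_of` relation, the two patterns, and all person names are assumptions chosen only for illustration.

```python
from typing import Dict, Iterator, List, Set, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail)


def is_var(term: str) -> bool:
    """Pattern terms starting with '?' are variables; anything else is a constant."""
    return term.startswith("?")


def match_pattern(graph: Set[Triple], pattern: List[Triple]) -> Iterator[Dict[str, str]]:
    """Yield each injective variable binding under which every pattern edge is in graph.

    Naive backtracking, one pattern edge at a time (no indexing or pruning).
    """
    def extend(i: int, binding: Dict[str, str]) -> Iterator[Dict[str, str]]:
        if i == len(pattern):
            yield dict(binding)
            return
        p_head, p_rel, p_tail = pattern[i]
        for head, rel, tail in graph:
            if rel != p_rel:
                continue
            new = dict(binding)
            ok = True
            for p_term, node in ((p_head, head), (p_tail, tail)):
                if is_var(p_term):
                    if p_term in new:
                        if new[p_term] != node:
                            ok = False  # conflicts with an earlier binding
                            break
                    elif node in new.values():
                        ok = False  # distinct variables must map to distinct nodes
                        break
                    else:
                        new[p_term] = node
                elif p_term != node:
                    ok = False  # constant in the pattern does not match
                    break
            if ok:
                yield from extend(i + 1, new)

    return extend(0, {})


def detect_noise(graph: Set[Triple], patterns: Dict[str, List[Triple]]) -> Dict[str, Set[Triple]]:
    """Map each pattern name to the graph triples that take part in a match."""
    hits: Dict[str, Set[Triple]] = {}
    for name, pattern in patterns.items():
        hits[name] = set()
        for b in match_pattern(graph, pattern):
            for h, r, t in pattern:
                hits[name].add((b.get(h, h), r, b.get(t, t)))
    return hits


# Hypothetical prior knowledge about kinship, written as noisy pattern subgraphs.
noise_patterns = {
    # Two people cannot be each other's father.
    "mutual_fathers": [("?x", "father_of", "?y"), ("?y", "father_of", "?x")],
    # A person cannot have two different fathers.
    "two_fathers": [("?x", "father_of", "?z"), ("?y", "father_of", "?z")],
}

# Hypothetical toy kinship graph containing both kinds of noise.
graph = {
    ("Zhang San", "father_of", "Zhang Si"),
    ("Zhang Si", "father_of", "Zhang San"),
    ("Zhang Si", "father_of", "Zhang Wu"),
    ("Li Liu", "father_of", "Zhang Wu"),
    ("Wang Qi", "father_of", "Wang Ba"),
}

hits = detect_noise(graph, noise_patterns)
flagged = set().union(*hits.values())
clean = graph - flagged
```

A pattern match only says that a group of triples is mutually inconsistent, not which triple is wrong, so this sketch flags every edge in a match. Whether flagged triples are deleted, ranked, or passed to manual review is a separate design decision that the sketch leaves open.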
Keywords/Search Tags:Knowledge graph, Data denoising, Data visualization