Research On Distributed Storage And Retrieval Technology Of Large-scale Knowledge Graph

Posted on:2020-02-22

Degree:Master

Type:Thesis

Country:China

Candidate:C Peng

Full Text:PDF

GTID:2428330590983211

Subject:Computer technology

Abstract/Summary:


Distributed storage is one way to cope with rapidly growing data sets. As knowledge graphs expand rapidly in scale, storing them becomes a problem in its own right. Current distributed partitioning algorithms for non-relational data sets can lead to problems such as unbalanced storage load and uneven density of relationship nodes across servers. On the basis of the distributed relational data set and the structural characteristics of knowledge graph data, this thesis partitions the data set using lexical semantic similarity and average node degree, taking into account both load balancing and the redundancy of node relationships that cross servers. Sub-graph retrieval exploits this partitioning: a pruning method based on decrementing node degree decomposes the query sub-graph into multiple sub-trees of height 2, which are then compared with sub-trees retrieved from the stored data. The system design is divided into two parts. In the distributed storage part, the data set is partitioned according to the method proposed in this thesis. In the distributed sub-graph retrieval part, the main steps are as follows: the query sub-graph is first divided, according to the distribution of the data set, into sub-trees of height 2 that contain only a root node and leaf nodes; the root nodes of all the sub-trees are then queried to obtain the trees that include all of their direct relationships; these trees are compared with the sub-trees of the query; and finally the query result is obtained. The system uses simulated data as its experimental data set. The sub-graph retrieval experiments show that, under the partitioning method adopted in this thesis, distributed sub-graph retrieval takes less time than under the commonly used hash-based distributed storage method. In addition, when node relationships are stored redundantly, and given the characteristics of graph queries, query time is lower than in the non-redundant case. The hash-based method incurs more redundancy than the partitioning method adopted here: it requires more redundant data and storage space, and its graph queries take more time.
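
The decomposition step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact pruning rule, so the greedy choice below (always take the node with the highest remaining degree as the next root, then remove its edges so that its neighbours' degrees drop) is an assumption, and the function name `decompose_into_stars` is hypothetical. The sketch only shows how an undirected query graph might be split into height-2 sub-trees that together cover every query edge exactly once.

```python
from collections import defaultdict


def decompose_into_stars(edges):
    """Split an undirected query graph into height-2 sub-trees (root + leaves).

    Assumed greedy rule, not taken from the thesis: repeatedly pick the node
    with the highest remaining degree as a root, attach all of its remaining
    neighbours as leaves, then delete those edges, which decrements the
    degree of each neighbour.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    stars = []
    # Continue while at least one node still has an uncovered edge.
    while any(adj.values()):
        # sorted() makes tie-breaking deterministic (smallest name wins).
        root = max(sorted(adj), key=lambda n: len(adj[n]))
        leaves = sorted(adj[root])
        for leaf in leaves:
            adj[leaf].discard(root)  # degree decrement on the neighbour
        adj[root].clear()
        stars.append((root, leaves))
    return stars


if __name__ == "__main__":
    query_edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("d", "e")]
    for root, leaves in decompose_into_stars(query_edges):
        print(root, "->", leaves)
```

Each resulting `(root, leaves)` pair corresponds to one sub-tree whose root would be looked up in the distributed store; the direct relationships returned for that root can then be compared with the leaves of the query sub-tree.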

Keywords/Search Tags:

Knowledge Graph, Distributed, Sub-graph Matching, Semantic Similarity, Pruning Rule


Related items

1	Research On Semantic Similarity Graph Query Over Knowledge Graphs
2	Semantic Similarity Calculation Based On Small-scale Knowledge Graph
3	Research On Semantic Query Algorithm In Knowledge Graph
4	Semantic Graph Knowledge Repository Designed For KID Cognitive Model Applied In Retail Business
5	Research Of Subgraph Query On Knowledge Graph
6	Research On Knowledge Graph Completion By Combining Structural And Semantic Information
7	Research On Semantic Based Knowledge Graph Cleaning And Optimization Technology
8	System Design And Development Of The Subject Of Biology Q&A Based On Knowledge Graph
9	Distributed Queries On Massive Knowledge Graphs
10	Research On Video Recommendation Based On Knowledge Reasoning Of Knowledge Graph