Memory-efficient Graph Store For RDF Query System

Posted on: 2021-01-18
Degree: Master
Type: Thesis
Country: China
Candidate: N Wang
GTID: 2518306503474114
Subject: Software engineering
Abstract/Summary:
As a framework for representing graph data, the Resource Description Framework (RDF) has been widely used in knowledge bases, social networks, financial risk control, and other scenarios. Users retrieve information from such datasets through RDF query systems. With the arrival of the big data era, real-world graph datasets are growing increasingly large, so improving the storage efficiency of RDF query systems without sacrificing system performance has become an urgent problem of practical significance. However, most previous work has focused on the performance of RDF query systems and paid little attention to storage efficiency. In this thesis, we present a memory-efficient graph store for RDF query systems. To reduce unnecessary memory overhead, we propose five optimization techniques, based on structural and encoding approaches, and apply them to the graph store. First, we propose a segment-based graph store. The graph store is logically divided into many independent key-value stores, and key-value pairs are placed in different logical stores according to their edge types, which reduces the structural redundancy in the header region. Second, we propose a selective key-value separation strategy, which stores entries with a single value directly in the corresponding headers, improving storage efficiency. Third, we propose a fast entry deduplication technique to eliminate duplicate values in the graph store. Data fingerprinting provides a fast initial filter, and, by exploiting the semantics of the segment-based graph store, the scope of deduplication is limited to individual segments, which greatly reduces the amount of computation and allows the deduplication procedure to run fully in parallel. Fourth, our encoding approaches exploit the locality and consecutiveness of values, which significantly improves storage efficiency. Finally, we propose a hierarchical encoding mechanism, which assigns different types of data to different encoding levels according to data access frequency. Users can strike a better balance between storage efficiency and system performance by switching between encoding levels. We implement the memory-efficient graph store on Wukong, a state-of-the-art RDF query system. Evaluation on three datasets shows that the memory-efficient graph store reduces memory cost by 50% on average with a negligible impact on system performance.
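The segment-scoped, fingerprint-filtered deduplication described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the abstract does not specify the fingerprint function or the data layout, so the 8-byte BLAKE2b digest, the dictionary-based segments, and the names `fingerprint`, `dedup_segment`, and `dedup_store` are all assumptions made for illustration. Vertex identifiers are assumed to be non-negative integers that fit in 64 bits.

```python
import hashlib
from collections import defaultdict


def fingerprint(values):
    """Fixed-size fingerprint of a value list (assumed: 8-byte BLAKE2b digest)."""
    h = hashlib.blake2b(digest_size=8)
    for v in values:
        h.update(v.to_bytes(8, "little"))
    return h.digest()


def dedup_segment(entries):
    """Deduplicate the value lists of one segment (hypothetical layout).

    entries maps a key (vertex id) to a tuple of neighbor ids.
    Returns (pool, refs): pool holds each distinct value tuple once,
    and refs maps every key to the index of its tuple in pool.
    """
    by_fp = defaultdict(list)  # fingerprint -> pool indices sharing it
    pool, refs = [], {}
    for key, values in entries.items():
        fp = fingerprint(values)
        for idx in by_fp[fp]:
            # The full comparison runs only when fingerprints match.
            if pool[idx] == values:
                refs[key] = idx
                break
        else:
            pool.append(values)
            by_fp[fp].append(len(pool) - 1)
            refs[key] = len(pool) - 1
    return pool, refs


def dedup_store(segments):
    """Deduplicate each segment (keyed by edge type) independently."""
    return {etype: dedup_segment(entries) for etype, entries in segments.items()}


segments = {
    "follows": {1: (10, 11, 12), 2: (10, 11, 12), 3: (13,)},
    "likes": {1: (10, 11, 12)},
}
result = dedup_store(segments)
```

In this sketch, identical value lists are shared only within a segment: keys 1 and 2 of `follows` point at the same pooled tuple, while `likes` keeps its own copy. Because no state crosses segment boundaries, the per-segment calls could be distributed across workers, which is the property the abstract relies on for parallel deduplication.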
Keywords/Search Tags: RDF, Graph Query System, In-Memory Graph Store, Data Compression