Font Size: a A A

Temporal Graph Storage Model Based On Log-structured Merge-subgraph

Posted on:2022-05-03Degree:MasterType:Thesis
Country:ChinaCandidate:S LiuFull Text:PDF
GTID:2480306572490944Subject:Computer system architecture
Abstract/Summary:PDF Full Text Request
Real-world graphs are not only large in size but also changing constantly in structure.The originally static graphs evolve into temporal graphs with the time dimension.How to both store and query the temporal graph efficiently has been more and more significant.The existing models of temporal graphs are distinguished into three general categories:snapshot-based models,log-based models,and hybrid models.The snapshot-based model does well in efficient queries but not in storage.The log-based model reduces the storage redundancy but introduces extra overhead at query time.The hybrid models based on skewness-aware or relativity-aware combines snapshot-based and log-based models.However,there are still some challenges of hybrid models: The skewness of degrees will evolve over time;The skewed access frequency cannot completely represent the importance of temporal graphs;The relativity among snapshots will lead to more updated cost.Temporal graph storage model based on Log-Structured Merge-Subgraph contains two components: LSM-Subgraph data structure and the update query engine.LSM-Subgraph splits all logs into multiple data shards,and each shard has a log array and an adjacency array based on Packed-Memory Array(PMA).The adjacency array based on PMA handles the reconstruction problem by reserving some space for updating.The function of the update query engine is to accomplish some operations during the update and query.The engine uses a select-timepoint method based on discrepancy-aware to divide shards during the update,and utilizes a merge-log method to reduce the size of logs during the query.The experimental results show that the overall performance of LSM-Subgraph is better than both Chronos and Graph Pool over real-world datasets.Specifically,it improves the query by from 39% to 94%,and reduces the storage overhead by from 9% to 81%.In addition,experimental results of optimized schemes show that the updated cost of the adjacency array based on PMA is much less than both CSR and adjacency list;The overall performance of the select-timepoint method based on discrepancy-aware is better than the method based on skewness-aware.To be specific,it reduces query time by from 9.8% to25%,and decreases the storage cost by from 12.8% to 48.7%;The overall performance of the method based on discrepancy-aware is better than methods based on both randomness and periodicity.Specifically,it improves the query by from 2% to 80% and saves the storage space by from 24% to 63%.
Keywords/Search Tags:Temporal graph, Graph storage, Graph query, Packed-Memory Array, Discrepancy-aware
PDF Full Text Request
Related items