Graphs play an important role worldwide, and their dynamic and temporal nature is attracting growing attention from both academia and industry. Integrating large-scale dynamic graph data with real-time processing has become an emerging technical challenge. Existing solutions have not fully solved the performance and related problems of real-time processing on dynamic graphs, and considerable room remains for improving real-time streaming graph processing, online graph transaction processing, and general scalability solutions. To address these problems, this study focuses on real-time dynamic graph processing. Its main contributions are as follows:

1. For streaming processing on dynamic graphs, this study designs and implements RisGraph, a real-time streaming system. RisGraph is built around two core design ideas, localized data access and inter-update parallelism, and applies a series of techniques based on them to process a class of monotonic graph algorithms in real time in a generic way. Experiments show that RisGraph improves throughput by 2 to 4 orders of magnitude over existing streaming graph processing systems in real-time processing scenarios, and still achieves significant speedups in batch-update scenarios. On dynamic graphs with hundreds of millions of vertices and billions of edges, RisGraph processes millions of updates per second with per-update analysis, without batching updates, and ensures that 99.9% of updates are processed within 20 milliseconds.

2. For online transaction processing on dynamic graphs, this study proposes two types of lightweight materialized views, virtual properties and path folding. These views have a small space footprint and low maintenance cost, can be maintained with constant time overhead under some constraints, and can be updated in real time during read and write transactions in graph databases, thus providing consistent and up-to-date subquery results. Experiments show that these materialized views deliver significant performance improvements at low maintenance cost for a range of real-time graph queries in LDBC SNB Interactive workloads, with a maximum speedup of over 300×, and improve the overall throughput of online transaction processing on dynamic graphs by 60 to 72 times.

3. To address the scalability problem of existing cache systems, this study designs and implements TriCache, a general-purpose, high-performance, user-transparent cache system that supports a wide range of graph workloads and big-data processing workloads. TriCache extends existing in-memory programs to out-of-core processing by providing a virtual memory interface on top of a userspace block cache, and proposes a multi-level Software Address Translation Cache and Hybrid Lock-free Delegation to optimize both in-memory and out-of-core performance and exploit the high performance of modern storage devices. Experimental results show that TriCache helps in-memory programs efficiently use high-performance storage devices to process larger datasets without manual code rewriting. Compared to existing solutions, TriCache outperforms operating system page caches by several orders of magnitude and matches or exceeds the performance of specialized out-of-core systems.
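
To make the first contribution more concrete, the following is a minimal sketch of how a monotonic graph algorithm can be maintained per update with localized data access. It is not RisGraph's implementation. It assumes single-source shortest paths with non-negative weights as the monotonic algorithm, handles edge insertions only, and uses hypothetical names (`make_state`, `insert_edge`) chosen for illustration.

```python
import heapq
from collections import defaultdict

INF = float("inf")


def make_state(source):
    """Create an empty graph and a distance table rooted at `source` (hypothetical helper)."""
    graph = defaultdict(list)          # vertex -> list of (neighbor, weight)
    dist = defaultdict(lambda: INF)    # unreachable vertices default to infinity
    dist[source] = 0
    return graph, dist


def insert_edge(graph, dist, u, v, w):
    """Insert edge u -> v with non-negative weight w and repair `dist` incrementally.

    Returns the number of vertices whose distance changed. Only vertices
    reachable from v through improved paths are visited, so an update that
    improves nothing finishes after a single comparison.
    """
    graph[u].append((v, w))
    if dist[u] + w >= dist[v]:
        return 0                       # the new edge does not improve any result

    dist[v] = dist[u] + w
    changed = {v}
    heap = [(dist[v], v)]
    while heap:
        d, x = heapq.heappop(heap)
        if d > dist[x]:
            continue                   # stale heap entry, a shorter path was found later
        for y, wy in graph[x]:
            nd = d + wy
            if nd < dist[y]:           # distances only decrease under insertions
                dist[y] = nd
                changed.add(y)
                heapq.heappush(heap, (nd, y))
    return len(changed)


graph, dist = make_state("s")
insert_edge(graph, dist, "a", "b", 1)  # "a" is not reachable yet, nothing changes
insert_edge(graph, dist, "s", "a", 5)  # now "a" and "b" both become reachable
insert_edge(graph, dist, "s", "b", 2)  # a shorter path to "b" only
```

In this sketch, an insertion that cannot improve any distance returns immediately, and other insertions touch only the affected region of the graph. Edge deletions are not covered here; they can increase distances and would need additional bookkeeping.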