
Research On I/O Management Method Based On Awareness Of Partition Association For Random Walk

Posted on: 2022-01-10
Degree: Master
Type: Thesis
Country: China
Candidate: S Chen
GTID: 2480306572490894
Subject: Computer system architecture
Abstract/Summary:
Random walk is a fundamental technique for analyzing large graphs. It forms the foundation of many important graph measurement, ranking, and embedding algorithms, and it is widely used in graph data analysis and machine learning. However, current general-purpose graph processing systems do not fully account for the characteristics of random walk and adopt an iteration-based I/O model, which limits the efficiency of random walk applications. Current graph processing systems specialized for random walk adopt a state-aware I/O model, but they do not consider the data active state or the partition association state during the walk, which leads to a large amount of external storage I/O and low I/O utilization.

To improve system I/O utilization and reduce the number of I/O operations, the influence of the data active state and the partition association state on system I/O performance during random walk is analyzed. (1) Considering the active state of data, a mixed-granularity graph partitioning method based on vertex activity is proposed. Coarse-grained partitions are used for active data and fine-grained partitions for inactive data, which reduces the massive I/O of active data and the invalid loading of inactive data, and thereby improves I/O utilization. (2) Considering the partition association state, a graph partition scheduling strategy based on awareness of partition association is proposed. The partition most closely associated with the partitions already in memory is loaded next, so that the data of the other in-memory partitions is fully used. This further improves I/O utilization and speeds up the execution of random walk applications.

MPAWalker, a graph processing system for random walk based on awareness of mixed-grain partition association, is designed and developed. The experimental results show that MPAWalker outperforms GraphWalker across different graph datasets and different random walk algorithms: the execution time of the system is reduced by 3.01% to 69.98%. The number of I/O operations and the I/O utilization of MPAWalker and GraphWalker are also compared. The number of I/O operations decreases by 41.02% on average, and I/O utilization increases by up to 38.34%, which effectively improves system I/O performance.
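
The partition scheduling idea in point (2) can be illustrated with a minimal Python sketch. The abstract does not define how association is measured, so this sketch assumes it is the number of edges crossing between two partitions. The function names `partition_association` and `next_partition`, the toy graph, and the partition labels are all hypothetical and are not taken from MPAWalker.

```python
from collections import defaultdict


def partition_association(edges, part_of):
    """Count the edges that cross between each pair of distinct partitions.

    Assumption: cross-partition edge count stands in for "association";
    the thesis may define it differently.
    """
    assoc = defaultdict(int)
    for u, v in edges:
        pu, pv = part_of[u], part_of[v]
        if pu != pv:
            assoc[frozenset((pu, pv))] += 1
    return assoc


def next_partition(in_memory, candidates, assoc):
    """Pick the on-disk candidate most associated with the in-memory partitions."""
    def score(candidate):
        return sum(assoc.get(frozenset((candidate, m)), 0) for m in in_memory)

    # Sorting the candidates makes tie-breaking deterministic.
    return max(sorted(candidates), key=score)


# Toy graph: vertices 0..5 assigned to hypothetical partitions A..D.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (1, 4), (0, 4), (2, 5)]
part_of = {0: "A", 1: "A", 2: "B", 3: "B", 4: "C", 5: "D"}

assoc = partition_association(edges, part_of)
choice = next_partition({"A"}, {"B", "C", "D"}, assoc)
print(choice)
```

With partition A in memory, the sketch scores B, C, and D by how many edges connect them to A and loads the highest-scoring one, so walks leaving A are more likely to land in data that is already resident.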
Keywords/Search Tags: Random Walk, Graph Processing System, I/O Management, State-aware, Mixed-grain