Font Size: a A A

Research On The Prediction Of The Best Graph Partition Strategy By Using Machine Learning Techniques

Posted on:2020-07-29Degree:MasterType:Thesis
Country:ChinaCandidate:J Y ShenFull Text:PDF
GTID:2428330596468161Subject:Software engineering
Abstract/Summary:PDF Full Text Request
The traditional method of partition the graph is to cut the edges,in order to minimize the number of cutting edges.In order to run graph algorithms on distributed systems,the better approach is to partition the graph based on the edge partition,which is to cut the vertices to keep the balance of edges in each partition.However,since the graph edge partitioning strategies has already been proposed in recent years,few people have evaluated the different partitioning strategies that have been proposed.In order to make up for this defect,this paper designs a number of metrics based on edge partition and use them to evaluate the partitioning quality after applying the partitioning strategies to specific large-scale graph.The main contributions of this paper are as follows:Multiple metrics based on edge(vertex-cut)partitioning are proposed innovately.The vertex-cut partitioning strategy pays more attention to the balance and the communication cost of the graph partition.The metrics we proposed cover these two aspects and can effectively describe the partitioned graph.We proved features of partitioned graph have strong relationship with the runtime of the graph algorithm.After the partition of the graphs with different structures,the metrics values are different,and the graph algorithms(such as PageRank)have been proved to depend on the metrics.The experiment show that the communication metrics are most useful predictors.Machine learning algorithms can be used to predict the best partitioning strategy for a given graph.We have designed two different schemes,using statistical analysis methods and machine learning methods to build models,which can predict the best partition model of a given graph with high accuracy,and shorten the total time notably.
Keywords/Search Tags:graph metrics, distributed system, best partitioner, machine learning
PDF Full Text Request
Related items