Font Size: a A A

Research And Application Of Distributed Large-scale Graph Partitioning Algorithm

Posted on:2024-01-23Degree:MasterType:Thesis
Country:ChinaCandidate:H J RenFull Text:PDF
GTID:2530306944460344Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
Since entering the era of big data,graph has become one of the concepts of great interest,and many scientific problems can be modeled as graph-related problems.Due to the growing size of graph data,distributed graph computing systems are widely used for graph computing tasks.Graph partitioning is an essential process in distributed graph computing systems,which aims to slice the original graph into subgraphs and distribute them to compute nodes in a distributed cluster.The network communication overhead between nodes depends on the number of vertices or edges sliced between partitions,while the time overhead of the algorithm depends on the highest loaded node in the cluster.It can be seen that the different partitioning methods will seriously affect the performance of graph computation.Since traditional graph partitioning algorithms run slowly and are no longer applicable to large-scale graphs,distributed graph partitioning algorithms have become a hot research topic in recent years.Based on this background,the distributed large-scale graph partitioning algorithm is studied in this topic,and the research is divided into the following three main points.(1)A distributed large-scale graph partitioning algorithm based on label propagation,OLPGP,is proposed.First,the label propagation algorithm is optimized to fit the problem context of graph partitioning,and then an effective initial partitioning strategy is proposed to make relevant improvements to address the problems of label oscillation and easily falling into local optimality in the label propagation algorithm.The experimental results show that the OLPGP algorithm outperforms the current mainstream distributed graph partitioning algorithms in terms of partitioning quality and algorithmic efficiency.(2)The OLPGP algorithm is optimized for the dynamic nature of graph structure and computation nodes in the real world to make it applicable to dynamic graphs.Firstly,the concept of incremental vertices is defined,and corresponding rules are designed to identify incremental vertices for different scenarios where the network structure and computational nodes change,respectively.After the repartitioning of incremental vertices,the partitions that exceed the load are fine-tuned based on the migration gain of vertices to optimize the partitioning quality.The experimental results show that the solution proposed in this paper can effectively improve the division efficiency of dynamic graphs compared with OLPGP and other distributed graph division algorithms.(3)A large-scale network analysis system is designed and implemented.The system is based on a distributed graph computing framework,incorporating the graph partitioning algorithm proposed in this project and other network analysis algorithms,and developed using mainstream front-and back-end frameworks to provide user-friendly interaction.Finally,the effectiveness and stability of the system are verified by perfect functional tests and stress tests.
Keywords/Search Tags:graph partitioning, vertex partitioning, label propagation, distributed graph computing, graphx
PDF Full Text Request
Related items