Font Size: a A A

Research Of The Method For Solving The Maximum Clique Problem Based On The Graph Partitioning On Big Data Platform

Posted on:2016-01-03Degree:MasterType:Thesis
Country:ChinaCandidate:J YinFull Text:PDF
GTID:2310330536486774Subject:Engineering
Abstract/Summary:PDF Full Text Request
Maximum clique problem(MCP)is the classical combinatorial optimization problem in graph theory,belonging to NP-hard problem,which is about searching the largest complete sub-graph from an undirected graph.The maximum clique problem has widely ranges of applications,such as pattern recognition,artificial intelligence,sub-graph isomorphism and so on.Searching the maximum clique in social network,it helps to mine user behavioral characteristics and community structure in social network.Therefore,it has great influence on exploring the maximum clique problem theoretically and practically.With the further study and the exponential growth of graph scale,the deterministic search algorithm has time-consuming and low efficiency.The heuristic algorithm becomes main algorithm and makes some achievements in solving maximum clique problem,however,the poor versatility and low efficiency are still existed.When figure nodes have big growth and analyses become complicated,it has put forward higher requirements in terms of speed and accuracy studies,which traditional heuristic algorithm has been unable to meet the research needs of large-scale graph data for solving the maximum clique problem.Therefore,this paper proposes a parallel graph partitioning framework based on Hadoop PPMC for solving the maximum clique in large scale graph.This paper has the following three parts in the main research work and innovation:1)It has improved the breakout local search algorithm BLS and increased the penalty BLS sub-algorithm based on the penalty value,and proposed a new phased PBLS algorithm for solving the maximum clique problem.The PBLS algorithm has effectively improved the versatility and efficiency for solving.2)This paper proposes a parallel graph partitioning algorithm PGP to partition the search space in large-scale graph.While maintaining the original group structure remains unchanged,it greatly reduces sub-graph scale by sorting the degree of node,and ensures sub-graph scale within the appropriate range for balancing the workload when solving the maximum clique problems.3)In this paper,a parallel graph partitioning framework based on Hadoop PPMC is proposed by combining the parallel graph partitioning algorithm PGP and the phased PBLS algorithm.Firstly,MapReduce programming model is used in dealing with the graph text.Secondly,parallel graph partitioning algorithm PGP is used to divide in the search space in large-scale graph.Then the phased PBLS algorithm is selected to solve the maximum clique of each sub-graph.Finally,the maximum clique of the whole graph is ultimately obtained.In this paper,the parallel graph partitioning framework based on Hadoop PPMC has deployed on Hadoop distributed cloud computing platform.It has tested on Stanford large scale datasets.The experimental results show that the PPMC framework can effectively solve the maximum clique problem of large-scale graphs,and can further improve the versatility and scalability of algorithm while reducing the algorithm's search time and space complexity.
Keywords/Search Tags:The Maximum Clique, PBLS, PGP, Hadoop, PPMC
PDF Full Text Request
Related items