Font Size: a A A

The Research Of Graph Algorithm Based On Cloud Computing

Posted on:2012-05-14Degree:MasterType:Thesis
Country:ChinaCandidate:Y H DuFull Text:PDF
GTID:2178330335960178Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
The development of network techniques brings people in a great deal of information, and it also greatly increases the difficulty to find useful knowledge from mass data. The efforts to solve this problem promote the emergence and rapid development of data mining techniques. At present, the data mining technologies and tools have been actually implemented in various scenarios, such as physics, biology, politics, economics, world wild web, engineering and social life. Graph mining is an important part of data mining, and it will be more intuitive and more convenient to discovery the information hidden in mass data. The continued exponential growth in both the volume and the complexity of information is giving birth to a new challenge to the specific requirements of analysts, researchers and intelligence providers, With respect to this challenge, a new class techniques and computing platforms, such as Hadoop platform, which mainly focus on scalability and parallelism, has been emerging.This article aims to do some researches on graph algorithms based on cloud computing, and design and implement three basic graph algorithms. These algorithms include connected component algorithm for undirected graph, strong connected component algorithm for directed graph and Betweenness Centrality algorithm for undirected graph.First of all, design appropriate data structure according to the characteristics ofach algorithm. Algorithm performance with reasonable data structure algorithm can be greatly improved.After researching on cloud computing and graph algorithm, design and implement connected component algorithm for undirected graph, strong connected component algorithm for directed graph and Betweenness Centrality algorithm for undirected graph. These three algorithms are all based on cloud computing platform. The connected component algorithm for undirected graph is based on the label propagation algorithm (LPA), and it also highlights the definition of small connected component and solving methods. The strong connected component algorithm for directed graph is using label and color as tag of node. And the Betweenness Centrality algorithm for undirected graph is implemented based on the idea proposed by Ulrik Brands.Then, the author verified the correctness of the above three algorithms by doing lots of experiments, and compares the performance with the traditional algorithm. Experimental results show that the proposed algorithms are more effective to response to large-scale data.Finally, the author does some researches on Twister, and makes a comparison of Twister and MapReduce.
Keywords/Search Tags:graph mining, cloud computing, connected component, betweenness centrality
PDF Full Text Request
Related items