Parallel Algorithms Research Based On Hadoop And Hama

Posted on: 2014-02-13
Degree: Master
Type: Thesis
Country: China
Candidate: D W Cai
Full Text: PDF
GTID: 2248330395976042
Subject: Physical Electronics
Abstract/Summary:
With the development of the Internet, the volume of online data is growing explosively, and storing and analyzing that data has become a new challenge. Traditionally, a single high-performance server or a cluster of such servers has been used to handle the problem, but this approach scales poorly and is very expensive. Many researchers have therefore increasingly turned to distributed solutions and cloud computing, which is well suited to mining big data.

This thesis focuses on how to implement a set of algorithms on Hadoop and Hama. First, we introduce the Hadoop and Hama cloud computing platforms and describe the MapReduce and BSP programming models. We then set up a Hadoop and Hama environment in which to implement and test the algorithms.

The main contribution of this thesis is a detailed account of how these algorithms are implemented on the two platforms. They include several data mining algorithms, graph algorithms, and matrix multiplication on Hadoop, as well as graph algorithms on Hama.

Some algorithms written in the MapReduce programming model are iterative, such as K-means and single-source shortest path. This thesis summarizes the general steps for implementing such iterative algorithms and analyzes their performance under different data storage formats. Finally, we implement three algorithms on Hama.

Hadoop and Hama each have their own advantages. This thesis compares the efficiency of Hadoop and Hama implementations of the connected components problem for undirected graphs, and draws its final conclusions from the experimental results and theoretical analysis.
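
As a rough illustration of the iterative pattern mentioned in the abstract, the sketch below runs connected components on an undirected graph as repeated map and reduce rounds until the labels stop changing. It is a single-process Python sketch, not the thesis's Hadoop or Hama code: the function names (`map_phase`, `reduce_phase`, `connected_components`), the minimum-label propagation scheme, and the `max_rounds` cap are all assumptions made for illustration.

```python
from collections import defaultdict


def map_phase(labels, edges):
    """Emit (vertex, candidate_label) pairs for one round.

    Each vertex re-emits its own label and offers it to its neighbours.
    """
    for vertex, label in labels.items():
        yield vertex, label
    for u, v in edges:
        yield v, labels[u]
        yield u, labels[v]


def reduce_phase(pairs):
    """Group candidates by vertex and keep the smallest label."""
    grouped = defaultdict(list)
    for vertex, label in pairs:
        grouped[vertex].append(label)
    return {vertex: min(candidates) for vertex, candidates in grouped.items()}


def connected_components(vertices, edges, max_rounds=100):
    """Driver loop: repeat map/reduce rounds until no label changes.

    max_rounds is a hypothetical safety cap, not part of the thesis.
    """
    labels = {v: v for v in vertices}  # every vertex starts in its own component
    for _ in range(max_rounds):
        new_labels = reduce_phase(map_phase(labels, edges))
        if new_labels == labels:  # convergence check between rounds
            break
        labels = new_labels
    return labels


if __name__ == "__main__":
    result = connected_components([1, 2, 3, 4, 5], [(1, 2), (2, 3), (4, 5)])
    print(result)
```

In a real MapReduce job each round would typically be a separate job that reads the previous round's output, while a BSP framework keeps vertex state in memory across supersteps. That difference is one plausible reason the two platforms can perform differently on this problem.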
Keywords/Search Tags: cloud computing, Hadoop, Hama, MapReduce, BSP, data mining