Topology Design And Hadoop Research In Cloud Computing

Posted on: 2010-01-11; Degree: Master; Type: Thesis
Country: China; Candidate: Z L Deng; Full Text: PDF
GTID: 2178360302459857; Subject: Signal and Information Processing
Abstract/Summary:
As a new concept proposed at the end of 2007, cloud computing is a major shift in the IT field. It means that computing can be consumed as a commodity, like coal gas or electricity; the only difference is that cloud computing is delivered over computer networks. To date, Google, Microsoft, IBM, Amazon and other well-known corporations have proposed their own cloud computing applications and treat cloud computing as one of their most important future strategies.

The back end of a cloud computing system contains thousands of servers. How to organize these servers is an important problem in guaranteeing the high performance of the system. A reasonable network topology not only improves network performance but also enhances the stability of the system, keeping it working when some nodes or links fail. The network topology of a cloud computing system differs from that of the typical Internet, so it needs to be reconsidered.

Hadoop is an open-source framework for very large data processing applications that runs on a cluster of commodity PCs. It builds parallel programs using Google's MapReduce method and has already been adopted by many well-known IT companies. Hadoop can be regarded as the most popular open-source cloud computing software, but as a young project it still has many points to improve.

Based on the analysis above, this thesis studies the following two issues in cloud computing:

1. We surveyed the networking requirements of cloud computing and analyzed the architecture of its network topology, concluding that the topology should be composed of two parts: a central switch trunk and a number of tree-shaped branch subnets. We then proposed the Reversed Greedy Algorithm (RGA) for designing the topology of the central switch trunk, based on graph theory, optimization theory and the classic greedy algorithm. Finally, we ran experiments and comparisons to verify the usability and advantages of the algorithm.

2. We surveyed a large body of related material and set up a Hadoop platform with 8 PCs in our lab. We then proposed a Priority Based Weighted Round Robin (PBWRR) algorithm for MapReduce task scheduling, implemented it, and applied it on the Hadoop platform. Finally, we compared the simulation results of PBWRR with those of the original FIFO algorithm and discussed their strengths, weaknesses and suitable scenarios.

At the end of the thesis, we summarize our work and discuss plans for future work.
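The abstract does not spell out how PBWRR works, so the following is only a generic sketch of the idea the name suggests: a weighted round robin in which each job's weight is derived from its priority, contrasted with FIFO dispatch. The function names, the `(name, weight, n_tasks)` job tuple and the rule "dispatch up to `weight` tasks per visit" are assumptions made for illustration, not the thesis's algorithm or Hadoop's scheduler API.

```python
from collections import deque


def fifo_order(jobs):
    """Dispatch every task of a job before moving on to the next job."""
    return [(name, t) for name, _weight, n_tasks in jobs for t in range(n_tasks)]


def weighted_round_robin_order(jobs):
    """Visit jobs cyclically; on each visit dispatch up to `weight` tasks.

    `jobs` is a list of (name, weight, n_tasks) tuples. The weight stands in
    for a priority-derived share (an assumption of this sketch).
    """
    for name, weight, _n_tasks in jobs:
        if weight < 1:
            raise ValueError(f"job {name!r} needs a weight of at least 1")

    # Each queue entry tracks how many tasks of the job were already dispatched.
    queue = deque((name, weight, 0, n_tasks) for name, weight, n_tasks in jobs)
    order = []
    while queue:
        name, weight, done, total = queue.popleft()
        batch = min(weight, total - done)
        order.extend((name, done + i) for i in range(batch))
        done += batch
        if done < total:
            # Job still has pending tasks: send it to the back of the cycle.
            queue.append((name, weight, done, total))
    return order


# A low-priority job submitted first, then a high-priority job.
jobs = [("low", 1, 4), ("high", 3, 4)]
fifo = fifo_order(jobs)
wrr = weighted_round_robin_order(jobs)
print("FIFO:", fifo)
print("WRR: ", wrr)
```

Under FIFO the later, high-priority job waits until every task of the earlier job has been dispatched; under the weighted scheme it receives a larger share of each round and its last task is dispatched sooner, while the low-priority job is still served in every round.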
Keywords/Search Tags: cloud computing, topology design, greedy algorithm, Hadoop, MapReduce, task schedule