
Performance Analysis and Optimization of the MapReduce Model in the Hadoop Implementation

Posted on: 2011-05-27
Degree: Master
Type: Thesis
Country: China
Candidate: M M Zhang
GTID: 2208360308466807
Subject: Computer system architecture
Abstract/Summary:
The emergence of Cloud Computing has had a great impact on Internet-based services. The concept of Cloud Computing means that computing can be treated as a commodity. It provides a simple and transparent programming model with which Internet users and developers can develop and implement Internet-based services. With 2009 regarded as the first year of Cloud Computing in the IT industry, business giants such as Amazon, IBM and Google have made Cloud Computing their most important strategic direction for the near future.

Because Cloud Computing is a new technology that runs on large clusters to handle massive data storage and computation, it is very important to find a way to organize these large numbers of servers so that the Cloud Computing system delivers high performance. MapReduce, with its simple model for parallel data processing and task scheduling, is a suitable solution for these requirements. With MapReduce, simple business logic is separated from the complex implementation details. Yet how to schedule tasks efficiently and fairly in the MapReduce model has become one of the most popular topics in the MapReduce community.

Hadoop is an open source framework that implements Google's MapReduce model and is the most popular open source software for Cloud Computing. But it is still a young project, and there is much to be improved.

In this thesis we conduct in-depth research on MapReduce, the core technology of Cloud Computing. We test MapReduce's performance with typical applications in terms of fairness, scalability, speedup and response time. Based on the experimental results, we identify the inadequacies of scheduling on the Hadoop platform and propose a new scheduling algorithm. With this algorithm, Hadoop performs better in heterogeneous environments, which is where Hadoop usually runs in practice. Finally, we carry out experiments and comparisons to verify the advantages and usability of our scheduling algorithm. At the end of the thesis we summarize our work and discuss possible future directions of the MapReduce model.
Keywords/Search Tags:Cloud Computing, MapReduce, Hadoop, Scheduling algorithm