Research And Design For Improved MapReduce Framework

Posted on: 2012-09-25
Degree: Master
Type: Thesis
Country: China
Candidate: T Chang
Full Text: PDF
GTID: 2178330335960741
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of cloud computing, Hadoop, an open-source cloud computing platform, has been adopted by both domestic and international companies. MapReduce, a sub-project of Hadoop and a distributed parallel processing framework, accordingly underlies a growing number of applications. As these applications have become broader and more diverse, they have exposed many aspects of the framework that need improvement. This thesis studies the MapReduce framework as follows:
(1) It reviews the concepts of parallel computing, distributed computing, and cloud computing and their relationship to MapReduce, and concludes that the MapReduce framework fits all three computing models. It then introduces the processing workflow of the traditional MapReduce framework and of Hadoop, laying the foundation for the improved scheme presented later.
(2) By analyzing specific applications and examining the framework's processes and related source code in depth, it identifies problems that may reduce the efficiency of applications, such as data skew, reduce task imbalance, and scheduling problems.
(3) It proposes improvements to the framework: a split function for intermediate results, the launching of new reduce tasks, and a corresponding scheduling mechanism. The design and its code implementation are provided.
(4) Tests at several different levels show that the improved framework does improve the efficiency of operations...
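To make the idea in item (3) concrete, the following is a minimal sketch of how an oversized intermediate partition might be detected and split across additional reduce tasks. The abstract does not describe the thesis's actual splitting scheme, so everything here is an assumption for illustration: the function names `hash_partition` and `split_skewed`, the `skew_factor` threshold, and the greedy assignment are hypothetical and are not taken from the thesis or from Hadoop's API.

```python
import zlib


def hash_partition(key_counts, num_reducers):
    """Assign each key to a reducer by a stable hash modulo the reducer count.

    key_counts maps an intermediate key to its number of records.
    (Hypothetical helper; it only mimics hash-style partitioning.)
    """
    partitions = [dict() for _ in range(num_reducers)]
    for key, count in key_counts.items():
        idx = zlib.crc32(key.encode("utf-8")) % num_reducers
        partitions[idx][key] = count
    return partitions


def split_skewed(partitions, skew_factor=1.5):
    """Split any partition whose load exceeds skew_factor * mean load.

    Keys are moved whole, so every key is still handled by exactly one
    reduce task. A partition holding a single huge key is left alone.
    The threshold and the greedy strategy are assumptions of this sketch.
    """
    loads = [sum(p.values()) for p in partitions]
    limit = skew_factor * sum(loads) / len(partitions)
    result = []
    for part in partitions:
        if sum(part.values()) <= limit or len(part) < 2:
            result.append(dict(part))
            continue
        # Greedy fill: take the largest keys first and open a new reduce
        # task whenever adding a key would push the current one over the limit.
        current, current_load = {}, 0
        for key, count in sorted(part.items(), key=lambda kv: -kv[1]):
            if current and current_load + count > limit:
                result.append(current)
                current, current_load = {}, 0
            current[key] = count
            current_load += count
        result.append(current)
    return result


# Example with one skewed partition (140 records) and two light ones.
partitions = [{"a": 90, "b": 30, "c": 20}, {"d": 10}, {"e": 10}]
balanced = split_skewed(partitions)
print([sum(p.values()) for p in balanced])  # prints [90, 50, 10, 10]
```

In this sketch the skewed partition is divided into two reduce tasks while the light partitions are untouched. A single very frequent key (such as `"a"` above) cannot be split this way; handling it would require a reduce function whose partial results can be merged, which is outside what the abstract states.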
Keywords/Search Tags: parallel computing, cloud computing, MapReduce