Design And Implementation Of The Failure Recovery Mechanism In MapReduce

Posted on: 2013-07-26
Degree: Master
Type: Thesis
Country: China
Candidate: R Guo
Full Text: PDF
GTID: 2248330392457702
Subject: Communication and Information System
Abstract/Summary:
With the continuous growth of large-scale data processing, cluster sizes have increased sharply, and the requirements for system reliability have become more demanding. In clusters of this size, failures of all kinds are inevitable. During MapReduce processing, task failures and node failures are especially common, yet the failure recovery mechanism of MapReduce has some defects. It is therefore of great significance to study and optimize the failure recovery mechanism of MapReduce.

This paper first describes the concept, characteristics, and development status of cloud computing, and outlines the characteristics of a Hadoop cluster. On this basis, it explains the significance of studying the recovery mechanisms of large-scale clusters and reviews the state of research worldwide.

Next, the paper describes MapReduce in detail, specifically its basic idea, working principle, and task scheduling process. On this basis, it describes the main failure types in MapReduce and analyzes in depth the failure mechanism of each type.

Then, building on the original MapReduce, the paper makes several optimizations. By adding a function module that automatically restarts a failed node, a node can recover quickly from a node failure. By optimizing the failure mechanism, a task can recover quickly from a task failure without being reprocessed from the beginning. With these optimizations, clusters can recover quickly from failures.

Finally, the paper tests and evaluates the optimized system in terms of both function and performance. The results show that the optimized system achieves its intended functional goals and performs better than the original MapReduce.
Keywords/Search Tags: cloud computing, MapReduce, failure recovery, task scheduling, performance evaluation