Research And Implementation Of MapReduce Fault Tolerance Method Based On Intermediate Result Checkpoint

Posted on:2018-06-14

Degree:Master

Type:Thesis

Country:China

Candidate:K Ding

Full Text:PDF

GTID:2348330515455332

Subject:Software engineering

Abstract/Summary:

With the rapid development of the Internet, the amount of data generated by networks has grown explosively. Traditional storage and computing models can no longer satisfy the storage and computing requirements of applications. Cloud computing, by virtue of its distributed processing technology, has become one of the most popular data processing technologies. Within it, MapReduce, an efficient parallel computing framework, has been widely applied to large-scale data processing.

At present, there are two common failure types in the MapReduce model: task failure and node failure. MapReduce handles task failure by re-execution; that is, a failed task is re-allocated and executed again. This wastes a large amount of computing resources, lengthens the average task completion time, and reduces computational efficiency. Node failure is divided into Master node failure and Worker node failure. For Master node failure, MapReduce adopts a duplex fault tolerance method. A Worker node failure not only causes the loss of the intermediate results that Map tasks generated and stored on that node, but also leads to the re-assignment and re-execution of tasks. Currently, there is no established approach for handling this type of failure in the MapReduce model.

This thesis completes the following three pieces of work.

(1) Analysis of the shortcomings of the MapReduce fault tolerance mechanism in Hadoop: by analyzing the Hadoop source code, we study how task failures and node failures are handled and where this handling falls short. This provides a basis for improving the fault tolerance methods of MapReduce.

(2) Design and implementation of a checkpointing fault tolerance mechanism: for task failures and node failures during MapReduce computation, this thesis designs and implements a checkpointing fault tolerance mechanism. Status information and intermediate results of task execution are saved in checkpoint files, and when a task is re-assigned, the saved information is used to recover the task quickly. For task failures, we design and implement a Local Checkpointing fault tolerance mechanism; for node failures, we design and implement Remote and Query Metadata checkpointing fault tolerance mechanisms.

(3) Testing of the checkpointing fault tolerance mechanism: after designing and implementing the mechanism, we build a Hadoop cluster, write an application, and inject faults into it to verify whether the mechanism provides fault tolerance effectively, and to evaluate its efficiency.
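The local checkpointing idea described in the abstract (save progress and intermediate results, then resume a re-assigned task from the saved state) can be illustrated with a minimal Python sketch. This is not the thesis's implementation and does not use any Hadoop API: the names `run_map_task`, `save_checkpoint`, `load_checkpoint`, `InjectedFault`, the JSON checkpoint layout, and the `interval` and `fail_at` parameters are all assumptions made for illustration. It covers only the task-failure case; surviving a Worker node failure would presumably require the checkpoint to live on storage outside the failed node.

```python
import json
import os
import tempfile


class InjectedFault(Exception):
    """Simulated task failure (hypothetical, for illustration only)."""


def load_checkpoint(path):
    """Return (next_record_index, intermediate_pairs); fresh state if no file exists."""
    if not os.path.exists(path):
        return 0, []
    with open(path, "r", encoding="utf-8") as f:
        state = json.load(f)
    # JSON stores tuples as lists, so convert them back.
    return state["next_index"], [tuple(p) for p in state["pairs"]]


def save_checkpoint(path, next_index, pairs):
    """Write to a temporary file, then rename, so a crash cannot leave a partial file."""
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as f:
        json.dump({"next_index": next_index, "pairs": pairs}, f)
    os.replace(tmp, path)


def run_map_task(records, checkpoint_path, interval=2, fail_at=None):
    """Word-count style map task that resumes from the last checkpoint.

    Returns (intermediate_pairs, number_of_records_processed_in_this_attempt).
    """
    start, pairs = load_checkpoint(checkpoint_path)
    processed = 0
    for i in range(start, len(records)):
        if fail_at is not None and i == fail_at:
            raise InjectedFault(f"injected failure before record {i}")
        pairs.extend((word, 1) for word in records[i].split())
        processed += 1
        if (i + 1) % interval == 0:
            save_checkpoint(checkpoint_path, i + 1, pairs)
    save_checkpoint(checkpoint_path, len(records), pairs)
    return pairs, processed


# Fault injection: the first attempt fails before record 3, the second resumes.
records = ["a b", "b c", "c d", "d e", "e f"]
ckpt = os.path.join(tempfile.mkdtemp(), "map_task_0.ckpt")
try:
    run_map_task(records, ckpt, interval=2, fail_at=3)
except InjectedFault:
    pass
pairs, processed = run_map_task(records, ckpt, interval=2)
print(processed, len(pairs))
```

In this run the last checkpoint before the injected failure covers records 0 and 1, so the second attempt reprocesses only records 2 to 4 instead of all five. The checkpoint interval trades extra write overhead against the amount of work repeated after a failure.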

Keywords/Search Tags:

checkpointing fault-tolerance, intermediate results, Hadoop, MapReduce, cloud computing

Related items

1	The Design And Implementation Of A MapReduce Based Distributed Programming Framework
2	Fault Tolerance For MapReduce In The Cloud Environment
3	Research On Improving The Fault Tolerance Performance In MapReduce
4	Design And Implementation Of The Platform For Evaluating Fault Tolerance In Hadoop
5	Research And Optimization Of MapReduce Fault Tolerance Mechanisms
6	Research On Adaption Method Of Cloud Fault Tolerance Services Based On User Requirement And Resource Constriction
7	Optimization Techniques Of Proactive Fault Tolerance For Large-scale High Performance Computing Systems
8	Researches About Cloud Computing And Exploit And Test Hadoop Program
9	Study And Implementation Of Fault Tolerance For Heterogeneous Parallel Computer
10	Research On Fault-Tolerant Checkpointing Algorithm And In Software Design