
Fault Tolerance For MapReduce In The Cloud Environment

Posted on: 2013-02-16
Degree: Master
Type: Thesis
Country: China
Candidate: H Zhu
GTID: 2218330362959403
Subject: Computer application technology

Abstract/Summary:
Cloud computing has become one of the most important technologies in today's computer industry. With the rapid development of cloud technologies, data has shifted from the traditional structured form to semi-structured and unstructured forms, and at the same time its volume has exploded. Traditional database technology can no longer cope with data at this scale, so handling this Big Data has become a pressing problem. In 2004, Google presented its solution, MapReduce, to deal with the challenges created by the massive data sets of the cloud era.

In short, MapReduce is a flexible and highly available architecture for large-scale computation and data processing on a network of commodity hardware. It not only handles massive amounts of data to solve performance problems, but also simplifies the way programmers develop distributed parallel programs. More importantly, MapReduce solves the scalability and reliability issues that are its biggest advantages over traditional databases. A variety of research, both in China and abroad, has grown up around the emerging MapReduce programming framework, and fault tolerance has been one of the hottest topics in this area. Existing work on fault tolerance falls into two directions: re-execution and backup. Both attempt to improve recovery mechanisms, which can take effect only once a failure has been detected and located; if the cluster is unaware of the failure, neither approach delivers the expected performance. This thesis therefore studies the fault tolerance of MapReduce from a new perspective: how to detect a failed node in a MapReduce cluster faster and more accurately.

To address this problem, the thesis proposes two ideas: an adaptive expiry time and a reputation-based detection model. The adaptive expiry time replaces the rigid, fixed expiry time of the MapReduce cluster: it first estimates the execution time of each job and then adapts the expiry time to that estimate. At runtime, if the JobTracker receives no heartbeat message from a node within the adaptive expiry time, that node is considered failed. The reputation-based detection model gives each node a reputation value and decrements it whenever a reduce task reports a remote fetch failure against one of the node's map tasks. If a node's reputation decays to a lower limit because of too many remote fetch failures, that node is considered failed.

Extensive experimental data shows that the two proposed solutions perform much better than the original Hadoop cluster: when a node in the cluster fails, they significantly reduce the time needed to detect the failure compared with stock Hadoop. Comparative experiments further show that the adaptive expiry time favors short jobs, while the reputation-based detection model benefits large jobs. The two solutions also work effectively with existing fault-tolerance techniques, making Hadoop better at fault tolerance overall: able not only to locate failures quickly but also to recover from them quickly. The main contribution of this thesis is therefore not limited to the adaptive expiry time and the reputation-based detection model themselves; it also widens the research directions for fault-tolerant Hadoop.
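The two detection mechanisms can be summarized in a short sketch. The Java fragment below is a minimal illustration, not code from the thesis or from Hadoop itself: the class and method names (NodeMonitor, onHeartbeat, onFetchFailure) and all constants are hypothetical, and the expiry formula simply scales the estimated job execution time, standing in for whatever estimator the thesis actually uses.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Illustrative failure detector combining the two ideas in the abstract:
 * (1) an expiry time adapted to the estimated job execution time, and
 * (2) a per-node reputation decremented on reduce-side fetch failures.
 * All names and constants are hypothetical placeholders.
 */
public class NodeMonitor {
    private static final double EXPIRY_FACTOR      = 0.1;    // fraction of estimated job time
    private static final long   MIN_EXPIRY_MS      = 30_000; // lower bound on the timeout
    private static final int    INITIAL_REPUTATION = 10;
    private static final int    REPUTATION_FLOOR   = 0;      // failed at or below this value

    private final Map<String, Long>    lastHeartbeat = new ConcurrentHashMap<>();
    private final Map<String, Integer> reputation    = new ConcurrentHashMap<>();

    /** Expiry time adapted to the job: longer jobs tolerate longer silences. */
    public long adaptiveExpiryMs(long estimatedJobMs) {
        return Math.max(MIN_EXPIRY_MS, (long) (estimatedJobMs * EXPIRY_FACTOR));
    }

    /** Record a heartbeat received from a worker node. */
    public void onHeartbeat(String node, long nowMs) {
        lastHeartbeat.put(node, nowMs);
        reputation.putIfAbsent(node, INITIAL_REPUTATION);
    }

    /** A reduce task failed to fetch map output from this node: decay its reputation. */
    public void onFetchFailure(String mapNode) {
        reputation.merge(mapNode, -1, Integer::sum);
    }

    /** Failed if the node missed the adaptive expiry or its reputation hit the floor. */
    public boolean isFailed(String node, long nowMs, long estimatedJobMs) {
        long last = lastHeartbeat.getOrDefault(node, nowMs);
        boolean expired     = nowMs - last > adaptiveExpiryMs(estimatedJobMs);
        boolean blacklisted = reputation.getOrDefault(node, INITIAL_REPUTATION) <= REPUTATION_FLOOR;
        return expired || blacklisted;
    }
}
```

On this reading, the two checks complement each other: the adaptive expiry catches nodes that stop heartbeating entirely, while the reputation floor catches nodes that still heartbeat but repeatedly fail to serve map output.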
Keywords/Search Tags: MapReduce, Hadoop, massive data processing, parallel computing, adaptive