Research On Multiple Tolerance Of MapReduce Under Cloud Environment

Posted on: 2015-07-04
Degree: Master
Type: Thesis
Country: China
Candidate: Y G Li
Full Text: PDF
GTID: 2308330479951607
Subject: Computer software and theory
Abstract/Summary:
With the rapid development and spread of Internet technology, information technology has entered the big-data era. Traditional computing models can no longer meet current demands, and a new business model, cloud computing, has emerged, bringing new opportunities and challenges to the computer industry. Cloud computing is a large-scale distributed computing system that provides abstract, virtualized, dynamically adjustable, and manageable computing power, storage, platforms, and services to external users over the Internet. The MapReduce parallel programming model proposed by Google is the most typical example in cloud computing; its ease of use, high degree of parallelism, and high reliability have attracted many users. The MapReduce programming model has been studied from many angles at home and abroad, and its fault tolerance mechanism has been one of the most active research topics. Existing approaches can be summarized as two options: backup and re-execution. Both rest on one basic premise, namely that node failure is detected. If failure detection is late or inaccurate, neither approach is effective.
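To illustrate why both options depend on failure detection, the following is a minimal sketch of a timeout-based heartbeat monitor. It is not the thesis's implementation or Hadoop's actual code: the class `HeartbeatMonitor`, the node names, and the 10-second timeout are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class HeartbeatMonitor:
    """Treats a node as failed when no heartbeat arrived within `timeout` seconds."""
    timeout: float  # hypothetical expiry interval, in seconds
    last_seen: dict = field(default_factory=dict)  # node name -> time of last heartbeat

    def heartbeat(self, node: str, now: float) -> None:
        # Record the most recent heartbeat received from this node.
        self.last_seen[node] = now

    def failed_nodes(self, now: float) -> list:
        # A node is suspected only after it has been silent longer than the timeout.
        return sorted(n for n, t in self.last_seen.items() if now - t > self.timeout)


monitor = HeartbeatMonitor(timeout=10.0)
monitor.heartbeat("tracker-1", now=0.0)
monitor.heartbeat("tracker-2", now=0.0)
monitor.heartbeat("tracker-1", now=8.0)  # tracker-2 stays silent from here on
suspects = monitor.failed_nodes(now=12.0)
```

In this sketch a node that crashes right after sending a heartbeat goes unnoticed for the whole timeout interval, and neither backup nor re-execution can start before then. This is the detection delay the abstract refers to.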
This thesis studies, from a new perspective, how to detect failed nodes faster and more accurately. The work is divided into the following parts. First, based on a requirements analysis, the fault tolerance scheme is divided into three steps: multiple monitoring, application request, and task migration. The thesis analyzes the function of each step and studies the operating mechanism of multiple fault tolerance. Second, the thesis improves the traditional MapReduce framework: it builds a multiple architecture and a multiple heartbeat mechanism among TaskTracker nodes in the same rack, and adds the classes multipleTaskTracker and multipleJobTracker to TaskTracker and JobTracker to achieve multiple fault tolerance. Third, it addresses the resource competition that may arise during task migration: following the principle of fairness, tasks are migrated according to a separate rule for each of the two types of resource competition. Finally, the performance of multiple fault tolerance is analyzed in terms of response time, speedup, and the advantages of the approach. Without affecting system scalability, the multiple relationship improves the efficiency of detecting node failures, shortens job response time, and reduces bandwidth usage, network congestion, and the load on the JobTracker node. Extensive experimental data show that multiple fault tolerance performs significantly better than traditional fault tolerance without affecting the scalability of the MapReduce system. When a failed node is present in the cluster, multiple fault tolerance significantly shortens the time needed to detect the failure and readily resolves resource competition during task migration.
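The three steps named above (multiple monitoring, application request, task migration) can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the thesis's multipleTaskTracker or multipleJobTracker code: the classes `Node` and `Rack`, the peer timeout, and the "least-loaded live node" rule standing in for the fairness principle are all hypothetical. The `migrate` call plays the role of the request sent once a peer is found silent.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    tasks: list = field(default_factory=list)
    last_peer_beat: float = 0.0  # time the rack peers last heard from this node


class Rack:
    """Hypothetical group of TaskTracker-like nodes that watch one another."""

    def __init__(self, nodes, peer_timeout):
        self.nodes = {n.name: n for n in nodes}
        self.peer_timeout = peer_timeout  # assumed to be shorter than a central timeout
        self.failed = set()

    def peer_beat(self, name, now):
        # A node announces itself to its peers in the same rack.
        self.nodes[name].last_peer_beat = now

    def monitor(self, now):
        # Step 1 (monitoring): return the peers newly found to be silent.
        newly = [
            name for name in sorted(self.nodes)
            if name not in self.failed
            and now - self.nodes[name].last_peer_beat > self.peer_timeout
        ]
        self.failed.update(newly)
        return newly

    def migrate(self, failed_name):
        # Steps 2 and 3 (request and migration): hand each orphaned task to the
        # live node that currently holds the fewest tasks (assumed fairness rule).
        live = [n for name, n in sorted(self.nodes.items()) if name not in self.failed]
        if not live:
            raise RuntimeError("no live node left in the rack")
        orphaned, self.nodes[failed_name].tasks = self.nodes[failed_name].tasks, []
        placement = {}
        for task in orphaned:
            target = min(live, key=lambda n: len(n.tasks))
            target.tasks.append(task)
            placement[task] = target.name
        return placement


rack = Rack(
    [Node("tt-a", ["map-1"]), Node("tt-b", ["map-2", "reduce-1"]), Node("tt-c")],
    peer_timeout=3.0,
)
rack.peer_beat("tt-a", now=5.0)
rack.peer_beat("tt-c", now=5.0)  # tt-b has been silent since time 0.0
newly_failed = rack.monitor(now=6.0)
placement = rack.migrate("tt-b")
```

Keeping the monitoring inside the rack means the silent node is noticed by its neighbours without waiting for a central node to time it out, which is the intuition behind the reduced detection time and JobTracker load claimed in the abstract.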
Keywords/Search Tags: cloud computing, MapReduce parallel programming model, multiple tolerance mechanism