
Research On Decentralized MapReduce Of Node Failure And Fault Tolerance Mechanism

Posted on: 2016-09-05
Degree: Master
Type: Thesis
Country: China
Candidate: P Xiao
Full Text: PDF
GTID: 2308330470455461
Subject: Computer system architecture
Abstract/Summary:
MapReduce is a parallel programming model for processing massive data sets, deployed on large clusters for distributed processing. Hadoop, the most widely used open-source framework implementing this model, is subject to node failures, in particular two single points of failure, that can compromise the execution of a job. These two single points of failure, the master of the job and the NameNode of the distributed file system (DFS), affect both job execution and system performance: a crash of the DFS NameNode can leave all completed computation inaccurate, and once the master node fails, the entire system may be paralyzed.

Existing work in China and abroad on handling MapReduce node failure has proposed backup nodes, modifications to the DFS architecture, decentralized architectures, and dedicated services. These mechanisms, however, have the following problems: (1) the computing resources of the backup node are not used effectively; (2) the modified DFS architecture achieves only 80% of the performance of traditional MapReduce; (3) current decentralized MapReduce designs support only a limited number of nodes, or still rely on a backup node; (4) a dedicated service can be applied only in a specific environment.

To solve these problems, this thesis studies how current MapReduce handles node failure, starting from the characteristics of the traditional MapReduce architecture, programming model, workflow, and fault tolerance mechanism, and, combining these with the characteristics of P2P networks, proposes a decentralized solution. The research mainly covers the following aspects:

1. We study the characteristics, workflow, and fault tolerance mechanism of current MapReduce, and analyze how it handles faults, together with its problems and shortcomings under node failure, Byzantine faults, and single points of failure.

2. To address the node failure problems of current MapReduce, and drawing on the characteristics of P2P networks and the advantages of the Chord protocol as a means of decentralization, we propose a P2P-based MapReduce mechanism. We describe the architecture of P2P MapReduce, its workflow, and the way it handles faults.

3. We implement the proposed mechanism and evaluate it on a set of test data. In an experiment with a typical WordCount application, we validate that the proposed mechanism avoids single points of failure. After a crash, the system recovers in a timely manner, improving system performance and reducing replication overhead. We therefore conclude that the proposed P2P MapReduce system is feasible and can process large data sets of a certain size.
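
The abstract gives no implementation details, so the following is only an illustrative sketch of the general idea of Chord-style decentralization applied to a WordCount job, not the thesis's actual design. It assumes that each input split is owned by its successor node on a hashed identifier ring, so that a crashed node's splits fall to the next node without any central master. The names `Ring`, `chord_id`, `run_wordcount`, the 16-bit identifier space, and the node and split labels are all hypothetical. For brevity the lookup uses a global view of the ring rather than Chord's finger-table routing, and the map work is simulated locally.

```python
import hashlib
from bisect import bisect_left
from collections import Counter

ID_BITS = 16  # size of the identifier space; an arbitrary choice for this sketch


def chord_id(name: str) -> int:
    """Hash a node name or split key onto the ring [0, 2**ID_BITS)."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** ID_BITS)


class Ring:
    """Hypothetical Chord-like ring: a key is owned by its successor node."""

    def __init__(self, node_names):
        self.nodes = sorted((chord_id(n), n) for n in node_names)

    def successor(self, key: str) -> str:
        ids = [node_id for node_id, _ in self.nodes]
        # First node whose id is >= the key's id, wrapping around the ring.
        index = bisect_left(ids, chord_id(key)) % len(self.nodes)
        return self.nodes[index][1]

    def remove(self, name: str) -> None:
        """Simulate a crash: the failed node simply leaves the ring."""
        self.nodes = [(i, n) for i, n in self.nodes if n != name]


def map_words(text: str) -> Counter:
    """Map step of WordCount: count the words in one input split."""
    return Counter(text.lower().split())


def run_wordcount(ring: Ring, splits: dict):
    """Place each split on its successor node and merge the partial counts."""
    total = Counter()
    placement = {}
    for split_id, text in splits.items():
        placement[split_id] = ring.successor(split_id)
        total += map_words(text)  # in a real system this runs on the owner
    return total, placement


if __name__ == "__main__":
    splits = {"split-0": "a b a", "split-1": "b c", "split-2": "c c a"}
    ring = Ring(["node-1", "node-2", "node-3", "node-4"])
    counts, placement = run_wordcount(ring, splits)
    ring.remove(placement["split-0"])  # crash the owner of split-0
    recovered_counts, _ = run_wordcount(ring, splits)
    print(counts == recovered_counts)
```

In this sketch, removing a node changes the owner only of the splits that node held; every other split keeps its placement, which is the property that makes successor-based assignment attractive for recovery without a central coordinator.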
Keywords/Search Tags:MapReduce, P2P, Node Failure, Fault Tolerance