Research On Hadoop Fault-Tolerant Scheduling Technology

Posted on:2017-05-07

Degree:Master

Type:Thesis

Country:China

Candidate:G D Guan

Full Text:PDF

GTID:2308330485484413

Subject:Computer technology

Abstract/Summary:

With the arrival of the big data era, both the cloud industry and academia are interested in improving the fault-tolerance performance of Hadoop. As demand for cloud computing grows across all sectors, the needs of real-time jobs cannot be ignored. It is therefore important to improve the real-time performance of jobs on Hadoop while preserving fault tolerance.

Runtime prediction for MapReduce jobs is important for optimizing Hadoop performance, because it lets the scheduler order running jobs and allocate resources to them sensibly. Previous methods for predicting the runtime of MapReduce jobs, however, leave considerable room for improvement. We therefore propose a new method that predicts the runtime of a MapReduce job from the amount of data it processes and from historical job information, and we implement the method on Hadoop. The experimental results show that the proposed method predicts the runtime of MapReduce jobs accurately, with an average error of 4.2%.

In addition, the heartbeat mechanism in Hadoop is poorly suited to short jobs, and it ignores fairness when setting the expiry time of nodes in a heterogeneous cluster. To overcome this problem, a fair expiry time fault-tolerance mechanism is proposed. First, a loss model and a Fair MisJudgment Loss (FMJL) algorithm are put forward, based on the reliability and computational performance of each node and on the predicted runtime of a MapReduce job, so that the requirements of long jobs and short jobs are met at the same time. A fair expiry time mechanism based on the FMJL algorithm is then designed and implemented. For a short job of 345 seconds run on Hadoop with the proposed fair expiry time mechanism, completion time was reduced by 44% when a fault occurred on a TaskTracker node, and by 23% compared with the self-adaptive expiry time mechanism. The experimental results show that the proposed fair expiry time mechanism shortens fault-tolerance processing time without affecting the completion time of long jobs, and that it improves the real-time processing capability of a heterogeneous Hadoop cluster.

Finally, to make it easy to configure Hadoop or to change its scheduler and fault-tolerance mechanism, this thesis designs and implements a cloud platform scheduling management system, which provides a graphical interface for operating and managing the Hadoop cluster. According to actual needs, the user can select a performance optimization scheme, including a variety of schedulers and the proposed fair expiry time mechanism. Building on the proposed runtime prediction mechanism, a more accurate method for forecasting job progress has also been implemented, so the user can easily monitor and manage running jobs.
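
The abstract does not give the form of the runtime prediction model. Purely as an illustration of predicting runtime from processed data and job history, the sketch below fits a straight line between input size and observed runtime with ordinary least squares. The linear model, the names JobRecord, fit_linear, predict_runtime and mean_relative_error, and the sample numbers are all assumptions made for this example, not the thesis's method or data.

```python
from dataclasses import dataclass


@dataclass
class JobRecord:
    """One historical run of a job (hypothetical record layout)."""
    input_mb: float   # amount of data the job processed, in MB
    runtime_s: float  # observed wall-clock runtime, in seconds


def fit_linear(history):
    """Least-squares fit of runtime_s = a * input_mb + b over past runs."""
    n = len(history)
    if n < 2:
        raise ValueError("need at least two historical runs")
    mean_x = sum(r.input_mb for r in history) / n
    mean_y = sum(r.runtime_s for r in history) / n
    sxx = sum((r.input_mb - mean_x) ** 2 for r in history)
    if sxx == 0:
        raise ValueError("historical runs must differ in input size")
    sxy = sum((r.input_mb - mean_x) * (r.runtime_s - mean_y) for r in history)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def predict_runtime(history, input_mb):
    """Predict the runtime in seconds of a new run with the given input size."""
    a, b = fit_linear(history)
    return a * input_mb + b


def mean_relative_error(history, test_runs):
    """Average of |predicted - actual| / actual over held-out runs."""
    errors = [
        abs(predict_runtime(history, r.input_mb) - r.runtime_s) / r.runtime_s
        for r in test_runs
    ]
    return sum(errors) / len(errors)


# Synthetic history, invented for illustration only.
history = [JobRecord(100, 60), JobRecord(200, 110), JobRecord(400, 210)]
estimate = predict_runtime(history, 300)
```

An average error figure such as the one reported in the abstract would correspond to mean_relative_error evaluated on runs that were held out of the history. A real predictor would likely need more features than input size alone, for example job type and cluster load.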

Keywords/Search Tags:

cloud computing, time prediction, heartbeat mechanism, fault-tolerant, heterogeneous cluster

Related items

1	Research On Fault-tolerant Mechanism For SSI Cluster
2	Design And Implementation Of Cluster Fault-tolerant System
3	Research On Container Migration Mechanisms For User Level Fault Tolerance
4	Fault-tolerant Software Design And Implementation Based On Fault-tolerant Computer System
5	Research On High Availability Of Cloud Computing For Video Surveillance Analysis
6	Research On Fault-tolerant Scheduling Algorithm For Real-Time Tasks In Cloud Computing
7	Checkpoint-based Runtime Dynamic Fault Tolerant In Heterogeneous System
8	The Analysis And Design Of Cyber-Physical Systems Based On Cloud Computing
9	Research And Design Of Fault Injectors For Virtual Machine In Cloud Computing
10	Design And Implementation Of Multi-machine Fault-tolerant System On Linux