
Optimization Study and Implementation of Node Fault Tolerance Technology on the Hadoop Framework

Posted on: 2018-08-14 | Degree: Master | Type: Thesis
Country: China | Candidate: C Wang | Full Text: PDF
GTID: 2348330542488036 | Subject: Software engineering
Abstract/Summary:
Cloud computing technology has been widely adopted across industries. With its rapid development, the scale of data to be processed keeps growing and the data types are becoming more diverse, so handling this big data has become an urgent problem; among the solutions, MapReduce is the most representative distributed computing framework. As distributed systems grow in scale, the dependencies among their components become more complex, which raises the probability of system failures. Fault tolerance has therefore become an important research topic in distributed systems, and node failure detection is a key part of fault tolerance technology.

In node failure detection, factors such as the running jobs, node status, and network environment have a significant impact on detection performance. Taking these factors into account, this thesis proposes two mechanisms: a self-adaptive heartbeat mechanism based on multiple factors, and a credibility value detection mechanism based on job factors. Together they adjust the heartbeat frequency in time to suit different system environments and evaluate the status of the nodes in the system in real time.

For the heartbeat mechanism, this thesis presents a comprehensive multi-factor evaluation model for heartbeat detection. The model considers the effects of network load, the node's CPU working state, and the jobs running on the node on the heartbeat detection process. On this basis, a self-adaptive heartbeat detection algorithm based on the multi-factor evaluation model is proposed. The algorithm changes the heartbeat frequency with the network environment, the node's CPU occupancy rate, and the job size, and the combined factors yield an optimal heartbeat frequency. Experiments verify the effects of the multiple factors on the self-adaptive adjustment of the heartbeat frequency.

Building on the heartbeat mechanism, the thesis then proposes a job-based credibility value detection mechanism, in which each node is assigned a credit value. The credit value is attenuated or recovered according to the heartbeat information and the type of job. When a node's credit value falls below a minimum threshold, the node is considered to have failed; when the credit value rises to a maximum threshold, it is no longer increased. Extensive experimental data show that a system with this mechanism finds faulty nodes faster than the original system and reallocates their tasks promptly, shortening the overall job execution time. Finally, practical application and experiments demonstrate that a Hadoop system optimized with the above methods detects faulty nodes and carries out task reallocation more effectively than the original system.
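The multi-factor adaptive heartbeat policy can be sketched as a weighted scoring function: a busier network, CPU, or job load lengthens the interval between heartbeats to reduce detection overhead. The linear scaling, weights, and bounds below are illustrative assumptions, not the thesis's actual evaluation model:

```python
def adaptive_heartbeat_interval(base_interval, network_load, cpu_usage,
                                job_size_ratio, w_net=0.4, w_cpu=0.4,
                                w_job=0.2, min_interval=1.0,
                                max_interval=30.0):
    """Scale a base heartbeat interval (seconds) by a weighted load score.

    network_load, cpu_usage, and job_size_ratio are normalized to [0, 1];
    the weights (illustrative, summing to 1) control each factor's
    influence. A higher combined score stretches the interval, and the
    result is clamped to [min_interval, max_interval].
    """
    load_score = (w_net * network_load
                  + w_cpu * cpu_usage
                  + w_job * job_size_ratio)
    interval = base_interval * (1.0 + load_score)
    return max(min_interval, min(max_interval, interval))
```

Under this sketch, an idle node keeps the base interval, a fully loaded node doubles it, and the clamp keeps the frequency within operator-chosen limits regardless of the score.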
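The credibility value mechanism can be sketched as a per-node credit counter that decays on missed heartbeats and recovers on received ones, with a failure threshold below and a cap above. The concrete initial value, decay and recovery amounts, and thresholds here are hypothetical choices for illustration; the thesis additionally varies them by job type, which is omitted for brevity:

```python
class NodeCredibility:
    """Track a node's credit value, as in a credibility-based detector.

    Credit drops by `decay` for each missed heartbeat and recovers by
    `recovery` for each received one, capped at `max_credit`. A node
    whose credit falls below `min_threshold` is treated as failed.
    All numeric defaults are illustrative.
    """

    def __init__(self, initial=100.0, max_credit=100.0,
                 min_threshold=20.0, decay=15.0, recovery=5.0):
        self.credit = initial
        self.max_credit = max_credit
        self.min_threshold = min_threshold
        self.decay = decay
        self.recovery = recovery

    def on_heartbeat_received(self):
        # Recover credit, but never past the maximum threshold.
        self.credit = min(self.max_credit, self.credit + self.recovery)

    def on_heartbeat_missed(self):
        # Attenuate credit on each missed heartbeat.
        self.credit -= self.decay

    def is_failed(self):
        # Below the minimum threshold, the node is considered failed,
        # and its tasks become candidates for reallocation.
        return self.credit < self.min_threshold
```

The asymmetry between a large decay and a small recovery makes the detector quick to flag a silent node but slow to trust it again, which matches the goal of faster fault discovery and timely task reallocation.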
Keywords/Search Tags: distributed systems, heartbeat detection, multi-factor, heartbeat frequency, credibility value