
A Study Of Hadoop Cluster Performance Optimization And Average Life Based On Markov Process

Posted on: 2017-03-16  Degree: Master  Type: Thesis
Country: China  Candidate: J X Li  Full Text: PDF
GTID: 2348330488472009  Subject: Computer Science and Technology
Abstract/Summary:
As a core technology of cloud computing and big data, the Hadoop cluster is applied in many fields and is of significant commercial and scientific value. A high-performance Hadoop cluster may consist of numerous servers. However, after long periods of task execution, faults arise in the cluster, such as node failures, network congestion, task exceptions, and excessively long task execution times. These faults harm the cluster to varying degrees and are the main causes of frequent cluster failures, unreasonable resource utilization, and performance degradation; in severe cases they shorten the cluster's average life. From the perspective of reducing cluster faults, this thesis proposes an optimization model for Hadoop cluster performance prediction based on the Markov stochastic process, called the Prediction and Optimization Hadoop (POH) model. To deal with the frequent faults of Hadoop cluster nodes, we adopt the strategy of first predicting the cluster's fault state and then performing optimization and adjustment. The POH model collects valid node information in a prediction information repository and uses a Markov chain to forecast Hadoop cluster performance. According to the forecast results, the POH model optimizes cluster performance in three respects: task parallelism, HDFS data block multi-replication, and the NameNode backup cycle. Experimental results show that, with the same data size on each node, the proposed POH model effectively improves the average read-write execution speed of tasks, reduces cluster faults, and significantly increases cluster efficiency.

An excellent Hadoop cluster should have a long average life, so average life can serve as an evaluation criterion for Hadoop cluster performance. To address this, the thesis proposes a method of Hadoop cluster average life prediction based on the Poisson process. Guided by the prediction results, the cluster is adjusted and optimized in time to keep its performance in a normal state. Experiments on three Hadoop clusters with different characteristics are conducted to observe the distribution of cluster average life. The experimental results show that this method is significant for Hadoop cluster performance evaluation, and the prediction results also provide strong evidence for further adjustment of Hadoop performance parameters.
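To make the two prediction steps summarized above concrete, the following minimal Python sketches illustrate the underlying ideas. They are illustrative only: the state names, transition probabilities, fault counts, and thresholds are assumptions for demonstration, not the parameters or data used in the thesis. The first sketch shows how a Markov chain with a row-stochastic transition matrix can forecast the most likely cluster performance state a few steps ahead, which could then trigger tuning of task parallelism, HDFS replication, or the NameNode backup cycle.

```python
import numpy as np

# Illustrative sketch of a Markov-chain performance forecast.
# State names and transition probabilities are assumed for demonstration;
# in practice they would be estimated from the prediction information repository.
states = ["normal", "degraded", "faulty"]

# Row-stochastic matrix: P[i][j] = P(next state j | current state i).
P = np.array([
    [0.90, 0.08, 0.02],
    [0.30, 0.55, 0.15],
    [0.10, 0.30, 0.60],
])

def forecast(current_dist, steps=1):
    """Propagate the state distribution `steps` transitions ahead."""
    dist = np.asarray(current_dist, dtype=float)
    for _ in range(steps):
        dist = dist @ P
    return dist

# Example: cluster currently observed in the "degraded" state.
next_dist = forecast([0.0, 1.0, 0.0], steps=2)
predicted_state = states[int(np.argmax(next_dist))]
print(dict(zip(states, np.round(next_dist, 3))), "->", predicted_state)
# A tuning policy could react when "faulty" becomes the most likely state.
```

The second sketch gives a Poisson-process view of cluster average life: faults are assumed to arrive at a rate lambda estimated from observed fault counts, so the expected time to accumulate a given critical number of faults is k/lambda (an Erlang mean), and the probability of staying within that fault budget over a time horizon follows the Poisson distribution. Again, the numbers are hypothetical.

```python
import math

# Illustrative Poisson-process life estimate (fault counts and the
# "critical" fault threshold below are invented for the example).
fault_counts_per_day = [1, 0, 2, 1, 3, 0, 1]                  # observed faults per day
lam = sum(fault_counts_per_day) / len(fault_counts_per_day)   # rate estimate (faults/day)

# Inter-fault times are exponential with mean 1/lambda, so the expected
# time until k accumulated faults is k / lambda.
k_critical = 20                                               # assumed fault budget
expected_life_days = k_critical / lam
print(f"lambda = {lam:.2f} faults/day, expected life ~ {expected_life_days:.1f} days")

def survival_prob(t, k, rate):
    """P(at most k faults occur within t days) under a Poisson process."""
    return sum(math.exp(-rate * t) * (rate * t) ** n / math.factorial(n)
               for n in range(k + 1))

print(f"P(at most {k_critical} faults in 14 days) = {survival_prob(14, k_critical, lam):.3f}")
```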
Keywords/Search Tags:Hadoop cluster, Markov process, prediction optimization model, cluster fault, average life