
Research And Implementation On The Method For Scheduling MapReduce Job With The Consideration Of Performance Interference Among Virtual Machines

Posted on: 2016-11-07
Degree: Master
Type: Thesis
Country: China
Candidate: C W Gao
GTID: 2348330512470920
Subject: Software engineering
Abstract/Summary:
With the rapid development of Internet technology, the volume of data generated on the Internet each day has grown dramatically. Most enterprises now handle large, complex workloads with distributed computing systems, and the MapReduce framework is widely used across organizations. Optimizing MapReduce has consequently become an active research topic, and many scheduling methods have been proposed. Most existing work on MapReduce scheduling assumes perfect performance isolation, meaning that the execution of one job does not affect the others. However, because virtualization is widely used in cloud computing, performance interference among virtual machines means that absolute isolation cannot be ensured among the applications deployed on them, and in a virtualized environment it is difficult to guarantee job performance during execution. This thesis therefore proposes a scheduling method that takes performance interference into account.

First, to address performance interference among jobs in a virtualized environment, the thesis establishes a framework for evaluating performance interference among virtual machines, including a model for evaluating the degree of interference. For virtual machines with historical interference data, it proposes a model training algorithm based on a BP neural network. For virtual machines without historical interference data, it presents an algorithm that generates the model based on similarity calculation.

Building on these, the thesis presents a MapReduce job scheduling mechanism that determines the resources to allocate and the jobs to schedule in the next interval. The mechanism uses two algorithms: one estimates the remaining execution time of a task from the task's progress, and the other computes the scheduling interval from the remaining execution time. The thesis then formulates a mathematical model that aims to minimize overall performance interference and proposes a greedy algorithm to solve it. In this way, performance interference can be reduced and the effectiveness of job scheduling can be guaranteed.

Finally, the thesis builds a distributed Hadoop cluster as an experimental environment and analyzes the results of comparative experiments, which verify the feasibility and effectiveness of the proposed interference-aware MapReduce job scheduling method.
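The abstract does not give the formulas behind these algorithms, so the following is only a minimal sketch of two of the ideas it names: estimating a task's remaining execution time from its progress, and greedily placing tasks where the predicted interference is lowest. Every name here (estimate_remaining_time, greedy_schedule, toy_model, slots_per_vm) is hypothetical. The remaining-time formula assumes a constant progress rate, and toy_model is a stand-in for the thesis's trained interference model, not a reproduction of it.

```python
from typing import Callable, Dict, List

# Hypothetical signature: (task, vm, tasks already on vm) -> predicted interference.
InterferenceModel = Callable[[str, str, List[str]], float]


def estimate_remaining_time(elapsed_s: float, progress: float) -> float:
    """Estimate remaining run time, assuming the task progresses at a constant rate.

    progress is the fraction of work completed, in (0, 1].
    """
    if not 0.0 < progress <= 1.0:
        raise ValueError("progress must be in (0, 1]")
    return elapsed_s * (1.0 - progress) / progress


def greedy_schedule(
    tasks: List[str],
    vms: List[str],
    slots_per_vm: int,
    predict: InterferenceModel,
) -> Dict[str, str]:
    """Assign each task, in order, to the free VM with the lowest predicted interference."""
    placed: Dict[str, List[str]] = {vm: [] for vm in vms}
    assignment: Dict[str, str] = {}
    for task in tasks:
        # Only VMs with a free slot are candidates for this task.
        free = [vm for vm in vms if len(placed[vm]) < slots_per_vm]
        if not free:
            raise RuntimeError("not enough free slots for all tasks")
        best = min(free, key=lambda vm: predict(task, vm, placed[vm]))
        placed[best].append(task)
        assignment[task] = best
    return assignment


def toy_model(task: str, vm: str, colocated: List[str]) -> float:
    """Illustrative stand-in: interference grows with the number of co-located tasks."""
    base = {"vm1": 1.0, "vm2": 1.5}  # made-up per-VM sensitivity values
    return base[vm] * (1 + len(colocated))
```

A greedy pass like this makes one locally cheapest choice per task and never revisits it, so it is fast but does not guarantee the global minimum of total interference.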
Keywords/Search Tags:cloud-computing, performance interference among virtual machines, MapReduce Task Scheduler, BP Neural Network