Research on Task Scheduling Technology for Cloud Computing Platforms

Posted on: 2012-07-19  Degree: Master  Type: Thesis
Country: China  Candidate: L Y Li  Full Text: PDF
GTID: 2248330395485672  Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of cloud computing in recent years, Hadoop, a cloud platform designed mainly for massive data-intensive jobs, has been put to commercial use by large IT companies, social networking services, and mobile telecom operators. Hadoop processes massive data efficiently in parallel and can be deployed on ordinary servers. A large Hadoop cluster contains hundreds of nodes, so the choice of scheduling technique for assigning tasks to those nodes is very important. A good task scheduling strategy can greatly improve task response time and system throughput, and can also raise system resource utilization. Research on task scheduling technology for cloud computing platforms is therefore of great significance.

This thesis presents the following studies of scheduling techniques for the Hadoop cloud computing platform:

Firstly, it studies the basic framework of the Hadoop platform. Hadoop consists mainly of the data storage layer HDFS and the parallel task processing model MapReduce. Based on the Hadoop framework, the thesis explains the task processing flow and analyzes the data storage characteristics and data flow. It also introduces the development of scheduling techniques on the Hadoop platform and analyzes the properties and limitations of the existing Hadoop scheduling algorithms.

Secondly, based on the data storage properties of Hadoop, it proposes an improved LATE algorithm based on data locality. When scheduling backup (speculative) execution, the algorithm takes nodes and racks into account: priority goes to a task whose data is stored on the requesting node or in its local rack. If no candidate task has data on the local node or rack, the algorithm considers assigning the speculative task to another rack.

Thirdly, it uses statistical probability to address the problem of tasks waiting so long that response time suffers. According to the Law of Rare Events, it weighs the gains and losses of data-locality optimization in scheduling.

Finally, it simulates the Hadoop architecture on the CloudSim simulation platform, setting different job types and parameter values in the simulation experiments to analyze algorithm performance comprehensively. Compared with other algorithms, the improved algorithm shows clear advantages in localized task processing, job response time, and system throughput. Simulation results show that the improved data-locality-based algorithm resolves the data-locality performance bottleneck in Hadoop scheduling.
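
The node/rack priority described for speculative execution can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The names `Task`, `rack_of`, `locality_tier`, and `pick_speculative_task` are hypothetical. Ranking candidates inside a tier by the longest estimated time left follows the usual LATE heuristic, and combining it with locality in this way is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional

# Locality tiers, best first (hypothetical constants for this sketch).
NODE_LOCAL, RACK_LOCAL, OFF_RACK = 0, 1, 2


@dataclass(frozen=True)
class Task:
    task_id: str
    replica_nodes: FrozenSet[str]  # nodes holding a replica of the task's input
    progress: float                # fraction completed, 0.0 to 1.0
    progress_rate: float           # progress per second

    def time_left(self) -> float:
        # LATE-style estimate: remaining work divided by observed rate.
        if self.progress_rate <= 0:
            return float("inf")
        return (1.0 - self.progress) / self.progress_rate


def locality_tier(task: Task, node: str, rack_of: Dict[str, str]) -> int:
    """Classify how close the task's input data is to the requesting node."""
    if node in task.replica_nodes:
        return NODE_LOCAL
    if any(rack_of[n] == rack_of[node] for n in task.replica_nodes):
        return RACK_LOCAL
    return OFF_RACK


def pick_speculative_task(
    candidates: List[Task], node: str, rack_of: Dict[str, str]
) -> Optional[Task]:
    """Prefer node-local, then rack-local, then off-rack candidates.

    Within the best available tier, pick the task with the longest
    estimated time left (assumed tie-breaking rule).
    """
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda t: (locality_tier(t, node, rack_of), -t.time_left()),
    )


# Example: three slow tasks, with input replicas on different nodes.
rack_of = {"n1": "r1", "n2": "r1", "n3": "r2"}
a = Task("a", frozenset({"n3"}), progress=0.2, progress_rate=0.01)
b = Task("b", frozenset({"n2"}), progress=0.5, progress_rate=0.01)
c = Task("c", frozenset({"n1"}), progress=0.9, progress_rate=0.01)
choice = pick_speculative_task([a, b, c], "n1", rack_of)
```

In this example `choice` is task `c`: its data is on the requesting node `n1`, so it wins even though task `a` has more estimated time left. A scheduler that ranked only by time left would pick `a` and read its input across racks.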
Keywords/Search Tags: Cloud Computing, Hadoop, Data Locality, Speculative Execution, CloudSim