
Research On Data Placement And Task Scheduling Algorithm

Posted on: 2014-01-20  Degree: Master  Type: Thesis
Country: China  Candidate: Q Wang  Full Text: PDF
GTID: 2248330395497859  Subject: Computer software and theory
Abstract/Summary:
With the development of information technology and the Internet, data-intensive computing is becoming more common. Cloud computing, with its low price, high-performance and interactive computing, and easy-to-use resources, has developed rapidly in industry and has also become a hot research topic, especially with regard to resource management and task scheduling. In the past, the management of data copies focused mainly on security and ignored the fact that the copies themselves can be accessed and used to increase data locality; the number of copies was defined statically. Data transmission was planned without considering the network bandwidth between the nodes of the cloud computing system, which may increase the cost of data transmission. In load balancing, existing algorithms do not start from the tasks themselves: some come very close to the optimal solution but have too high a time complexity, while others are simple but balance load poorly. In addition, improving the overall performance of a cloud computing system requires considering not only data locality or load balancing, but also balancing each node's data transmission time, task execution time, and task queuing time.

To solve the above problems, we propose an algorithm framework that dynamically adjusts the number of copies of data based on access frequency, data size, and storage space. Following the principle of minimum transmission time, it uses data correlation, node dependence, network bandwidth, and system information in an iterative loop to search for a data placement and task scheduling scheme with a smaller data transmission time. Based on the idea of load-target driving, it can also assign tasks at a fine granularity to balance the load on each node.
It can also balance task queuing time against data transfer time in order to achieve higher system throughput.

This thesis focuses on data placement and task scheduling. Its main contents are as follows:

1. The first chapter briefly introduces the background: how the development of information technology makes cloud computing inevitable, its future trends, and the importance of data placement and task scheduling. We then analyze existing work on data placement and task scheduling, together with the problems and shortcomings of past research. Finally, we introduce the main work and the organizational structure of the thesis.
2. In the second chapter, we first define and explain cloud computing, and describe in detail its overall layered architecture, the relationships between the layers, and the role of each. We then briefly introduce the cloud computing technologies related to this thesis, so that the background of data placement and task scheduling in a cloud computing environment can be better understood. Finally, we describe the specific cloud computing environment, including the cloud computing model, the details of file read operations, and the data processing and task scheduling model.
3. In the third chapter, we first explore how the size and access frequency of the data itself affect its number of copies, and introduce the formula for dynamically adjusting the number of copies. We then compare the number of data transmissions, data transmission size, and data transmission time, taking into account the size of the data itself, the network bandwidth, storage space, and other factors. We choose data transmission time as the measurement standard and derive its formula.
Second, this thesis proposes a load-target-driven method: from each compute node's performance and the tasks to be assigned, it computes a target task amount for the node and then the deviation from that target, which is used to measure and adjust the load balance of the system. In addition, the thesis proposes system throughput as a measure of the overall performance of the cloud computing system and derives its formula. Having explored these factors, we propose an algorithm framework that applies a different calculation function for each objective and, through loop iteration and repeated relaxation, searches for a more reasonable data placement and task scheduling scheme. Experiments show that the data placement and task scheduling schemes found by this algorithm achieve better results.
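The abstract does not reproduce the thesis's formulas, so the following Python sketch is only an illustration of two of the measures it names: data transmission time and deviation from a load target. It assumes that transmission time is data size divided by link bandwidth, and that each node's target task amount is the total task count split in proportion to node performance. The names `Node`, `transmission_time`, `target_tasks`, and `load_deviation` are hypothetical and do not come from the thesis.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Node:
    name: str
    performance: float   # relative compute capacity (hypothetical unit)
    assigned_tasks: int  # tasks currently scheduled on this node


def transmission_time(size_mb: float, bandwidth_mb_per_s: float) -> float:
    """Assumed model: time = data size / bandwidth between two nodes."""
    if bandwidth_mb_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return size_mb / bandwidth_mb_per_s


def target_tasks(nodes: List[Node]) -> Dict[str, float]:
    """Assumed target: split all tasks in proportion to node performance."""
    total_tasks = sum(n.assigned_tasks for n in nodes)
    total_perf = sum(n.performance for n in nodes)
    return {n.name: total_tasks * n.performance / total_perf for n in nodes}


def load_deviation(nodes: List[Node]) -> Dict[str, float]:
    """Positive value: node is above its target; negative: below it."""
    targets = target_tasks(nodes)
    return {n.name: n.assigned_tasks - targets[n.name] for n in nodes}


# Illustrative cluster: node "a" is twice as fast as "b" and "c".
nodes = [Node("a", 2.0, 10), Node("b", 1.0, 20), Node("c", 1.0, 10)]
deviation = load_deviation(nodes)
```

In this made-up cluster, node `a` is below its target and node `b` is above it, so a scheduler driven by the load target would move tasks from `b` to `a`, weighing that move against the transmission time of any data the tasks need.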
Keywords/Search Tags: Cloud Computing, Data Placement, Task Scheduling, Data Copying, Data Transmission, Load Balancing, Throughput