Research On Data Placement And Fault-Tolerant Scheduling For Applications Of Data Stream In Geo-distributed Clouds

Posted on: 2020-04-21
Degree: Master
Type: Thesis
Country: China
Candidate: X H Huang
Full Text: PDF
GTID: 2428330623967022
Subject: Software engineering
Abstract/Summary:
With the rapid development of sensor networks, data stream applications such as environmental monitoring increasingly rely on cloud services. Many cloud services today are deployed on geographically distributed infrastructure, with cloud data centers located in different regions for better performance. Processing a job in a geo-distributed cloud requires that all of its data first be transferred to the cloud data center where the job resides. The longer data placement takes, the higher the job latency, which in turn increases the response time of the data stream application. After data placement, jobs are scheduled for processing in each cloud data center. During job execution, time spent on stragglers can delay the entire job. How to place data effectively in geo-distributed clouds and how to handle stragglers during job execution have therefore become urgent problems, and research on data placement and fault-tolerant scheduling methods for data stream applications in geo-distributed clouds is of both theoretical and practical significance. To address these scenarios and problems, this thesis covers the following aspects.

(1) To reduce data transmission time and bandwidth cost during data placement while respecting the capacity limit of each cloud data center and maintaining load balance across the geo-distributed clouds, this thesis designs a data placement algorithm based on Lagrangian relaxation. The algorithm first builds a mathematical model of the data placement problem under three constraints: cost, capacity, and load balancing. It then transforms the data transmission cost problem into a multi-source shortest path problem on a directed weighted graph and uses the Floyd algorithm to find the minimum transmission bandwidth cost. Finally, the data placement objective is transformed from a complex integer programming problem into a linear programming problem, and the Lagrangian relaxation method is used to obtain the placement scheme with the smallest transmission time. This achieves the goals of reducing data transmission time and maintaining system load balance.

(2) To improve the efficiency of job execution in the cloud, this thesis proposes a fault-tolerant scheduling method based on speculative execution. The method has two parts: a task replica creation method and a task scheduling algorithm. Depending on the state of the cluster, replicas are created using one of two strategies, task cloning or straggler detection. With task cloning, replicas of a task are created before the task starts, based on the job deadline and the cluster resource status, and all replicas then run at the same time. With straggler detection, after a task starts executing, its remaining execution time and the benefit of creating a replica are used to decide whether it is a straggler; if so, a replica of the straggler is run on another node. Combining this speculative execution model with the FAIR scheduling algorithm of the Spark platform yields a fault-tolerant scheduling algorithm based on speculative execution, which shortens job completion time and improves cluster throughput and the QoS satisfaction rate.

(3) The performance of the proposed algorithms is verified by comparison with existing algorithms. In the experiments on the Lagrangian-relaxation-based data placement algorithm, the results show that it reduces both data transmission time and data transmission bandwidth cost while improving the degree of load balancing. In the experiments on the speculative-execution-based fault-tolerant scheduling algorithm, the results show that it reduces the average application completion time and improves cluster throughput and the QoS satisfaction rate.
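
The abstract states that the transmission cost problem is reduced to a multi-source shortest path problem on a directed weighted graph and solved with the Floyd algorithm. The following is a minimal sketch of that step only, not the thesis's implementation. The three-data-center cost matrix and the names `floyd_warshall`, `link_cost`, and `min_cost` are assumptions made for illustration; the thesis's actual graph model and cost values are not given in the abstract.

```python
import math

INF = math.inf


def floyd_warshall(cost):
    """Return the all-pairs minimum-cost matrix for a directed weighted graph.

    cost[i][j] is the direct cost of sending one unit of data from node i
    to node j, or INF if there is no direct link.
    """
    n = len(cost)
    dist = [row[:] for row in cost]  # copy so the input is not modified
    for k in range(n):               # allow node k as an intermediate hop
        for i in range(n):
            for j in range(n):
                via_k = dist[i][k] + dist[k][j]
                if via_k < dist[i][j]:
                    dist[i][j] = via_k
    return dist


# Hypothetical per-unit bandwidth costs between three data centers
# (illustrative values only; not taken from the thesis).
link_cost = [
    [0, 4, 11],
    [6, 0, 2],
    [3, INF, 0],
]

min_cost = floyd_warshall(link_cost)
for row in min_cost:
    print(row)
```

In this toy input, data center 0 reaches data center 2 more cheaply by relaying through data center 1 (cost 4 + 2) than over the direct link (cost 11), which is the kind of saving the shortest-path formulation is meant to capture.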
Keywords/Search Tags:Geo-distributed clouds, Applications of data flow, Spark cluster, Lagrangian relaxation, Speculative execution