Workflow Scheduling With Privacy Protection In Hybrid Cloud Environment

Posted on: 2021-12-01
Degree: Master
Type: Thesis
Country: China
Candidate: X Li
Full Text: PDF
GTID: 2518306557987369
Subject: Computer technology
Abstract/Summary:
Privacy protection is an important issue in workflow scheduling. This thesis considers deadline and task privacy constraints for Spark applications in a big data processing framework, and studies the problem of scheduling Spark applications in hybrid cloud environments with the objective of minimizing rental cost. The problem poses two challenges: (i) the privacy tasks in a Spark application make it difficult to balance completion time against rental cost; (ii) the two-layer precedence constraints in a Spark application give rise to a large number of stage topological orders, and determining an appropriate stage scheduling sequence to minimize rental cost is NP-hard. The characteristics of the problem are analyzed and the corresponding mathematical model is built. A new architecture based on the existing Spark scheduling architecture is proposed, and the Spark Scheduling with Privacy Protect in Hybrid Cloud algorithm (SSPPH) is presented. The algorithm consists of four parts: stage sub-deadline division, stage scheduling sequence generation, task scheduling, and scheduling result adjustment. Three stage sub-deadline division rules are introduced to assign a sub-deadline to each stage based on estimated temporal parameters: the Execution Time Based Sub-deadline, Level Based Sub-deadline, and Crucial Path Based Sub-deadline rules. In the stage sequence generation part, three sequencing rules are presented: Maximum Rank First, Minimum Float Time First, and the Task Based Priority Rule. According to the privacy of tasks, a privacy task scheduling algorithm and a non-privacy task scheduling algorithm are designed. To improve VM utilization in the private cloud, the First Usable First, Earliest Finishing First, and Smallest Waste First rules are proposed to make full use of private VMs. To minimize rental cost, the Minimum Left Slot First, First Available First, and Maximum Left Slot First strategies are proposed to take advantage of idle slots in rented VMs. Scheduling result adjustment is then applied to reduce the number of rented VMs. To evaluate the performance of the proposed algorithm, the multi-factor analysis of variance (ANOVA) technique is adopted to calibrate the components of the algorithm. SSPPH is analyzed and compared with two cost optimization algorithms. Experimental results indicate that the proposed algorithm outperforms the compared algorithms under different deadlines, application scales, and numbers of private VMs.
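As an illustration of the stage sub-deadline division step, the following minimal sketch splits an application deadline across stages in proportion to their estimated earliest finish times. The abstract does not give the formulas behind the three division rules, so the proportional rule used here, the function name `execution_time_based_subdeadlines`, and the four-stage example graph are all assumptions made for illustration, not the thesis's definitions.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def execution_time_based_subdeadlines(exec_time, preds, deadline):
    """Assign each stage a sub-deadline proportional to its earliest finish time.

    exec_time: dict mapping stage -> estimated execution time
    preds:     dict mapping stage -> set of predecessor stages
    deadline:  deadline of the whole application
    (Hypothetical helper; the proportional rule is an assumption.)
    """
    # Earliest finish time of each stage, visiting predecessors first.
    earliest_finish = {}
    for stage in TopologicalSorter(preds).static_order():
        start = max((earliest_finish[p] for p in preds.get(stage, ())), default=0.0)
        earliest_finish[stage] = start + exec_time[stage]

    # The longest path through the stage graph bounds the completion time.
    critical_path = max(earliest_finish.values())

    # Stretch the estimated schedule so that the last stage ends at the deadline.
    return {s: deadline * f / critical_path for s, f in earliest_finish.items()}


# Illustrative stage graph: s1 precedes s2 and s3, which both precede s4.
exec_time = {"s1": 2.0, "s2": 4.0, "s3": 1.0, "s4": 3.0}
preds = {"s1": set(), "s2": {"s1"}, "s3": {"s1"}, "s4": {"s2", "s3"}}
subdeadlines = execution_time_based_subdeadlines(exec_time, preds, deadline=18.0)
```

In this example the longest path is s1, s2, s4 with length 9, so every estimated finish time is scaled by a factor of 2 and the final stage receives the full deadline as its sub-deadline.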
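The three idle-slot strategies can plausibly be read as best-fit, first-fit, and worst-fit choices among the idle slots of already rented VMs. The sketch below follows that reading. The abstract does not define the rules precisely, so the `Slot` type, the `pick_slot` function, the rule labels, and the interpretation of "left slot" as the idle time remaining after the task is placed are assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Slot:
    vm: str        # identifier of the rented VM (hypothetical)
    start: float   # time at which the idle slot begins
    length: float  # idle time that has already been paid for


def pick_slot(slots: List[Slot], task_time: float, rule: str) -> Optional[Slot]:
    """Return the idle slot chosen for a task under the given rule, or None."""
    # Only slots long enough to hold the task are candidates.
    feasible = [s for s in slots if s.length >= task_time]
    if not feasible:
        return None  # no idle slot fits; a new VM would have to be rented
    if rule == "min_left":     # Minimum Left Slot First, read as best fit
        return min(feasible, key=lambda s: s.length - task_time)
    if rule == "first_avail":  # First Available First, read as earliest start
        return min(feasible, key=lambda s: s.start)
    if rule == "max_left":     # Maximum Left Slot First, read as worst fit
        return max(feasible, key=lambda s: s.length - task_time)
    raise ValueError(f"unknown rule: {rule}")


# Three idle slots on rented VMs and one task that needs 3 time units.
slots = [Slot("vm1", 0.0, 5.0), Slot("vm2", 2.0, 3.0), Slot("vm3", 1.0, 10.0)]
chosen = {rule: pick_slot(slots, 3.0, rule).vm
          for rule in ("min_left", "first_avail", "max_left")}
```

Under this reading, the minimum-left rule packs the task into the tightest slot (vm2), the first-available rule takes the earliest slot (vm1), and the maximum-left rule keeps the largest remainder free for later tasks (vm3).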
Keywords/Search Tags: Spark, Privacy Task, Cost Optimization, Hybrid Cloud