
A Value-oriented Heuristic Algorithm For Scheduling Parallel Applications In Cloud Computing Environment

Posted on: 2019-10-14
Degree: Master
Type: Thesis
Country: China
Candidate: Q S Shao
GTID: 2428330545953679
Subject: Computer Science and Technology
Abstract/Summary:
A cloud data analytics service platform delivers a new type of public cloud offering, through which end users can outsource their jobs to a group of professional cloud data processing services in a pay-per-use way. Unlike in other types of cloud services, parallel jobs dominate the domain of cloud computing and Big Data processing. The execution time of a parallel job can vary greatly with its runtime configuration, such as the degree of parallelism. In such a market-oriented environment, scheduling end users' jobs efficiently to optimize the revenue of a Spark-based cloud data analytics service platform is a novel and challenging research problem. In addition, because of the complex structure of parallel application models and the heterogeneous computing capacities of resources in a scalable environment such as a cloud computing platform, performance analysis and prediction for parallel applications under different running configurations become increasingly difficult. In this work, we use real-world job profiles to address these two problems with an experiment-driven method.

First, as the basis of effective scheduling, we present a collaborative-filtering-based performance prediction algorithm for Spark parallel applications, which can quickly and accurately predict the execution time of parallel applications running on a cloud data analytics service platform. Second, we propose a value-oriented heuristic algorithm for scheduling parallel jobs with admission control, which improves on the FirstReward heuristic to optimize the platform operator's revenue in a Spark-based cloud data analytics service platform. The proposed heuristic takes into account not only the dynamic revenue gained from completing a job within a specific runtime and the cost of the resources needed to achieve that runtime, but also the potential loss the system incurs by running this job instead of the other jobs currently waiting.

We have conducted extensive experiments and simulations based on workload data derived from a real-world data analytics service platform and parallel applications. The results indicate that our performance prediction approach achieves an extremely low error, which benefits from the history profiles and feedback correction. The experiments also show that our scheduler outperforms the scheduling algorithms used for comparison, which are based on classical heuristics from the literature, thereby demonstrating the effectiveness of our market-oriented heuristic scheduling algorithm. Our scheduler can therefore effectively improve the profits of service providers that supply Spark-based Big Data analytics services.
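The three quantities the heuristic weighs (revenue, resource cost, and loss imposed on waiting jobs) can be illustrated with a minimal sketch. This is not the thesis's actual formulation: the linearly decaying revenue function, the per-core-second price, the rule of rejecting candidates with a non-positive score, and all names (`Job`, `revenue`, `score`, `pick_next`) are assumptions made only for illustration. The predicted runtimes per degree of parallelism stand in for the output of the performance prediction step.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    max_value: float   # assumed: revenue if the job finished at time 0
    decay_rate: float  # assumed: revenue lost per second of completion delay
    runtimes: dict     # assumed: predicted runtime in seconds, keyed by core count


def revenue(job, finish_time):
    """Hypothetical linearly decaying revenue, floored at zero."""
    return max(0.0, job.max_value - job.decay_rate * finish_time)


def score(job, cores, waiting, now, price_per_core_second):
    """Value of starting `job` now with `cores` cores (illustrative only)."""
    runtime = job.runtimes[cores]
    gain = revenue(job, now + runtime)               # dynamic revenue
    cost = price_per_core_second * cores * runtime   # resource consumption
    # Rough opportunity cost: the revenue every other waiting job loses
    # when it is pushed back by `runtime` seconds.
    loss = sum(
        revenue(other, now) - revenue(other, now + runtime)
        for other in waiting
        if other is not job
    )
    return gain - cost - loss


def pick_next(jobs, now, free_cores, price_per_core_second):
    """Return (score, job, cores) for the best candidate, or None.

    Candidates with a non-positive score are skipped, which acts as a
    simple stand-in for admission control.
    """
    best = None
    for job in jobs:
        for cores in job.runtimes:
            if cores > free_cores:
                continue
            s = score(job, cores, jobs, now, price_per_core_second)
            if s > 0 and (best is None or s > best[0]):
                best = (s, job, cores)
    return best
```

Because the score is evaluated per degree of parallelism, the same job can be preferred with more cores when the shorter runtime raises its revenue and reduces the delay imposed on other jobs by more than the extra resource cost.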
Keywords/Search Tags:value-oriented, big data analytics, cloud service platform, heuristic job scheduling