
Algorithmic Design And Implementation For Performance Optimization Of Big Data Workflows In Multi-cloud Environment

Posted on: 2018-02-16
Degree: Master
Type: Thesis
Country: China
Candidate: M Shi
Full Text: PDF
GTID: 2348330512999346
Subject: Computer application technology
Abstract/Summary:
With the development of cloud computing, multi-cloud environments composed of different types of clouds are finding more applications. Using multiple clouds makes it possible to pool limited resources to complete large-scale computing tasks, and combining different types of cloud can use computing resources efficiently and reduce cost. In many scientific applications, obtaining the final result requires executing a series of interdependent computing tasks, including data generation, data processing, and data analysis. These tasks and the dependencies between them constitute a workflow. Workflow technology has become an indispensable component of many big data applications.

In a multi-cloud environment, the clouds are usually distributed over a large geographical area and connected by conventional networks such as the Internet, so guaranteeing bandwidth between clouds is relatively expensive. Inside a cloud computing center, by contrast, tasks commonly communicate through shared storage or a network file system, at relatively low cost. When a big data workflow runs, the monetary and time costs of communication between cloud centers therefore make up a major part of the total communication cost.

Current cloud computing services are divided into three classes: Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS). IaaS virtualizes computing resources and provides the virtual resources to users, which makes it the most suitable type of service for executing big data workflows. The virtual hardware resources provided by the cloud environment can be used to execute the tasks of a workflow, in the same way that workflows are traditionally executed on clusters.

In practical applications, budget and reliability are both important constraints on the deployment of big data workflows. The budget constraint ensures that the workflow application can be completed at relatively low cost, and the reliability constraint deals with uncertainty and failures in real cloud environments. This thesis addresses workflow mapping problems under budget and reliability constraints in multi-cloud environments. It formulates mathematical models of the problems, analyzes their complexity, and designs heuristic algorithms that are compared with other algorithms experimentally. The problems are shown to be NP-complete, and the experiments show that the proposed solution outperforms previous solutions. The main research on workflow mapping in this thesis is divided into the following parts:

Problem formulation and complexity analysis. The thesis builds mathematical models of workflow mapping problems that optimize end-to-end delay under budget and reliability constraints. Complexity analysis relates the proposed problems to a known NP-complete problem, from which the thesis concludes that they are NP-complete.

Problem analysis and algorithm design. The heuristic algorithms RMCWM and EMCWM are proposed for the two problems. Each algorithm is divided into two stages: the assignment of virtual machines, and the selection of physical machines and physical links. Reliability is modeled with the 3GG model. The thesis analyzes the features of this model, identifies a regular pattern for optimizing reliability, and integrates that pattern into the algorithm design, with good experimental results.

Simulation experiments and performance evaluation. The algorithms are tested experimentally on clouds of different scales and with different numbers of clouds. The proposed algorithms outperform previous algorithms across the different cloud scales.
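To make the optimization objective concrete, the following is a minimal sketch of how the end-to-end delay of one candidate mapping could be evaluated. It is not the RMCWM or EMCWM algorithm from the thesis. It assumes the workflow is a directed acyclic graph, that each task has a fixed run time, and that a transfer takes data size divided by bandwidth, with a lower bandwidth between clouds than inside one. All names (`end_to_end_delay`, `placement`, `intra_bw`, `inter_bw`) and all numbers are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def end_to_end_delay(tasks, edges, placement, intra_bw, inter_bw):
    """Return the finish time of the last task for one task-to-cloud mapping.

    tasks:     {task_name: run_time}            (assumed fixed run times)
    edges:     {(producer, consumer): data_size}
    placement: {task_name: cloud_name}          (the candidate mapping)
    intra_bw / inter_bw: assumed bandwidth inside a cloud / between clouds
    """
    # Predecessor lists, in the form TopologicalSorter expects.
    preds = {t: [] for t in tasks}
    for producer, consumer in edges:
        preds[consumer].append(producer)

    finish = {}
    for t in TopologicalSorter(preds).static_order():
        ready = 0.0
        for p in preds[t]:
            # Transfers that cross clouds use the slower inter-cloud link.
            bw = intra_bw if placement[p] == placement[t] else inter_bw
            ready = max(ready, finish[p] + edges[(p, t)] / bw)
        finish[t] = ready + tasks[t]
    return max(finish.values())


# Hypothetical three-task workflow: generation -> processing -> analysis.
tasks = {"generate": 2.0, "process": 4.0, "analyze": 3.0}
edges = {("generate", "process"): 10.0, ("process", "analyze"): 5.0}

single_cloud = {"generate": "A", "process": "A", "analyze": "A"}
split_clouds = {"generate": "A", "process": "B", "analyze": "B"}

delay_single = end_to_end_delay(tasks, edges, single_cloud, 10.0, 1.0)
delay_split = end_to_end_delay(tasks, edges, split_clouds, 10.0, 1.0)
print(delay_single, delay_split)
```

Under these assumed numbers, moving one edge across clouds raises the delay from 10.5 to 19.5, which illustrates why inter-cloud communication dominates the cost described above. A mapping heuristic would search over `placement` while also checking the budget and reliability constraints, which this sketch leaves out.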
Keywords/Search Tags: workflow mapping, cloud computing, performance optimization, big data