Research On Key Technologies Of Reliability Optimization For Cloud Applications

Posted on: 2020-07-24
Degree: Doctor
Type: Dissertation
Country: China
Candidate: N Wu
Full Text: PDF
GTID: 1368330614950617
Subject: Computer system architecture
Abstract/Summary:
Cloud computing has become one of the mainstream directions of information technology research and application. Its pay-as-you-go pricing, high scalability, and low maintenance cost have attracted more and more enterprises to deploy their applications and services in the cloud. However, cloud platforms are usually large-scale and consist of a large number of distributed nodes, and the highly dynamic nature of the cloud environment makes single-node failures frequent. Building highly reliable cloud applications is therefore a challenging and critical research problem.

The challenges of improving the reliability of cloud applications lie mainly in the following aspects:

(1) Cloud services take different forms, are built with different techniques and programming languages, and have different focuses for reliability improvement. The reliability of the cloud infrastructure is the basis of the reliability of the whole cloud system.

(2) Applications deployed in the cloud are usually complicated and consist of a large number of components. Fault prevention and fault removal techniques alone are not sufficient, given the complexity of cloud applications and the dynamics of cloud resources.

(3) Cloud applications often consist of many complex tasks and are time-limited, so reliability assurance approaches must meet both time and monetary constraints.

(4) Cloud service architecture develops very fast. The emerging microservices architecture breaks the restraints of the monolithic architecture with its new features, which makes it more suitable for running in clouds. However, it also poses both new opportunities and new challenges for fault tolerance design.

To address these challenges, we focus our research on reliability-oriented resource scheduling and management technologies in the cloud computing environment. The major contributions of this thesis are as follows:

1. Cloud application infrastructure management overemphasizes resource utilization and energy saving and does not consider reliability enough. To tackle this problem, this thesis proposes a reliability optimization method for cloud application infrastructure based on available capacity control. The core idea is to apply different placement constraints to key and non-key virtual machines during virtual machine placement, and to control the available capacity of physical machines according to their criticality. The experimental results show that the proposed virtual machine placement method effectively controls resource cost without lowering the satisfaction rate of virtual machine requests, and ensures the reliability of the virtual machine set, providing a reliable running environment for cloud applications.

2. Based on the component attributes and structural characteristics of cloud application systems, this thesis proposes FSCRank, a component ranking method that takes component failure into account. The method integrates component reliability attributes and cloud application structure attributes into the ranking algorithm. To solve the dead-end problem of structure-based ranking algorithms, a new type of node, the buffer node, is introduced into the structure graph of the cloud application. The improved structure graph avoids the dead-end problem. In addition, as importance values propagate through the system, the buffer node can assign importance values according to the combined failure effect of each node, which significantly improves the ranking quality of FSCRank. We compared FSCRank with two baseline algorithms on cloud application reliability optimization in a number of experiments, and analyzed the factors that affect the performance of all three. The experimental results show that FSCRank is suitable for ranking the components of cloud application systems and can effectively improve the reliability of cloud applications.

3. To meet the growing task complexity and data scale of cloud applications, this thesis designs a dynamic workflow scheduling method based on hybrid temporal-spatial fault tolerance (DFTWS). The method consists of three stages: static information computing, resource pre-allocation, and online scheduling. In the static information computing stage, DFTWS calculates the time attributes of each task and identifies the critical path of the workflow. In the resource pre-allocation stage, DFTWS chooses the appropriate virtual machine type according to the flexible execution time and budget quota of each task. In the online scheduling stage, DFTWS makes real-time fault-tolerant scheduling decisions according to the failures that actually occur and the task attributes, to ensure the smooth execution of the workflow. We evaluated DFTWS on multiple real workflow models and analyzed the factors that influence its performance. The experimental results show that DFTWS achieves a balance between high reliability and low cost.

4. To address the frequent failure events in the cloud environment, this thesis designs a dynamic adaptive optimization framework for cloud application reliability that takes monitoring information into account. Building on existing cloud platform monitoring technology, the framework focuses on the research and development of cloud application testing methods and proposes a non-intrusive testing framework for microservice cloud applications. The optimization framework can collect the running information of cloud applications, and can also test cloud applications in specific scenarios and collect their running information there. The two kinds of running information complement each other and provide a more comprehensive basis for cloud application scenario analysis and for selecting a reliability optimization strategy, so as to optimize cloud application reliability.
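The abstract says that DFTWS calculates the time attributes of each task and identifies the critical path of the workflow, but it does not give the procedure. The following is a minimal sketch of the standard critical path method on a workflow DAG, not the thesis's actual implementation. The function name `critical_path`, the input shapes, and the use of slack as the "time attribute" are all assumptions made for illustration.

```python
from collections import defaultdict, deque


def critical_path(durations, edges):
    """Textbook critical path method (an illustrative assumption, not DFTWS itself).

    durations: dict mapping task name -> estimated execution time
    edges: list of (predecessor, successor) dependency pairs
    Returns (makespan, slack per task, critical tasks in topological order).
    """
    succ = defaultdict(list)
    pred = defaultdict(list)
    indegree = {task: 0 for task in durations}
    for u, v in edges:
        succ[u].append(v)
        pred[v].append(u)
        indegree[v] += 1

    # Topological order via Kahn's algorithm.
    order = []
    queue = deque(t for t in durations if indegree[t] == 0)
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in succ[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    if len(order) != len(durations):
        raise ValueError("workflow graph contains a cycle")

    # Forward pass: earliest start time of each task.
    earliest_start = {}
    for t in order:
        earliest_start[t] = max(
            (earliest_start[p] + durations[p] for p in pred[t]), default=0
        )
    makespan = max(earliest_start[t] + durations[t] for t in order)

    # Backward pass: latest start time that does not delay the workflow.
    latest_start = {}
    for t in reversed(order):
        latest_finish = min((latest_start[s] for s in succ[t]), default=makespan)
        latest_start[t] = latest_finish - durations[t]

    # Slack is the delay a task can absorb; zero-slack tasks are critical.
    slack = {t: latest_start[t] - earliest_start[t] for t in order}
    critical = [t for t in order if abs(slack[t]) < 1e-9]
    return makespan, slack, critical
```

For a four-task diamond workflow with durations A=3, B=2, C=4, D=1 and dependencies A to B, A to C, B to D, and C to D, this sketch reports a makespan of 8, a critical path of A, C, D, and 2 time units of slack for task B. Under the assumptions above, such slack could plausibly play the role of the "flexible execution time" mentioned for the resource pre-allocation stage, although the abstract does not define that term.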
Keywords/Search Tags: Cloud application, reliability, fault tolerant, virtual machine placement, workflow scheduling