
Learning To Schedule Dynamic Virtual Machines Via Reinforcement Learning

Posted on: 2022-08-27
Degree: Master
Type: Thesis
Country: China
Candidate: Y Q Hu
Full Text: PDF
GTID: 2518306479993259
Subject: Software engineering
Abstract/Summary:
With the rapid development of cloud computing, large-scale dynamic virtual machine scheduling is becoming increasingly important. A cloud service provider purchases a large number of physical servers, builds a data center, and then partitions the servers' computing resources into virtual machines that it rents out for profit. A well-designed dynamic virtual machine scheduling system can save cloud service providers considerable cost, which has made the problem a major research topic in recent years. The main difficulty in practical virtual machine scheduling is that creation and deletion requests usually arrive irregularly and arbitrarily.

Existing scheduling algorithms usually model dynamic virtual machine scheduling as a dynamic vector bin packing problem and do not explicitly consider servers with multiple non-uniform memory access (NUMA) nodes. In practice, cloud service providers widely use NUMA servers in order to offer virtual machines with larger specifications. This server architecture introduces a new virtual machine scheduling mechanism, which poses new challenges for the scheduling problem and calls for a new scheduling system. In addition, existing dynamic scheduling algorithms suffer from insufficient solution efficiency, reliance on local information only, and an inability to exploit historical data.

To address the above challenges, the contributions of this thesis are as follows:

1. Propose a formal model of dynamic virtual machine scheduling with non-uniform memory access architecture. Since existing work does not cover the non-uniform memory access architecture, the problem has no clear mathematical formulation. This thesis works through the logic of the problem and formally models it as a constrained combinatorial optimization problem.

2. Propose a single-agent dynamic virtual machine scheduling algorithm based on a deep Q-network. Since the proposed constrained combinatorial optimization problem is difficult to solve directly, this thesis reformulates it as a Markov decision process from a single-agent view and proposes SchedRL, a reinforcement learning scheduling algorithm based on a double deep Q-network, to obtain an approximate solution. Because sparse rewards in the naive modeling make sampling inefficient, this thesis designs a special difference-based reward function and a scenario-driven efficient sampling mechanism.

3. Propose a multi-agent dynamic virtual machine scheduling algorithm based on a value decomposition network. As the problem scale grows, the single-agent method suffers from an explosion of the state space and action space. Therefore, this thesis reformulates the original Markov decision process from a multi-agent view and proposes SchedMARL, a reinforcement learning scheduling algorithm based on the value decomposition network. A special reward function and an efficient sampling mechanism are used here as well.

To train and evaluate the proposed algorithms, this thesis develops a simulation system for the dynamic virtual machine scheduling process on the non-uniform memory access architecture and designs two experimental settings: a creation-only scenario and an ordinary scenario. In both scenarios, the proposed algorithms are evaluated on the public Azure data set, covering a baseline comparison, a reward function study, and a sampling strategy ablation experiment. Compared with the traditional greedy algorithm, SchedRL performs better in both scenarios. SchedMARL converges faster in the creation-only scenario but is not suitable for the ordinary scenario. The reward function study shows the superiority of the difference-based reward function proposed in this thesis. The sampling strategy ablation experiment offers guidance for choosing the sampling parameters.
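For readers unfamiliar with the two value-based techniques named in the abstract, the following is a minimal Python sketch of their standard textbook forms: the double deep Q-network bootstrap target and the value decomposition network's additive joint value. It is not the thesis's implementation. The function names, the discount factor `gamma`, and the representation of Q-values as plain lists are assumptions made for illustration, and the thesis's actual state encoding, reward function, and sampling mechanism are not reproduced here.

```python
from typing import Sequence


def double_dqn_target(
    reward: float,
    next_q_online: Sequence[float],
    next_q_target: Sequence[float],
    gamma: float = 0.99,  # assumed discount factor, not taken from the thesis
    done: bool = False,
) -> float:
    """Standard double DQN target for one transition.

    The online network selects the next action and the target network
    evaluates it, which reduces the overestimation of plain Q-learning.
    """
    if done:
        # No bootstrapping from a terminal state.
        return reward
    # Action selection uses the online network's Q-values.
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # Action evaluation uses the target network's Q-values.
    return reward + gamma * next_q_target[best_action]


def vdn_joint_q(per_agent_q: Sequence[float]) -> float:
    """Value decomposition network: the joint Q-value is the sum of the
    Q-values that each agent assigns to its own chosen action."""
    return sum(per_agent_q)
```

In a scheduling setting, each entry of the Q-value lists would correspond to one candidate placement, but that mapping is likewise an assumption of this sketch.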
Keywords/Search Tags:Dynamic virtual machine scheduling, Multiple non-uniform memory accesses, Reinforcement learning, Cloud computing