In cloud computing, virtualization technology is commonly used to manage resources efficiently, and optimizing virtual machine placement is one of the important means of improving the performance of cloud computing systems. An efficient virtual machine placement strategy not only saves cloud service providers substantial cost but also reduces carbon emissions and protects the ecological environment. Virtual machine placement refers to using the scheduling module to continuously adjust the mapping between virtual machines and physical nodes so as to reach the best placement strategy and optimize a predetermined objective. Most existing virtual machine placement management systems aim for the lowest energy consumption or the highest resource utilization. Their modeling process ignores the impact of resource competition on performance and does not explicitly consider servers built on a non-uniform memory access architecture. The non-uniform memory access architecture is widely used in cloud computing because of its scalability, but its remote memory accesses also cause more competition for shared resources, which in turn degrades system performance. A new scheduling module is therefore urgently needed to improve system performance.

For cloud computing centers with a non-uniform memory access architecture, this thesis proposes a virtual machine placement algorithm based on reinforcement learning that effectively handles various types of shared resource competition. First, the virtual machine scheduling problem is transformed into a virtual machine placement decision problem by means of a defined virtual machine array. The cross-entropy method is used as the sampling technique: a discrete probability distribution represents the physical node to which each virtual machine is bound, and maximum likelihood estimation is used to obtain the probability distribution of the best quality. The average number of instructions executed by the processor per clock cycle is used as the measure of competition for shared resources in the system and is abstracted as the objective function. The Post algorithm, based on reinforcement learning, is then proposed to update the placement policy toward this objective, and finally multi-model decision-making is used to find the maximum objective value and thereby obtain the optimal placement strategy.

In the experiments, a placement model of virtual machines under the non-uniform memory access architecture was constructed, and, in the same environment, the proposed Post algorithm was compared with a current supervised learning algorithm, a policy gradient algorithm, a proximal policy optimization algorithm, and a cross-entropy algorithm based on supervised learning. The experimental results show that multi-model decision-making performs significantly better than single-model decision-making and that the Post algorithm performs well in several respects: 1) its overall optimal solution value is greater than that of the other algorithms; 2) it performs better in iterative optimization time, and its total time overhead is smaller; 3) it is more robust. Based on the experimental results, the Post algorithm can place virtual machines reasonably under the non-uniform memory access architecture. By adjusting the binding between virtual machines and nodes, it obtains the optimal solution in a short time, reducing the competition among virtual machines for shared resources and improving the overall performance of the system.
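
The sampling step described above (a discrete distribution over node bindings, refit by maximum likelihood on the best samples) can be illustrated with a minimal sketch of the plain cross-entropy method. This is not the Post algorithm or the thesis implementation. The function names, the hyperparameters, and in particular `toy_ipc_score` are assumptions introduced for illustration: the score is a hypothetical stand-in for measured instructions per cycle that simply penalizes virtual machines sharing a node.

```python
import random


def toy_ipc_score(placement, num_nodes):
    """Hypothetical stand-in for measured IPC (an assumption, not from the thesis).

    A VM's score drops as more VMs are bound to the same node.
    """
    load = [0] * num_nodes
    for node in placement:
        load[node] += 1
    return sum(1.0 / load[node] for node in placement) / len(placement)


def cross_entropy_placement(num_vms, num_nodes, score_fn, iterations=30,
                            samples=200, elite_frac=0.1, smoothing=0.7, seed=0):
    """Plain cross-entropy search over VM-to-node bindings (illustrative only)."""
    rng = random.Random(seed)
    # probs[v][n]: probability that VM v is bound to node n; starts uniform.
    probs = [[1.0 / num_nodes] * num_nodes for _ in range(num_vms)]
    n_elite = max(1, int(samples * elite_frac))
    best_placement, best_score = None, float("-inf")

    for _ in range(iterations):
        # Sample candidate placements from the current distribution.
        population = []
        for _ in range(samples):
            placement = [rng.choices(range(num_nodes), weights=probs[v])[0]
                         for v in range(num_vms)]
            population.append((score_fn(placement, num_nodes), placement))

        # Keep the highest-scoring samples as the elite set.
        population.sort(key=lambda item: item[0], reverse=True)
        elites = population[:n_elite]
        if elites[0][0] > best_score:
            best_score, best_placement = elites[0]

        # The maximum likelihood estimate of a categorical distribution is the
        # empirical frequency in the elite set; smoothing blends it with the
        # previous distribution so no probability collapses to zero at once.
        for v in range(num_vms):
            counts = [0] * num_nodes
            for _, placement in elites:
                counts[placement[v]] += 1
            for n in range(num_nodes):
                mle = counts[n] / n_elite
                probs[v][n] = smoothing * mle + (1 - smoothing) * probs[v][n]

    return best_placement, best_score, probs


if __name__ == "__main__":
    placement, score, _ = cross_entropy_placement(4, 4, toy_ipc_score)
    print("placement:", placement, "score:", score)
```

In a real system the score would come from hardware performance counters rather than a closed-form function, so each evaluation would be far more expensive than in this toy setting.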