| Reducing the cost of Virtual Network Function (VNF) placement, achieving system load balancing, and improving the overall benefits of service providers remain central concerns of network functions virtualization. Existing methods fall mainly into heuristic algorithms and machine learning-based algorithms. However, neither type of algorithm adequately meets the varied and complex service function requirements that arise under 5G. On the one hand, the solution quality of heuristic algorithms declines as the problem size grows, so that the resulting solution no longer meets the demands of service providers. On the other hand, the performance of traditional machine learning-based algorithms is unsatisfactory for high-complexity services. This thesis proposes a virtual network function placement and load balancing model based on reinforcement learning. The main contents and contributions are as follows: (1) From the perspective of edge service providers, this thesis regards the charges for the various IT resources required to provide services as system revenue, and service capability and energy consumption as system expenditure, and on this basis establishes a Mixed Integer Programming (MIP) model to describe the placement of virtual network functions on distributed edge nodes. The proposed optimization problem is solved by constructing an Actor-Critic based Deep Reinforcement Learning (DRL) framework called RL-GCN. The connections between multiple VNFs in a virtual network are treated as a more general mesh structure, which accommodates various complex functional requirements, and a Graph Convolutional Network (GCN) is further introduced to extract features, which serve as the input of the framework and improve the quality of the output solution. The experimental results show that the proposed RL-GCN significantly improves solution quality while maintaining solution efficiency, and significantly increases system
revenue. (2) Building on RL-GCN, an improved load balancing algorithm, WFRL-LBP, is proposed that combines a heuristic method with deep reinforcement learning. The algorithm contains two sub-algorithms: WFD-LBP and RL-LBP. The heuristic WFD-LBP algorithm is based on the worst-fit decreasing algorithm, and the RL-LBP algorithm is an improved version of RL-GCN. When the system load is low, WFRL-LBP exploits the speed advantage of the heuristic method and uses WFD-LBP to deploy virtual network functions quickly, reducing response time. When the system load is high, it exploits the advantage of deep reinforcement learning in solution quality and uses RL-LBP to migrate and adjust the deployed virtual network functions through live migration, without service interruption. Experiments show that the WFRL-LBP algorithm effectively reduces response time and container communication cost. |
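
The load-dependent switch between WFD-LBP and RL-LBP, and the worst-fit decreasing rule that WFD-LBP builds on, can be illustrated with a minimal sketch. This is not the thesis's implementation: the single scalar resource demand per VNF, the function names `worst_fit_decreasing` and `choose_strategy`, and the load threshold of 0.7 are all assumptions made for illustration only.

```python
from typing import Dict, List, Optional


def worst_fit_decreasing(
    demands: Dict[str, float], capacities: List[float]
) -> Optional[Dict[str, int]]:
    """Place each VNF on the node that currently has the most free capacity.

    Returns a mapping VNF name -> node index, or None if some VNF
    cannot be hosted. A single scalar resource is assumed.
    """
    if not capacities:
        return {} if not demands else None
    free = list(capacities)  # remaining capacity per node
    placement: Dict[str, int] = {}
    # "Decreasing": handle the largest demands first.
    ordered = sorted(demands.items(), key=lambda kv: kv[1], reverse=True)
    for name, demand in ordered:
        # "Worst fit": pick the node with the most remaining capacity.
        node = max(range(len(free)), key=lambda i: free[i])
        if free[node] < demand:
            return None  # even the emptiest node cannot host this VNF
        free[node] -= demand
        placement[name] = node
    return placement


def choose_strategy(load: float, threshold: float = 0.7) -> str:
    """Hypothetical dispatcher: heuristic at low load, DRL at high load.

    The threshold value is an illustrative assumption.
    """
    return "WFD-LBP" if load < threshold else "RL-LBP"


if __name__ == "__main__":
    vnf_demands = {"fw": 4, "nat": 3, "ids": 2, "lb": 1}
    print(worst_fit_decreasing(vnf_demands, [6, 6]))
    print(choose_strategy(0.3), choose_strategy(0.9))
```

Because worst fit always chooses the least-loaded node, the placement tends to spread VNFs evenly across nodes, which is consistent with the load balancing goal stated above; the RL-LBP branch is represented here only by its name.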