
Research On Network Utility Optimization Based On Reinforcement Learning

Posted on: 2022-06-17
Degree: Master
Type: Thesis
Country: China
Candidate: J Hu
Full Text: PDF
GTID: 2518306572491354
Subject: Computer application technology
Abstract/Summary:
Network Function Virtualization (NFV) is an emerging approach that virtualizes network services traditionally run on dedicated devices. By running an ordered chain of Virtual Network Functions (VNFs) on commodity hardware, NFV improves the flexibility and scalability of the network and reduces deployment and administration costs. At the same time, hosting VNFs on platforms such as cloud computing has become a promising option for service providers. Given the cost diversity in cloud computing, it is important from the service provider's perspective to orchestrate VNFs and schedule traffic flows for Network Utility Maximization (NUM), since maximal utility implies maximal revenue. However, traditional heuristic solutions based on mathematical models usually rely on impractical assumptions when solving the NUM problem, which limits their applicability in practice.

To overcome these limitations, and inspired by the successful application of Deep Reinforcement Learning (DRL) to network control, a VNF scheduling algorithm based on Deep Deterministic Policy Gradient (DDPG) is proposed. However, as the scale of the NUM problem grows, the slow convergence of the DDPG-based algorithm becomes prominent. To speed up training, a model-assisted acceleration scheme is designed that uses the traditional optimization model as guidance. Unlike standard DDPG, which lets the agent explore actions blindly, the proposed model-assisted DDPG (mDDPG) adds a profiling module that uses heuristic solutions derived from optimization models to guide the agent's exploratory behavior in the early stage of training, improving the exploration efficiency of the DDPG agent.

The mDDPG algorithm is then applied to the joint VNF orchestration and flow scheduling problem. By continually exploring the network environment, mDDPG learns the changing patterns of user requests and the relationship between traffic allocation and service delay, adaptively orchestrating VNFs and scheduling traffic flows to achieve long-term network utility maximization. Extensive simulation experiments on real-world traces verify the effectiveness and efficiency of the model-assisted DRL framework: compared with the DDPG algorithm, the proposed mDDPG converges 56% faster and improves network utility by 43% after convergence.
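The core idea of the model-assisted scheme, biasing early exploration toward actions suggested by the optimization-model heuristic and then handing control back to the learned DDPG actor, can be sketched as follows. This is a minimal illustrative sketch under stated assumptions, not the thesis implementation: the function names, the linear decay schedule, and the `guide_steps` parameter are all assumptions introduced here for clarity.

```python
import random

def guided_action(agent_action, heuristic_action, step, guide_steps=1000):
    """Blend heuristic guidance with the agent's own exploration.

    Early in training (step < guide_steps), the probability of
    following the model-based heuristic decays linearly to zero;
    afterwards the agent acts on its own, as in standard DDPG.
    All names and the decay schedule are illustrative assumptions.
    """
    p_guide = max(0.0, 1.0 - step / guide_steps)
    if random.random() < p_guide:
        return heuristic_action  # action suggested by the optimization model
    return agent_action          # action produced by the DDPG actor network

# Toy usage: a heuristic that always suggests 1, an actor that outputs 0.
early = [guided_action(0, 1, step=0) for _ in range(1000)]     # fully guided
late = [guided_action(0, 1, step=2000) for _ in range(1000)]   # fully autonomous
```

One plausible design rationale for such a schedule is that the heuristic supplies reasonable (if suboptimal) actions while the critic's value estimates are still unreliable, after which unguided exploration lets the agent discover policies the model-based heuristic cannot express.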
Keywords/Search Tags: Deep Reinforcement Learning, VNF Orchestration, Flow Scheduling, Network Utility Maximization