
Research On Data Center Network Traffic Scheduling Based On Deep Reinforcement Learning

Posted on: 2022-09-27
Degree: Master
Type: Thesis
Country: China
Candidate: J J Lu
Full Text: PDF
GTID: 2518306491967359
Subject: Electronics and Communications Engineering
Abstract/Summary:
Software-defined data center networks carry a mix of flows: elephant flows, which have no deadline requirement but consume large amounts of bandwidth, and mice flows, which are delay sensitive and have strict deadlines, compete with each other for limited network resources. Scheduling this flow mix effectively is therefore a challenging problem. The main work of this paper is as follows.

(1) Combining software-defined networking (SDN) technology with deep reinforcement learning (DRL), this paper proposes DRL-PLink, a private-link-based DRL scheduling approach. DRL-PLink partitions link bandwidth by flow type, establishing a private link for each type of flow so that different kinds of flows are isolated and competition between them is reduced. DRL-PLink then uses DRL to adaptively allocate bandwidth to these private links, making the network scheduling policy intelligent. Building on the DDPG algorithm, DRL-PLink introduces clipped double Q-learning (sketched below) to mitigate value overestimation and thereby improve the trained scheduling policy. Experiments are carried out in a simulated data center network built with Ryu and Mininet. Under realistic data center traffic loads (web-search and data-mining workloads), DRL-PLink maintains a high deadline meet rate (DMR, > 97%), similar to pFabric and Karuna, while reducing the average flow completion time (FCT) by 65.6% and 57.12% compared with ECMP and pFabric, respectively.

(2) To speed up convergence and address insufficient exploration during training, this paper introduces a prioritized experience replay buffer and the NoisyNet mechanism into the clipped-double-Q DDPG algorithm (both sketched below). Prioritized experience replay improves the model's input data: by assigning priorities to the transitions in the buffer, the deep neural network learns from the most valuable samples at each training step, which accelerates learning. NoisyNet improves exploration efficiency: noise parameters are injected into the network weights, increasing the perturbation of the actions the algorithm explores and helping the model discover better actions. Experimental results show that, relative to the previous scheme, the improved algorithm raises the DMR of mice flows by about 0.2% and improves the average FCT of elephant flows by about 17.3%. The overhead of DRL-PLink itself is small: memory utilization stays between 1.5% and 1.7%, and CPU utilization stays below 1.7%.

(3) To reduce the complexity of the flow-scheduling policy model so that it can be deployed on edge devices with low overhead, and to improve the model's interpretability, this paper uses imitation learning to compress and distil the complex deep neural network model. A professor (teacher) model and a student model are designed, and the DAgger algorithm (sketched below) is used to extract the professor model's policy into a decision-tree student model. Simulation experiments show that the student model occupies about 60% less space than the professor model while keeping the DMR for scheduling mice flows at about 97%.
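The clipped double Q-learning modification in (1) can be illustrated with a short sketch. The following is a minimal PyTorch sketch, not the thesis code; the actor/critic call signatures and the `done` convention are assumptions. It shows how the bootstrapped TD target takes the minimum of two independent target critics, which curbs the value overestimation that plain DDPG suffers from.

```python
import torch

def clipped_double_q_target(reward, next_state, done,
                            target_actor, target_critic1, target_critic2,
                            gamma=0.99):
    """TD target y = r + gamma * min(Q1'(s', mu'(s')), Q2'(s', mu'(s'))).

    Hypothetical interfaces: target_actor(state) -> action,
    target_critic(state, action) -> Q-value; done is 1.0 for terminal states.
    """
    with torch.no_grad():
        next_action = target_actor(next_state)        # deterministic target policy a' = mu'(s')
        q1 = target_critic1(next_state, next_action)  # first target critic's estimate
        q2 = target_critic2(next_state, next_action)  # second target critic's estimate
        min_q = torch.min(q1, q2)                     # pessimistic ("clipped") value estimate
        return reward + gamma * (1.0 - done) * min_q  # no bootstrap past terminal states
```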
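The prioritized experience replay mechanism from (2) can be summarized by the following minimal NumPy sketch of a proportional-priority buffer; the class name, hyperparameters (alpha, beta), and transition format are assumptions, not the thesis implementation. Transitions with larger TD error receive higher priority and are therefore replayed more often, and importance-sampling weights correct the resulting bias.

```python
import numpy as np

class PrioritizedReplayBuffer:
    def __init__(self, capacity, alpha=0.6, eps=1e-6):
        self.capacity, self.alpha, self.eps = capacity, alpha, eps
        self.data, self.priorities, self.pos = [], np.zeros(capacity), 0

    def add(self, transition):
        # New transitions get the current maximum priority so they are replayed at least once.
        max_prio = self.priorities.max() if self.data else 1.0
        if len(self.data) < self.capacity:
            self.data.append(transition)
        else:
            self.data[self.pos] = transition
        self.priorities[self.pos] = max_prio
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        prios = self.priorities[:len(self.data)] ** self.alpha
        probs = prios / prios.sum()
        idx = np.random.choice(len(self.data), batch_size, p=probs)
        # Importance-sampling weights correct the bias from non-uniform sampling.
        weights = (len(self.data) * probs[idx]) ** (-beta)
        weights /= weights.max()
        return [self.data[i] for i in idx], idx, weights

    def update_priorities(self, idx, td_errors):
        # Called after each critic update with the fresh TD errors of the sampled batch.
        self.priorities[idx] = np.abs(td_errors) + self.eps
```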
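The NoisyNet mechanism from (2) replaces ordinary linear layers with layers whose weights receive learned, factorised Gaussian noise, so exploration comes from the network parameters themselves rather than from externally added action noise. Below is a minimal PyTorch sketch of such a layer; the initialisation scheme and sigma0 value are assumptions rather than the thesis's settings.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class NoisyLinear(nn.Module):
    """Linear layer with factorised Gaussian weight noise (a sketch, not the thesis code)."""

    def __init__(self, in_features, out_features, sigma0=0.5):
        super().__init__()
        self.in_features, self.out_features = in_features, out_features
        self.weight_mu = nn.Parameter(torch.empty(out_features, in_features))
        self.weight_sigma = nn.Parameter(torch.empty(out_features, in_features))
        self.bias_mu = nn.Parameter(torch.empty(out_features))
        self.bias_sigma = nn.Parameter(torch.empty(out_features))
        self.register_buffer("eps_in", torch.zeros(in_features))
        self.register_buffer("eps_out", torch.zeros(out_features))
        bound = 1.0 / math.sqrt(in_features)
        nn.init.uniform_(self.weight_mu, -bound, bound)
        nn.init.uniform_(self.bias_mu, -bound, bound)
        nn.init.constant_(self.weight_sigma, sigma0 / math.sqrt(in_features))
        nn.init.constant_(self.bias_sigma, sigma0 / math.sqrt(in_features))

    @staticmethod
    def _scaled_noise(size):
        x = torch.randn(size)
        return x.sign() * x.abs().sqrt()

    def forward(self, x):
        # Resample factorised noise on every forward pass; the learned sigmas
        # control how much exploration noise each weight and bias receives.
        self.eps_in.copy_(self._scaled_noise(self.in_features))
        self.eps_out.copy_(self._scaled_noise(self.out_features))
        weight = self.weight_mu + self.weight_sigma * torch.outer(self.eps_out, self.eps_in)
        bias = self.bias_mu + self.bias_sigma * self.eps_out
        return F.linear(x, weight, bias)
```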
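The DAgger-based distillation in (3) can be sketched as follows; the environment and professor-policy interfaces, the rollout horizon, and the tree depth are hypothetical. The student decision tree is rolled out, every visited state is labelled with the professor network's action, the labelled data are aggregated across iterations, and the tree is refit, so the student is corrected on the states its own mistakes lead it to.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def dagger_distill(env, professor_policy, iterations=10, horizon=200, max_depth=8):
    """Distil a neural 'professor' policy into a decision-tree student via DAgger."""
    states, actions = [], []
    student = DecisionTreeRegressor(max_depth=max_depth)
    for it in range(iterations):
        s = env.reset()
        for _ in range(horizon):
            # Label every visited state with the professor's action (the expert correction).
            states.append(s)
            actions.append(professor_policy(s))
            # After the first iteration the student chooses where to go,
            # so the dataset covers the states the student actually visits.
            if it == 0:
                a = professor_policy(s)
            else:
                a = student.predict(np.asarray(s).reshape(1, -1))[0]
            s, _, done, _ = env.step(a)
            if done:
                break
        # Refit the tree on the aggregated dataset of all iterations so far.
        student.fit(np.asarray(states), np.asarray(actions))
    return student
```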
Keywords/Search Tags:Deep reinforcement learning, Mix-flow scheduling, Private link, Software-Defined Datacenter networks, Imitation learning