Design and Implementation of Routing Algorithm Based on Transfer Reinforcement Learning

Posted on: 2021-05-26
Degree: Master
Type: Thesis
Country: China
Candidate: R Q Zhang
Full Text: PDF
GTID: 2518306050465404
Subject: Communication and Information System
Abstract/Summary:
With the development of the Internet and cloud computing technologies and the growth of network applications and services, computer networks are becoming more complex. At the same time, network traffic is dynamic, so routing algorithm design is a difficult decision-making problem. Traditional routing algorithms are based on heuristic rules that rely on a human understanding of the workload and the network environment. Designing and implementing a proper routing algorithm thus takes at least weeks, which is inefficient, and parameter setting and optimization are difficult. Designing routing algorithms that can balance the traffic load in different network scenarios is therefore a challenging task. In this thesis, we apply reinforcement learning to routing, using a neural network to fit the mapping between network state and routing strategy, so that the routing algorithm can select the best routing strategy without any human intervention.

However, current reinforcement learning routing algorithms still have problems such as poor convergence when the routing decision space is large, high data collection costs, and large computational overhead. They balance network load poorly when the network environment is complex, and searching a large space for acceptable routing strategies is difficult. To solve these problems, we present DEBU (Difference of the Equal-cost-path Bandwidth Utilization), a reinforcement-learning-based load-balancing routing algorithm. The algorithm trains reinforcement learning agents asynchronously in a distributed setting. Executing multiple agents in parallel on multiple instances of the network environment improves exploration and discourages premature convergence to suboptimal deterministic policies. The algorithm observes previous network environment states and then learns to choose an appropriate set of split ratios for the traffic load, in order to minimize the difference in bandwidth utilization among the equal-cost paths and make the network load more balanced. We test it on both Fat-tree and randomly generated network topologies and achieve better performance than existing approaches: network throughput increases by at least 12% compared with a weight-based reinforcement learning routing algorithm and by at least 8% compared with a DDPG-based multi-path routing algorithm.

Collecting enough network traffic data and training the reinforcement learning algorithm take a long time, so this thesis also proposes an optimization scheme for the routing decision model based on transfer learning. To address the lack of data in the target network, the scheme uses a feature-representation transfer learning method that learns a set of common transfer components underlying both datasets, so that the difference between the data distributions of the two datasets is greatly reduced by minimizing the maximum mean discrepancy (MMD) distance between the experimental network dataset and the target network dataset. To make the algorithm converge faster, a model-based transfer learning method preserves the knowledge common to the experimental and target networks by fine-tuning the first n layers of the model trained in the experimental network. Simulation results show that the feature-representation transfer learning method reduces the distance between the data distributions of the two datasets and makes it possible to train a regression model even on small datasets; the network throughput of the proposed algorithm increases by 8.7%. Compared with random parameter initialization, starting from a model pre-trained in the experimental network converges faster, because the model only needs to adapt to the idiosyncrasies of the target network data. The model training time is reduced by 13.7%, and the model is more robust.
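
The two quantities the abstract relies on can be illustrated with a minimal Python sketch. It is not the thesis implementation, and the abstract does not give exact definitions, so the following are assumptions: the utilization difference is taken as the maximum minus the minimum utilization across the equal-cost paths, the MMD uses a Gaussian (RBF) kernel with a biased estimator, and the names `utilization_spread`, `mmd_squared`, and `gamma` are hypothetical.

```python
import math
from typing import Sequence


def utilization_spread(split_ratios: Sequence[float],
                       demand: float,
                       capacities: Sequence[float]) -> float:
    """Spread of bandwidth utilization across equal-cost paths.

    Assumption: "difference" is taken as max minus min utilization.
    split_ratios[i] is the fraction of the demand sent over path i.
    """
    if len(split_ratios) != len(capacities):
        raise ValueError("one split ratio is needed per path")
    if abs(sum(split_ratios) - 1.0) > 1e-9:
        raise ValueError("split ratios must sum to 1")
    utilizations = [r * demand / c for r, c in zip(split_ratios, capacities)]
    return max(utilizations) - min(utilizations)


def rbf_kernel(x: Sequence[float], y: Sequence[float], gamma: float) -> float:
    """Gaussian kernel exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def mmd_squared(source: Sequence[Sequence[float]],
                target: Sequence[Sequence[float]],
                gamma: float = 1.0) -> float:
    """Biased estimate of the squared maximum mean discrepancy.

    Assumption: an RBF kernel; the abstract does not name the kernel.
    """
    def mean_kernel(xs, ys):
        total = sum(rbf_kernel(x, y, gamma) for x in xs for y in ys)
        return total / (len(xs) * len(ys))

    return (mean_kernel(source, source)
            + mean_kernel(target, target)
            - 2.0 * mean_kernel(source, target))


if __name__ == "__main__":
    # Two equal-cost paths of equal capacity: an even split balances the load.
    print(utilization_spread([0.5, 0.5], demand=10.0, capacities=[10.0, 10.0]))
    print(utilization_spread([0.8, 0.2], demand=10.0, capacities=[10.0, 10.0]))

    # Samples from clearly different distributions give a larger MMD.
    experimental = [(0.0, 0.0), (0.1, 0.0)]
    target_net = [(5.0, 5.0), (5.1, 5.0)]
    print(mmd_squared(experimental, experimental))
    print(mmd_squared(experimental, target_net))
```

In this sketch a smaller spread means a more balanced load, which is the direction a DEBU-style reward would push the split ratios. A smaller MMD means the experimental and target feature distributions are closer, which is what the feature-representation transfer step aims for.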
Keywords/Search Tags: Reinforcement learning, Transfer learning, Neural network, Load balancing, Routing algorithm