Research On Algorithms Of Routing And Resource Allocation Based On Reinforcement Learning In D2D Networks

Posted on: 2020-09-25
Degree: Master
Type: Thesis
Country: China
Candidate: D Y Liu
Full Text: PDF
GTID: 2428330572976381
Subject: Information and Communication Engineering
Abstract/Summary:
With the development of communication technology, device-to-device (D2D) technology has attracted wide attention as a way to meet users' growing traffic demand. However, applying D2D technology aggravates interference within cellular networks and makes it harder for users to meet their quality-of-service (QoS) requirements. Some traditional algorithms derive a network control strategy at each sampling time from "snapshot" information, but they adapt poorly to complex, changeable, and highly dynamic environments. This thesis therefore studies dynamic communication problems in D2D networks and proposes an intelligent solution based on machine learning.

We consider two D2D application scenarios: multi-hop D2D networks and D2D direct communication. To solve the dynamic communication problems in both scenarios, we propose online learning methods based on reinforcement learning (RL). As the complexity of the problem increases, the RL algorithms progress from simple to deep. For the routing problem, we propose a QoS routing algorithm based on traditional value iteration. For the resource allocation problem, we propose two resource allocation algorithms, based on deep Q-learning (DQN) and deep deterministic policy gradient (DDPG) respectively. DQN and DDPG are classical algorithms in deep reinforcement learning (DRL).

In the multi-hop D2D network, we consider three changeable QoS indices in a dynamic environment and propose a value iteration algorithm to solve the routing problem. We also use a distributed architecture, which greatly reduces the cost of learning and searching. The simulation results show that, in a dynamic environment, the proposed algorithm outperforms the traditional algorithm in both QoS performance and time complexity.

In the D2D direct communication scenario, we consider two resource reuse scenarios: single-channel and multi-channel. In these problems, moving users create a dynamic network environment, and an agent that uses a DRL algorithm can achieve self-learning, self-optimization, and intelligent control through exploration and environmental feedback. For the single-channel resource allocation problem, we obtain the D2D power control strategy on a single channel through the DQN and DDPG algorithms. For the multi-channel resource allocation problem, DDPG allows the total D2D transmission power to be allocated unevenly across the channel resources, which improves the total throughput of the cellular network. The simulation results show that both the DQN and DDPG algorithms behave intelligently and outperform traditional algorithms. We also find that the DQN algorithm is prone to "pseudo convergence", so we propose a "sample weighting" optimization method that addresses this problem effectively.
Keywords/Search Tags:D2D communication, resource allocation, distributed reinforcement learning, deep reinforcement learning
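
The abstract does not give the details of the thesis's QoS routing algorithm, so the following is only a generic illustration of how value iteration can drive hop-by-hop routing. It is a minimal sketch under stated assumptions: the four-node topology, the single per-hop cost (standing in for one QoS index such as delay), and the function names `value_iteration` and `greedy_route` are all hypothetical and not taken from the thesis.

```python
from typing import Dict, List

Graph = Dict[str, Dict[str, float]]

# Hypothetical topology: LINKS[u][v] is an assumed per-hop cost
# (for example, delay) of the D2D link from node u to node v.
LINKS: Graph = {
    "A": {"B": 2.0, "C": 5.0},
    "B": {"A": 2.0, "C": 1.0, "D": 6.0},
    "C": {"A": 5.0, "B": 1.0, "D": 2.0},
    "D": {"B": 6.0, "C": 2.0},
}


def value_iteration(links: Graph, dest: str, max_sweeps: int = 1000) -> Dict[str, float]:
    """Estimate the minimum cost-to-go from every node to dest.

    Each sweep applies the Bellman update
        V(n) = min over neighbours m of [cost(n, m) + V(m)]
    in place, and stops once a full sweep changes nothing.
    """
    value = {node: (0.0 if node == dest else float("inf")) for node in links}
    for _ in range(max_sweeps):
        changed = False
        for node, neighbours in links.items():
            if node == dest:
                continue
            best = min(cost + value[nbr] for nbr, cost in neighbours.items())
            if best < value[node]:
                value[node] = best
                changed = True
        if not changed:
            break
    return value


def greedy_route(links: Graph, value: Dict[str, float], src: str, dest: str) -> List[str]:
    """Follow the learned values: at each hop pick the neighbour
    that minimises link cost plus the neighbour's cost-to-go."""
    path = [src]
    while path[-1] != dest:
        node = path[-1]
        next_hop = min(links[node], key=lambda nbr: links[node][nbr] + value[nbr])
        path.append(next_hop)
        if len(path) > len(links):
            raise ValueError("no loop-free route found")
    return path


if __name__ == "__main__":
    values = value_iteration(LINKS, dest="D")
    print(values)
    print(greedy_route(LINKS, values, src="A", dest="D"))
```

In a distributed setting, each node would need only its own value and those of its neighbours to pick a next hop, which is one way to read the reduced learning and searching cost mentioned in the abstract. How the thesis combines its three QoS indices into a single reward is not stated here and is left out of the sketch.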