
Research On Radio Resource Scheduling Algorithm For Mobile Communications Based On Deep Reinforcement Learning And User Location Information

Posted on: 2023-09-20
Degree: Master
Type: Thesis
Country: China
Candidate: X N Li
Full Text: PDF
GTID: 2558306908965279
Subject: Communication and Information System
Abstract/Summary:
As mobile communication networks grow larger, interference becomes more complex, and services become more diversified, the contradiction between the shortage of spectrum resources and the increasing demands of users becomes ever more serious. To effectively improve spectrum resource utilization and user service quality in such complex communication environments, it is essential to study more efficient resource scheduling algorithms. This thesis therefore improves the performance of resource scheduling algorithms by further mining the intrinsic knowledge of mobile communication systems and combining it with deep reinforcement learning. The specific work is as follows.

First, this thesis introduces the technical foundations of deep reinforcement learning and resource scheduling, and proposes a novel resource scheduling algorithm called user-location-based proximal policy optimization (PPO) with reward shaping (UL-PPORS). On the one hand, a user location network is designed to help the agent obtain user location information without additional signaling overhead, thus providing more valuable state information to the agent. On the other hand, the reward function is shaped separately for high, medium, and low traffic intensities, ensuring that the agent obtains higher spectral efficiency under each traffic intensity.

Subsequently, the algorithm design is verified by simulation. Through a comprehensive analysis of the cumulative-reward convergence curve, the system-throughput convergence curve, the system resource-occupancy curve, and the system interference curve, the effectiveness of the reward-shaping design is verified. Then, based on this reward-shaping function, the UL-PPORS algorithm and a PPO resource allocation algorithm without user location information (the PPO algorithm) are simulated and compared under different traffic intensities. The results show that UL-PPORS can effectively mitigate system interference and achieve higher spectral efficiency: compared with the PPO algorithm, the spectral efficiency of UL-PPORS improves by up to 10.2%.

Finally, building on UL-PPORS, this thesis proposes a novel resource scheduling algorithm called graph convolutional network based PPO with reward shaping (GCN-PPORS) to further improve scheduling performance. The algorithm uses the user location information obtained from the user location network to model an irregular mobile communication scenario as a graph structure that reflects the interference relationships between users. Meanwhile, a policy network built on a graph convolutional network (GCN) is designed, exploiting the GCN's ability to extract spatial features from graph data to further optimize the resource scheduling policy. The GCN-PPORS and UL-PPORS algorithms are then compared under the same simulation parameter settings. The results show that GCN-PPORS obtains a resource allocation strategy with lower interference and higher system throughput, and that its performance remains excellent as the scale of the communication network increases: compared with UL-PPORS, the spectral efficiency of GCN-PPORS improves by up to 17.8%.
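The traffic-dependent reward shaping described above can be sketched as a function that weights spectral efficiency against an interference penalty differently per load regime. This is only a minimal illustration: the thresholds, weights, and the exact form of the base reward are hypothetical assumptions, not the thesis's actual shaping functions.

```python
# Illustrative traffic-aware reward shaping for a resource scheduling agent.
# Assumed (hypothetical) base reward: spectral efficiency minus a weighted
# interference penalty; the regime thresholds 0.3/0.7 are arbitrary examples.

def shaped_reward(spectral_efficiency: float,
                  interference_level: float,
                  traffic_intensity: float) -> float:
    """Shape the reward differently for low, medium, and high traffic.

    traffic_intensity is assumed normalized to [0, 1].
    """
    base = spectral_efficiency
    if traffic_intensity < 0.3:
        # Low load: resources are plentiful, so penalize interference heavily.
        return base - 0.5 * interference_level
    elif traffic_intensity < 0.7:
        # Medium load: balance throughput against interference.
        return base - 0.2 * interference_level
    else:
        # High load: emphasize throughput, tolerate more interference.
        return 1.5 * base - 0.1 * interference_level
```

In a PPO training loop, this shaped scalar would simply replace the raw reward returned by the environment at each scheduling step.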
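The graph modeling step in GCN-PPORS can be sketched as follows: user positions are turned into an interference graph (users within some radius are assumed to interfere), and a GCN layer propagates features over that graph. The distance-threshold adjacency and the Kipf-Welling-style propagation shown here are standard simplifications, assumed for illustration rather than taken from the thesis.

```python
import numpy as np

def interference_graph(positions: np.ndarray, radius: float) -> np.ndarray:
    """Build an adjacency matrix where users closer than `radius` interfere.

    positions: (N, 2) array of user coordinates.
    Returns an (N, N) 0/1 adjacency matrix with zero diagonal.
    """
    # Pairwise Euclidean distances between all users.
    dists = np.linalg.norm(positions[:, None, :] - positions[None, :, :], axis=-1)
    adj = (dists < radius).astype(float)
    np.fill_diagonal(adj, 0.0)  # no self-interference edges
    return adj

def gcn_layer(adj: np.ndarray, features: np.ndarray, weight: np.ndarray) -> np.ndarray:
    """One GCN propagation: ReLU(D^{-1/2} (A + I) D^{-1/2} H W)."""
    a_hat = adj + np.eye(adj.shape[0])               # add self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(a_hat.sum(axis=1)))
    return np.maximum(0.0, d_inv_sqrt @ a_hat @ d_inv_sqrt @ features @ weight)
```

A policy network would stack such layers on per-user features (e.g. position, channel state) so that each user's embedding aggregates information from its interfering neighbors before action logits are produced.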
Keywords/Search Tags:Resource Scheduling, Deep Reinforcement Learning, User Location Information, Proximal Policy Optimization, Reward Shaping, Graph Convolutional Network