
Research on Deep Reinforcement Learning Based Multi-UAV Trajectory Optimization

Posted on: 2023-05-18
Degree: Master
Type: Thesis
Country: China
Candidate: S L Wu
Full Text: PDF
GTID: 2532306914471864
Subject: Information and Communication Engineering
Abstract/Summary:
In the future, mobile communication networks will face diverse challenges, and emergency communication is one of the typical scenarios of sixth-generation (6G) mobile communication technology. Thanks to its flexible deployment and its resilience to terrestrial disasters, unmanned aerial vehicle (UAV) communication has become a common technique for emergency communication. Equipped with a lightweight base station (BS), a UAV can be deployed in a disaster area and control its position or trajectory to provide high-quality air-to-ground emergency communication service for users. However, it is extremely challenging to optimize the trajectories of multiple UAV-BSs in dynamic and unknown emergency communication networks. To tackle this issue, deep reinforcement learning (DRL) is applied to jointly control the trajectories of multiple UAV-BSs, so as to meet communication coverage requirements and improve the spectrum efficiency of the networks. The main contributions are as follows.

First, a single-agent DRL-based centralized trajectory optimization method is proposed for small-scale multi-UAV emergency communication networks. Specifically, the user, network, and evaluation models of small-scale multi-UAV emergency communication networks are analyzed in detail. Then, the centralized trajectory optimization problem for multiple UAV-BSs is formulated, and a Bayesian soft actor-critic (BSAC) based centralized trajectory optimization method is proposed, in which an aerial trajectory optimization center is deployed to improve network performance. The simulation results show that the proposed BSAC-based method can effectively reduce communication interruption and improve spectrum efficiency in small-scale multi-UAV emergency communication networks.

Second, considering the high communication delay and overhead and the limited scalability of the centralized trajectory optimization method, a multi-agent DRL-based distributed trajectory optimization method is further proposed for large-scale multi-UAV emergency communication networks. Specifically, the system model of large-scale multi-UAV emergency communication networks is analyzed and the distributed trajectory optimization problem is formulated. Then, in view of the non-stationarity problem caused by the distributed implementation of single-agent DRL, a multi-agent soft actor-critic (MASAC) based distributed trajectory optimization method is proposed with a distributed-training-distributed-execution design: all UAV-BSs perform distributed k-sums clustering of users and use multi-agent DRL for distributed trajectory control. Further, by integrating federated learning, ensemble learning, and curriculum learning techniques, training enhancement methods for multi-agent DRL are proposed, which improve the convergence rate and stability. The simulation results show that the proposed MASAC-based distributed trajectory optimization method can effectively reduce communication interruption and improve spectrum efficiency in large-scale multi-UAV emergency communication networks, and that it outperforms existing DRL-based methods.
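The two evaluation quantities named in the abstract, communication interruption and spectrum efficiency, can be illustrated with a minimal sketch. The abstract does not give the thesis's channel or evaluation model, so everything below is an assumption made for illustration: free-space path loss, equal transmit power, full frequency reuse among UAV-BSs, strongest-signal user association, and an SINR threshold for interruption. The function name `evaluate_network` and all parameter names and default values are hypothetical.

```python
import math

def evaluate_network(uavs, users, tx_power_w=1.0, noise_w=1e-9,
                     carrier_hz=2e9, outage_sinr_db=0.0):
    """Return (mean spectral efficiency in bit/s/Hz, interrupted-user fraction).

    Illustrative assumptions, not taken from the thesis: free-space path
    loss, equal transmit power, full frequency reuse, and each user served
    by the UAV-BS with the strongest received signal.
    uavs:  list of (x, y, altitude) in metres, altitude > 0
    users: list of (x, y) in metres, on the ground
    """
    speed_of_light = 3e8
    threshold = 10 ** (outage_sinr_db / 10)  # dB -> linear
    rates, interrupted = [], 0
    for ux, uy in users:
        received = []
        for x, y, h in uavs:
            d = math.sqrt((x - ux) ** 2 + (y - uy) ** 2 + h ** 2)
            # Free-space (Friis) power gain with isotropic antennas.
            gain = (speed_of_light / (4 * math.pi * carrier_hz * d)) ** 2
            received.append(tx_power_w * gain)
        signal = max(received)                 # serving UAV-BS
        interference = sum(received) - signal  # all other UAV-BSs
        sinr = signal / (interference + noise_w)
        rates.append(math.log2(1 + sinr))      # Shannon spectral efficiency
        interrupted += sinr < threshold
    return sum(rates) / len(rates), interrupted / len(users)
```

Under these assumptions, a reward for a DRL agent could combine the two returned values, for example by rewarding spectral efficiency and penalizing the interrupted fraction; the actual reward design used in the thesis is not stated in the abstract.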
Keywords/Search Tags: emergency communication networks, unmanned aerial vehicle, trajectory optimization, deep reinforcement learning, multi-agent