
Research On UAV-Aided Communication Based On Reinforcement Learning

Posted on: 2021-09-10
Degree: Master
Type: Thesis
Country: China
Candidate: S Zhao
Full Text: PDF
GTID: 2492306338985559
Subject: Information and Communication Engineering
Abstract/Summary:
Owing to their cost-effectiveness, low power consumption, and high maneuverability, unmanned aerial vehicles (UAVs) are commonly employed as communication platforms. Compared with conventional static communication infrastructure, they significantly enhance the performance, reliability, and generalization of wireless systems, and could be used extensively in next-generation wireless technology. However, in many practical settings some system information is inaccessible unless specific communication resources are devoted to obtaining it, which makes optimization problems such as UAV trajectory design, user scheduling, and power allocation very difficult to tackle with conventional methods. Fortunately, model-free reinforcement learning can make sound decisions without relying on the observability of system information or the convexity of the objective function. The flexibility and generalization ability of reinforcement learning algorithms therefore make them better suited to practical situations. Motivated by this, we consider a multi-UAV ground system in which UAVs serve as mobile base stations for users. Without full information about the users, UAV trajectories, user scheduling, and transmission power allocation are jointly optimized to maximize the downlink sum rate. The problem is formulated as a decentralized partially observable Markov decision process (Dec-POMDP) and solved with a reinforcement learning algorithm. Simulation results demonstrate the feasibility of multi-agent reinforcement learning for such decentralized cooperative tasks. To address user fairness, we also propose a reinforcement learning algorithm with offline updates and a combined reward function. Simulation results demonstrate that the algorithm outperforms the conventional optimization algorithm and converges well on sequential decision-making problems that aim to maximize an objective function that is linearly decomposable over the time horizon.
Keywords/Search Tags: UAV-aided communications, joint optimization, Dec-POMDP, multi-agent, reinforcement learning
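
As an illustration of the downlink sum-rate objective mentioned in the abstract, the following is a minimal sketch and not the thesis's actual system model. It assumes a line-of-sight channel whose gain decays with squared distance, one scheduled user per UAV per time slot, and co-channel interference between UAVs treated as noise. The names `channel_gain`, `downlink_sum_rate`, `ref_gain`, and `noise`, along with their default values, are hypothetical and chosen only for this example.

```python
import math


def channel_gain(uav, user, ref_gain=1e-3):
    """Assumed line-of-sight gain: ref_gain divided by squared 3-D distance."""
    d2 = sum((a - b) ** 2 for a, b in zip(uav, user))
    return ref_gain / d2


def downlink_sum_rate(uavs, users, schedule, powers, noise=1e-9, ref_gain=1e-3):
    """Sum of per-UAV downlink rates (bits/s/Hz) for a single time slot.

    uavs, users: lists of (x, y, z) positions in metres.
    schedule[m]: index of the user served by UAV m, or None if UAV m is idle.
    powers[m]:   transmit power of UAV m in watts.
    """
    total = 0.0
    for m, k in enumerate(schedule):
        if k is None:
            continue  # idle UAVs contribute no rate
        signal = powers[m] * channel_gain(uavs[m], users[k], ref_gain)
        # Co-channel interference at user k from every other transmitting UAV.
        interference = sum(
            powers[j] * channel_gain(uavs[j], users[k], ref_gain)
            for j in range(len(uavs))
            if j != m and schedule[j] is not None
        )
        total += math.log2(1.0 + signal / (noise + interference))
    return total
```

For example, `downlink_sum_rate([(0, 0, 10)], [(0, 0, 0)], [0], [1.0], noise=1e-5)` evaluates to approximately 1 bit/s/Hz, because the received signal power equals the noise power under these assumed values. In a reinforcement learning setting, a per-slot quantity of this kind could serve as the reward signal, with the UAV positions, schedule, and powers determined by the agents' actions.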