As part of a mobile edge computing system, a UAV communication network can provide sustained service for compute-intensive tasks on mobile devices. However, given the system complexity and resource limitations of UAV clusters, it remains challenging for such networks to provide stable and reliable service where mobile devices are densely distributed. For example, the limited access range of mobile devices and the scale of the device clusters constrain the rate at which the UAV communication network can access mobile devices. In addition, accessing mobile devices and transmitting computing tasks adds latency and consumes energy, which degrades quality of service. In response to these challenges, this paper uses deep reinforcement learning to study the optimization of distributed computation offloading for UAV clusters in scenarios with many mobile devices, at two levels. The main research content and results are as follows:

(1) Flight trajectory control algorithm for UAV clusters in mobile edge computing. To address the flight path control problem of distributed UAV clusters in mobile edge computing scenarios, this paper proposes a two-stage scheme based on clustering and deep reinforcement learning. In the first stage, a clustering algorithm computes the set of centroids of the mobile device clusters, which determines the formation of the UAV cluster. In the second stage, a cooperative flight strategy for the UAV communication network is developed by applying a consensus active mechanism within a deep reinforcement learning algorithm. Simulation experiments verify the superiority of this scheme.

(2) Computation offloading optimization algorithm for UAV clusters in mobile edge computing. To address the low quality of service of UAV clusters in mobile edge computing scenarios, this paper introduces computing power, bandwidth, and storage resources on top of the flight path control algorithm described above. First, based on the delay and energy consumption of the mobile edge computing system, the optimization objective is defined as minimizing the system cost, a weighted combination of average delay and energy consumption. Then, an improved double deep Q-network algorithm with prioritized experience replay is proposed, which reduces the dimension of the state space through data preprocessing. Experiments verify that this scheme achieves the objective of minimizing the average cost.
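
The first stage of the trajectory scheme (clustering device positions to obtain centroids) can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm, so plain k-means (Lloyd's algorithm) on 2-D coordinates is assumed here; the function name `kmeans`, the parameters, and the sample coordinates are all hypothetical.

```python
import math
import random

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means on 2-D points; returns (centroids, labels).

    Assumption: k-means stands in for the unnamed clustering algorithm.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k distinct devices
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each device joins its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                cx = sum(x for x, _ in members) / len(members)
                cy = sum(y for _, y in members) / len(members)
                new_centroids.append((cx, cy))
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, labels

# Hypothetical device positions forming two dense groups.
devices = [(0, 0), (0, 1), (1, 0), (1, 1),
           (10, 10), (10, 11), (11, 10), (11, 11)]
centroids, labels = kmeans(devices, k=2)
```

Under this assumption, each centroid would serve as a candidate hover position for one UAV, so the number of clusters `k` corresponds to the number of UAVs in the formation.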
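
The ingredients of the offloading optimization can likewise be sketched under stated assumptions. The abstract gives neither the weights nor the details of the improved algorithm, so the code below shows only the standard building blocks: a weighted delay and energy cost, the double DQN target (the online network selects the next action, the target network evaluates it), and proportional prioritized sampling. All names, the weight `w`, and the hyperparameters `gamma`, `alpha`, and `eps` are hypothetical, and the paper's data preprocessing step is not reproduced.

```python
import random

def system_cost(delay, energy, w=0.5):
    """Weighted cost w*delay + (1-w)*energy; w is an assumed trade-off weight."""
    return w * delay + (1.0 - w) * energy

def double_dqn_target(reward, next_q_online, next_q_target, gamma=0.9, done=False):
    """Double DQN target: online values pick the action, target values score it."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]

def priority_probs(td_errors, alpha=0.6, eps=1e-6):
    """Proportional prioritization: P(i) is proportional to (|td_i| + eps)**alpha."""
    scaled = [(abs(d) + eps) ** alpha for d in td_errors]
    total = sum(scaled)
    return [s / total for s in scaled]

def sample_indices(td_errors, batch_size, rng=random):
    """Draw replay indices, favouring transitions with large TD error."""
    weights = priority_probs(td_errors)
    return rng.choices(range(len(td_errors)), weights=weights, k=batch_size)

# Hypothetical numbers: the reward is the negative system cost.
reward = -system_cost(delay=2.0, energy=4.0, w=0.5)
target = double_dqn_target(
    reward,
    next_q_online=[1.0, 3.0, 2.0],   # online network prefers action 1
    next_q_target=[0.5, 1.0, 4.0],   # target network's value for action 1 is used
)
probs = priority_probs([0.1, 2.0, 0.5])
batch = sample_indices([0.1, 2.0, 0.5], batch_size=4, rng=random.Random(0))
```

Using the negative cost as the reward means that maximizing return corresponds to minimizing the weighted delay and energy cost. A full prioritized replay implementation would also apply importance-sampling weights to correct the sampling bias, which this sketch omits.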