Mobile Edge Computing (MEC) effectively alleviates the pressure that computation-intensive tasks and low-latency requirements place on Internet of Things (IoT) devices. In infrastructure-constrained scenarios, however, immovable base stations limit the flexibility of MEC networks. Unmanned aerial vehicles (UAVs) can be equipped with MEC servers, compensating for the limitations of fixed-base-station MEC networks. Moreover, UAVs are recognized for their ease of deployment, low cost, and high mobility. They provide users with edge computing services in computation-intensive scenarios, thereby reducing user energy consumption and improving Quality of Service (QoS).

In recent years, dynamic offloading and resource allocation in UAV-assisted MEC networks have attracted the attention of many researchers. However, given the random arrival of tasks and the dynamic changes in the channel environment, achieving a dynamic offloading and resource allocation scheme in multi-UAV-assisted scenarios remains a great challenge. Traditional optimization schemes rely on accurate network information, such as task-arrival and channel-condition statistics, and therefore struggle with these challenges. In contrast, model-free Deep Reinforcement Learning (DRL) can effectively handle perception- and decision-based problems in complex networks. Under the DRL framework, agents interact with the dynamic MEC environment in a "trial and error" manner, aiming to maximize the cumulative reward defined by the reward design without capturing precise environmental dynamics. Based on this, this paper applies DRL to conduct in-depth research on dynamic offloading and resource allocation in multi-UAV-assisted edge computing networks. The research work of this paper is summarized as follows:

First, for multi-user MEC scenarios assisted by multiple UAVs, aiming to minimize the long-term average weighted user energy consumption, a joint optimization scheme for UAV trajectories, user association, and subchannel allocation is proposed to alleviate high user energy consumption and high latency. Initially, the network model is constructed for multi-UAV-assisted multi-user MEC scenarios. Then, combined with the constraint conditions, the average weighted user energy consumption minimization problem is established. Next, this optimization problem is modeled as a Markov Decision Process (MDP), and a hybrid-decision DRL-based algorithm, named the HDRT algorithm, is proposed to handle the dynamic hybrid decision problems in these scenarios. Finally, simulation results show that the proposed algorithm converges better than algorithms such as DDPG and DQN, and performs better in reducing user energy consumption and latency. The results also indicate that multi-UAV-assisted MEC networks outperform single-UAV-assisted MEC networks.

Second, for multi-UAV-assisted air-ground collaborative MEC scenarios, aiming to minimize the long-term average latency, a joint optimization scheme for user association, subchannel allocation, and MEC-server computation resource allocation is proposed to alleviate high user latency. Initially, the network model is constructed for the air-ground collaborative multi-user MEC scenario. Then, combined with the constraint conditions, the long-term average latency minimization problem is formulated. Next, a hybrid-decision DRL-based algorithm, named the HDCR algorithm, is proposed to obtain the optimal user association and resource allocation decisions. Finally, simulation results show that the proposed algorithm reduces average latency more effectively than algorithms such as DQN. In addition, the cooperative scheme controlled by the proposed algorithm achieves lower latency than the non-cooperative schemes.
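The hybrid-decision structure shared by the HDRT and HDCR algorithms pairs a discrete action head (e.g. user association and subchannel selection) with a continuous action head (e.g. UAV trajectory or computation-resource allocation). A minimal sketch of such a hybrid action selection is given below; the state encoding, the linear score functions standing in for deep networks, the epsilon-greedy rule, and all dimensions are illustrative assumptions, not the thesis's actual design.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes (illustrative only): 2 UAVs, 4 subchannels,
# and a 6-dimensional state, e.g. queue lengths plus channel gains.
N_UAVS, N_SUBCH, STATE_DIM = 2, 4, 6


class HybridPolicy:
    """Toy hybrid-action policy: a discrete head scores the
    (UAV, subchannel) association choices, while a continuous head
    outputs a bounded 2-D trajectory command. Linear maps stand in
    for the trained deep networks of an HDRT-style agent."""

    def __init__(self, state_dim, n_discrete, cont_dim, v_max=1.0):
        self.Wd = rng.normal(scale=0.1, size=(n_discrete, state_dim))
        self.Wc = rng.normal(scale=0.1, size=(cont_dim, state_dim))
        self.v_max = v_max  # bound on the continuous (velocity) action

    def act(self, state, eps=0.1):
        q = self.Wd @ state                    # scores for discrete choices
        if rng.random() < eps:                 # epsilon-greedy exploration
            disc = int(rng.integers(len(q)))
        else:
            disc = int(np.argmax(q))           # greedy discrete decision
        cont = self.v_max * np.tanh(self.Wc @ state)  # bounded continuous action
        return disc, cont


policy = HybridPolicy(STATE_DIM, n_discrete=N_UAVS * N_SUBCH, cont_dim=2)
state = rng.normal(size=STATE_DIM)
disc, cont = policy.act(state, eps=0.0)
print(disc, cont)
```

In a full DRL loop, the environment would return the next state and a reward (e.g. negative weighted energy consumption or negative latency, matching the objectives above), and both heads would be updated from replayed transitions; the sketch only shows how a single hybrid action is formed from one state.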