| The air-ground integrated heterogeneous network (AGIHN) is an emerging heterogeneous information network that builds on ground networks and integrates the unique advantages of aerial networks, and it can effectively meet the demand of massive terminals for ubiquitous access and differentiated services. However, frequent topology updates and dynamic node changes in the AGIHN pose severe challenges for the joint optimization of the network’s multidimensional resources (e.g., communication and computing resources). How to manage resources effectively, optimize network performance intelligently, and fully improve resource efficiency is a critical issue that urgently needs to be addressed. Meanwhile, with the widespread application of artificial intelligence in wireless communication, deep reinforcement learning algorithms have shown great potential in decision scheduling, providing new ideas for addressing these issues. Against this background, this paper studies task offloading and resource optimization in air-ground integrated heterogeneous networks. The main work is as follows. Firstly, to address the joint optimization of multidimensional network resources, an air-ground integrated heterogeneous network model is designed in which multiple unmanned aerial vehicles (UAVs) and ground base stations (GBSs) provide collaborative edge computing services for user equipment. A joint offloading decision and resource allocation optimization problem is then formulated with the goal of minimizing system energy consumption. To solve this mixed-integer nonlinear programming problem efficiently, a deep Actor-Critic based online offloading for AGIHN (DACO2A) algorithm is proposed, which trains the deep neural network through interaction between the intelligent agent and the air-ground integrated heterogeneous network
environment, and finally obtains dynamically optimized task offloading decisions for user equipment. Then, the multidimensional resource allocation problem is reformulated as a difference of convex (DC) programming problem, and the convex-concave procedure (CCP) is used to obtain the power control and computing resource allocation strategies. Finally, extensive simulation experiments are carried out to evaluate the performance of the proposed method. The results show that, compared with the baseline algorithms, the proposed approach reduces user equipment energy consumption by 7.26%, 17.23%, and 23.14%, respectively. Secondly, building on the above work, the focus turns to the collaborative optimization of aerial platform trajectories and network resources. By deploying High Altitude Platform Stations (HAPSs) and UAVs, communication access and edge computing services are provided for IoT terminals in remote areas beyond the coverage of existing ground networks. In particular, UAV trajectory planning is considered to take full advantage of flexible aerial network coverage and on-demand deployment. Specifically, a joint optimization problem of UAV trajectory, device offloading decision, and edge computing resource allocation is constructed to minimize the system cost of the air-ground integrated network. Due to the dynamic nature of the network environment and the unpredictability of system state information, existing optimization methods cannot be applied directly. Therefore, the optimization problem is modeled as a Markov decision process, and a multi-agent deep deterministic policy gradient (MADDPG)-based UAV trajectory and network resource optimization algorithm is proposed. Its centralized training and distributed execution architecture effectively promotes collaboration among the intelligent agents. At the same
time, fully trained UAVs and terminals can make trajectory planning and resource allocation decisions based only on local observations. Simulation results show that the proposed method achieves a reasonable trade-off between network energy consumption and task execution delay, and significantly reduces system cost through the joint optimization of UAV trajectory and network resources. The proposed approach outperforms the baseline algorithms by 6.98%, 18.37%, and 32.78%, respectively. |
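
The convex-concave procedure mentioned in the abstract can be illustrated on a toy problem. The sketch below is not the paper's resource allocation model: the one-dimensional objective, the function names, and the starting point are all assumptions chosen so that each convex subproblem has a closed-form solution. It minimizes f(x) - g(x) with f(x) = x^4 and g(x) = 2x^2 (both convex) by repeatedly replacing g with its tangent at the current iterate and minimizing the resulting convex surrogate.

```python
import math


def objective(x):
    """Toy difference-of-convex objective: f(x) - g(x) = x**4 - 2*x**2."""
    return x**4 - 2.0 * x**2


def ccp_step(x_k):
    """One CCP iteration for the toy problem (hypothetical helper).

    g(x) = 2x^2 is linearized at x_k: g(x_k) + 4*x_k*(x - x_k).
    The convex surrogate x^4 - 4*x_k*x + const is minimized where
    4x^3 = 4*x_k, i.e. at the real cube root of x_k.
    """
    return math.copysign(abs(x_k) ** (1.0 / 3.0), x_k)


def ccp(x0, tol=1e-10, max_iter=200):
    """Run CCP from x0; return the final iterate and the objective history."""
    x = x0
    history = [objective(x)]
    for _ in range(max_iter):
        x_next = ccp_step(x)
        history.append(objective(x_next))
        converged = abs(x_next - x) < tol
        x = x_next
        if converged:
            break
    return x, history


# Assumed starting point; any positive x0 leads to the stationary point x = 1.
x_star, history = ccp(0.5)
```

Each iteration solves a convex problem and never increases the objective, which is the property that makes the procedure attractive for DC programs. In a realistic power control and computing resource allocation setting the subproblem would generally need a numerical convex solver rather than a closed form.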