| 5G mobile communication advances the vision of the Internet of Things, aiming to provide more diversified services for more devices. Different kinds of devices generate services at different times and places, and have different requirements for communication, computing, and storage resources. As a result, the total communication and computing workload in wireless networks is increasing, but its spatio-temporal distribution is significantly uneven. Deploying a large number of base stations to achieve seamless resource coverage is very costly. By deploying communication, computing, and storage resources on UAVs, part of the data flow on "users-infrastructure" links can be offloaded to "users-UAVs" links, and part of the computing tasks can be offloaded from cellular networks to the UAVs for execution. Thus, the communication and computation workload of the network can be balanced, and resources can be allocated and utilized efficiently. However, the introduction of UAVs also brings challenges. First, when the distance between a UAV and a ground node is long while the altitude of the UAV is low, the global information of the air-to-ground channel states is unknown, which worsens the average channel pathloss and decreases the service delivery rate. Second, due to platform instability, some UAVs may leave the network because of force majeure, which increases the uncertainty of the network topology and restricts further improvement of users’ average data rate in multi-UAV assisted scenarios. Third, since a UAV’s load capacity is limited, there exist multi-dimensional resource constraints, which make the offloading strategies of different computing tasks interact in a complex and tightly coupled way; the optimal joint allocation of computing and communication resources is therefore difficult to solve, resulting in longer computing 
task completion time. To this end, this thesis studies data and computation offloading in UAV-assisted wireless networks, and provides efficient load allocation, resource allocation, and UAV scheduling schemes. It adopts deep reinforcement learning approaches to exploit the regularities in services, channels, and node movement, and can thereby cope with unknown and dynamic scenarios and with resource constraints. The main contents of this thesis are as follows. For the scenario of large-scale sensors accessing the network, this thesis studies the data offloading scheme in UAV-assisted information collection. In this scenario, a UAV serves as a relay node, and some of the data sent by sensors are offloaded from a heavily loaded cell to a lightly loaded cell. This thesis defines a utility function of UAV service quality that is related to the service first-time delivery rate, formulates the optimization problem of maximizing the UAV utility function, and decomposes it into two subproblems: access point selection and UAV path planning. The access point selection subproblem is solved by a game theory-based algorithm: by designing a potential function, the existence of a Nash equilibrium is proved, and an algorithm is designed to obtain it. The UAV path planning subproblem is solved by an algorithm based on the double deep Q-network (DDQN) that exploits the spatial correlation of the channel, which addresses the high pathloss and the reduced service transmission success rate caused by unknown global channel conditions. Prioritized sampling is adopted in the training process to deal with reward sparsity. Simulation results show that, compared with the access point selection scheme based on a greedy algorithm and the path planning scheme in which the UAV flies directly to the hot regions, the proposed scheme improves the service first-time delivery rate by 13.2% and reduces the average channel pathloss by 3.4 dB. For the scenario of a large 
number of vehicle users generating multimedia content demands, this thesis studies the data offloading scheme in UAV-assisted content distribution. In this scenario, some UAVs proactively cache content files to provide content services for vehicle users. This thesis formulates the optimization problem of maximizing the average data rate of users, and decomposes it into two subproblems: client association and UAV scheduling. Client association is the "vehicle-communication resource" association, and is solved by the Modified Gale-Shapley (MGS) matching algorithm, which improves the stability of the matching process compared with the classic matching algorithm. UAV scheduling continuously updates the positions and cached contents of the UAVs to adapt to changes in ground traffic conditions, and is solved by a content updating algorithm based on the maximum expectation and a path planning algorithm based on the mean-field multi-agent deep deterministic policy gradient (MF-MADDPG). MF-MADDPG exploits the adaptability of different numbers of UAVs to changes in traffic flow and cached content, and can cope with the network topology uncertainty that harms the user data rate. Simulation results show that, compared with the client association scheme based on the classic matching algorithm and the UAV path planning scheme based on classic learning algorithms, the proposed scheme improves the user data rate by 5.3% to 16.0%. For the scenario of continuous generation of complex computing missions in cellular networks, this thesis studies the computation offloading scheme in UAV-assisted edge computing. In this scenario, some UAVs form an edge cloud to help the ground base station execute part of the computing tasks. This thesis considers the complex logical relationships between computing tasks, and formulates the queueing, transmission, and processing models of computing tasks. To decouple the evaluation of computation and communication resource 
allocation, the two dimensions of decision-making are executed by two agents. Based on the idea of counterfactual multi-agent learning, each agent is designed with its own evaluation function, and two multi-actors-critic algorithms (Multi-Actors-Critic, MAC; Off-Policy Multi-Actors-Critic, OP-MAC) are proposed to exploit the interaction between computation and communication resource allocation strategies, which addresses the tight coupling of task offloading strategies and the increased task completion time caused by multi-dimensional resource constraints. Simulation results show that the proposed scheme reduces the mission completion time by 40.8% and 24.2% compared with the computation offloading schemes based on the single-agent learning algorithm and the greedy algorithm, respectively. |
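
For readers unfamiliar with the matching step mentioned for client association, the following is a minimal sketch of the classic Gale-Shapley (deferred acceptance) algorithm, that is, the baseline the abstract compares against and not the thesis's MGS variant, whose modifications are not described here. The function name `gale_shapley`, the vehicle and channel labels, and the preference lists are all hypothetical and chosen only for illustration; in the thesis setting the preferences would presumably be derived from achievable data rates.

```python
from collections import deque


def gale_shapley(proposer_prefs, receiver_prefs):
    """Classic one-to-one deferred acceptance.

    Assumes every receiver ranks every proposer that may propose to it.
    Returns a dict mapping each matched proposer to its receiver.
    """
    # rank[r][p]: position of proposer p in receiver r's list (lower is better)
    rank = {r: {p: i for i, p in enumerate(prefs)}
            for r, prefs in receiver_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}  # next list index to try
    engaged_to = {}                               # receiver -> proposer
    free = deque(proposer_prefs)

    while free:
        p = free.popleft()
        prefs = proposer_prefs[p]
        if next_choice[p] >= len(prefs):
            continue  # p has exhausted its list and stays unmatched
        r = prefs[next_choice[p]]
        next_choice[p] += 1
        current = engaged_to.get(r)
        if current is None:
            engaged_to[r] = p            # r was free: accept tentatively
        elif rank[r][p] < rank[r][current]:
            engaged_to[r] = p            # r prefers p: swap partners
            free.append(current)
        else:
            free.append(p)               # r rejects p: p tries its next choice

    return {p: r for r, p in engaged_to.items()}


# Hypothetical example: three vehicles and three communication resources.
vehicle_prefs = {
    "v1": ["c1", "c2", "c3"],
    "v2": ["c1", "c3", "c2"],
    "v3": ["c2", "c1", "c3"],
}
channel_prefs = {
    "c1": ["v2", "v1", "v3"],
    "c2": ["v1", "v3", "v2"],
    "c3": ["v3", "v2", "v1"],
}
matching = gale_shapley(vehicle_prefs, channel_prefs)
print(matching)
```

The result is stable in the usual sense: no vehicle and resource both prefer each other over their assigned partners. A many-to-one extension with per-resource quotas would be closer to a practical association problem, but that is beyond this sketch.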