
Deep Reinforcement Learning-based Optimal Transmission Policies For Opportunistic UAVs-aided Wireless Sensor Network

Posted on: 2022-11-14
Degree: Master
Type: Thesis
Country: China
Candidate: Y T Liu
Full Text: PDF
GTID: 2518306758492394
Subject: Automation Technology
Abstract/Summary:
The rapid development of next-generation wireless networks sets an unprecedented standard for high-quality services. With their maneuverability and high probability of line-of-sight (LoS) transmission, unmanned aerial vehicles (UAVs) are envisioned to play a key role in future wireless communication systems. Air-ground integrated wireless networks with UAVs have two main research directions, namely UAV-assisted terrestrial communications and cellular-connected UAV communications. Combining the advantages of both directions is a necessary and meaningful problem in its own right, independent of either direction alone. When UAVs perform their specifically assigned tasks in the air, some of them still have idle resources with which they can access ground communication networks, especially wireless sensor networks, and improve those networks' communication performance. In other words, while executing their own missions along predetermined trajectories, UAVs can simultaneously provide opportunistic assistance to terrestrial networks.

In this thesis, we address an opportunistic UAV-assisted data transmission problem in a wireless sensor network from this perspective. Taking into account the dynamic behavior of the UAVs, the time-varying transmission tasks, and the real-time matching between UAVs and sensor clusters, we jointly optimize UAV scheduling and power control to obtain policies that maximize the long-run network data transmission under the opportunistic access mode. We develop a deep Q-network (DQN) based approach and a deep deterministic policy gradient (DDPG) based approach that adjust the transmit power of the cluster heads as well as the scheduling and bandwidth allocation of the UAVs during their missions over the covered area, so as to improve the data transmission performance of the whole network. Simulation results demonstrate the validity and superiority of the proposed approaches compared with several benchmark policies from different perspectives.
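To make the decision structure concrete, the following is a minimal, hypothetical sketch (not the thesis's actual implementation) of a DQN-style module: a small Q-network maps an observed network state (per-cluster task backlog, harvested energy, battery level and channel gain, all assumed here for illustration) to a flat discrete action that jointly encodes a UAV-to-cluster assignment and a cluster-head power level. The dimensions, state layout and action encoding are illustrative assumptions.

```python
# Hypothetical DQN-style decision sketch for joint UAV scheduling / power control.
# Dimensions, state layout and action encoding are illustrative assumptions only.
import random
import torch
import torch.nn as nn

N_UAVS, N_CLUSTERS, N_POWER_LEVELS = 3, 4, 5
# State: per-cluster task backlog, harvested energy, battery level, channel gain.
STATE_DIM = 4 * N_CLUSTERS
# One discrete action = (UAV-cluster assignment, cluster-head power level).
N_ACTIONS = N_UAVS * N_CLUSTERS * N_POWER_LEVELS


class QNetwork(nn.Module):
    """Maps the observed network state to Q-values over all joint actions."""

    def __init__(self) -> None:
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, N_ACTIONS),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)


def select_action(q_net: QNetwork, state: torch.Tensor, epsilon: float) -> int:
    """Epsilon-greedy choice over the joint scheduling/power action space."""
    if random.random() < epsilon:
        return random.randrange(N_ACTIONS)
    with torch.no_grad():
        return int(q_net(state).argmax().item())


def decode_action(action: int) -> tuple[int, int, int]:
    """Recover (uav_id, cluster_id, power_level) from the flat action index."""
    uav_id, rest = divmod(action, N_CLUSTERS * N_POWER_LEVELS)
    cluster_id, power_level = divmod(rest, N_POWER_LEVELS)
    return uav_id, cluster_id, power_level


if __name__ == "__main__":
    q_net = QNetwork()
    dummy_state = torch.rand(STATE_DIM)  # placeholder observation
    a = select_action(q_net, dummy_state, epsilon=0.1)
    print("joint action:", decode_action(a))
```

In practice such a module would be trained with experience replay and a target network, as is standard for DQN; the sketch only shows how a joint scheduling-and-power decision can be encoded as a single discrete action.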
The contributions of this thesis are summarized as follows:

1. We propose a novel model for an opportunistic UAV-aided wireless sensor network consisting of sensor clusters with energy harvesting. Based on this model, we improve the data transmission performance of the terrestrial network without deploying additional communication infrastructure or extra UAVs, efficiently exploiting the idle resources of hovering UAVs and their LoS communication channels. The proposed communication model reduces both transmission interference and energy consumption. We also propose a K-means based algorithm for clustering the randomly distributed wireless sensors: all sensor nodes in the covered area are grouped into transmission clusters whose heads handle the communication between the clusters and the UAVs for better transmission performance (a minimal clustering sketch is given after this list).

2. The resulting optimization problem is a mixed-integer non-linear program (MINLP) and cannot be solved directly. We therefore reformulate it as a discrete-time Markov decision process (MDP) based on real-time information about the amount of transmission tasks, the harvested energy, the battery level and the channel conditions, and we apply deep reinforcement learning to the reformulated problem. The agent in our DRL approach learns an optimized scheme covering UAV scheduling, bandwidth allocation and cluster-head transmit power; this flexible scheduling and proper power control effectively enhance the network performance. Note that we focus on UAV opportunistic access rather than flight route planning, which is an independent problem.

3. Using DRL, we develop a DQN based algorithm and a DDPG based algorithm to maximize the data transmission, and we compare them with four baseline policies to demonstrate the superiority of the proposed approaches from different perspectives. The DQN based algorithm is a standard DRL method well suited to optimization over discrete action spaces, whereas the DDPG based algorithm takes actions from a continuous space, which significantly improves the system performance and is easier to implement in a practical environment. Simulation results show that both the DQN based and DDPG based algorithms are valid and effective for the proposed network.
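As referenced in contribution 1, the following is a minimal sketch of K-means based sensor clustering with a simple cluster-head selection rule. The node positions, the number of clusters, and the "nearest to centroid" head-selection heuristic are assumptions for illustration, not the thesis's exact algorithm.

```python
# Hypothetical K-means clustering of sensor nodes with cluster-head selection.
# Positions, cluster count and the head-selection rule are illustrative assumptions.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
sensor_xy = rng.uniform(0.0, 1000.0, size=(200, 2))  # 200 sensors in a 1 km x 1 km area
K = 4                                                 # assumed number of clusters

kmeans = KMeans(n_clusters=K, n_init=10, random_state=0).fit(sensor_xy)
labels, centroids = kmeans.labels_, kmeans.cluster_centers_

# Pick as cluster head the sensor nearest to its cluster centroid (one common heuristic).
cluster_heads = []
for k in range(K):
    members = np.flatnonzero(labels == k)
    dists = np.linalg.norm(sensor_xy[members] - centroids[k], axis=1)
    cluster_heads.append(int(members[np.argmin(dists)]))

print("cluster sizes:", np.bincount(labels))
print("cluster-head indices:", cluster_heads)
```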
Keywords/Search Tags: Opportunistic UAV transmission, Wireless sensor network, Resource allocation, Deep reinforcement learning (DRL), Energy harvesting, Scheduling