
Research On UAV Path Planning Strategy Based On Deep Reinforcement Learning For Data Collection

Posted on: 2022-09-30
Degree: Master
Type: Thesis
Country: China
Candidate: F Zhang
Full Text: PDF
GTID: 2518306497471304
Subject: Information and Communication Engineering
Abstract/Summary:
In recent years, unmanned aerial vehicle (UAV) technology has developed rapidly and is used across a wide range of industries. Combining UAVs with wireless sensor networks can effectively enhance the communication, sensing and energy efficiency of such networks. Using UAVs to assist wireless sensor network data collection avoids long-range, long-cycle data transmission from sensor nodes, reduces the communication losses of the nodes and extends the lifetime of the network. Path planning is a central issue in the practical application of UAVs. As a mobile intelligent agent, a UAV has a limited amount of energy available per flight, and the quality of the flight path affects the time and energy efficiency with which the UAV completes its tasks. In this paper, we apply Deep Reinforcement Learning (DRL) methods to study the UAV path planning strategy, with the goal of reducing the energy consumption of the UAV and improving its data collection efficiency, in the context of pedestrian-intensive sightseeing-agriculture regions.

In wireless sensor networks for sightseeing agriculture, the data sources of the sensor nodes mainly include environmental data collected on a regular cycle and cached data generated by pedestrians' dynamic random access to the nodes. This paper introduces the social force model to obtain the variation of pedestrian flow and the spatial distribution of pedestrians in an area, which supports the analysis of the dynamic data characteristics of the sensor nodes. On this basis, a reliable data model for wireless sensor networks that incorporates pedestrian flow parameters is built to support the UAV path planning strategy.

Discrete-space modelling of the UAV data collection problem is carried out based on a Semi-Markov Process. Combining the Semi-Markov-Option hierarchical reinforcement learning method with the Rainbow deep reinforcement learning algorithm, the SMO-Rainbow (Semi-Markov-Option-Rainbow) UAV path planning strategy is proposed. The reward function is constructed from the perspective of energy efficiency and data collection efficiency for the set of option actions and the specific data collection problem. During training of the deep reinforcement learning model, the steady improvement in model performance demonstrates the effectiveness of the constructed reward function.

ε-greedy is a common exploration strategy in deep reinforcement learning, but it does not effectively balance exploration and exploitation during the training phase of the model. In this paper, we propose an adaptive exploration strategy, AT-ε-greedy (Adaptive-Tanh-ε-greedy), based on the Tanh function. Simulation experiments demonstrate that AT-ε-greedy improves the training-phase performance of deep reinforcement learning models compared with the fixed-value ε-greedy strategy. Simulation results further show that, in a sightseeing-agriculture scenario, the proposed path planning strategy outperforms other deep reinforcement learning UAV path planning strategies in terms of accumulated task reward, data collection efficiency and training stability. In the above application scenarios, the proposed strategy effectively reduces the flight distance of the UAV and improves data collection efficiency.
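To illustrate the kind of pedestrian dynamics the social force model captures, a minimal Helbing-style update can be sketched as follows. The function name, parameter values and force terms below are illustrative assumptions for a single pedestrian, not details taken from the thesis.

```python
import numpy as np

def social_force_step(pos, vel, goal, others, dt=0.1,
                      v0=1.3, tau=0.5, A=2.0, B=0.3):
    """One time step of a minimal social force model (parameters
    are illustrative): a driving force toward the pedestrian's
    desired velocity plus exponential repulsion from neighbours."""
    # Driving force: relax current velocity toward desired velocity
    direction = (goal - pos) / (np.linalg.norm(goal - pos) + 1e-9)
    force = (v0 * direction - vel) / tau
    # Repulsive force from each other pedestrian, decaying with distance
    for q in others:
        d = pos - q
        dist = np.linalg.norm(d) + 1e-9
        force += A * np.exp(-dist / B) * d / dist
    vel = vel + force * dt
    return pos + vel * dt, vel
```

Iterating this update over many pedestrians yields the flow variation and spatial distribution that the thesis feeds into its sensor-node data model.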
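The AT-ε-greedy idea, decaying ε along a Tanh curve rather than holding it fixed, can be sketched as below. The abstract does not give the exact parameterisation, so the schedule shape, constants and function names here are assumptions for illustration only.

```python
import math
import random

def at_epsilon(step, total_steps, eps_max=1.0, eps_min=0.05, k=3.0):
    """Illustrative Adaptive-Tanh epsilon schedule: epsilon decays
    smoothly from eps_max toward eps_min as training progresses.
    The thesis' exact formula may differ."""
    progress = step / total_steps
    return eps_min + (eps_max - eps_min) * (1.0 - math.tanh(k * progress))

def epsilon_greedy(q_values, epsilon, rng=random):
    """Standard epsilon-greedy action selection over a list of Q-values."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))          # explore
    return max(range(len(q_values)), key=q_values.__getitem__)  # exploit
```

Early in training ε stays near `eps_max` (heavy exploration); the Tanh curve then drives it smoothly toward `eps_min`, which is the exploration/exploitation balance the fixed-value ε-greedy strategy lacks.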
Keywords/Search Tags: unmanned aerial vehicle, path planning, deep reinforcement learning, data collection, social force model