Flight Path Planning Algorithm Based On Dynamic Programming And Reinforcement Learning

Posted on:2019-12-01

Degree:Master

Type:Thesis

Country:China

Candidate:Z X Zhang

GTID:2392330611993283

Subject:Computer Science and Technology

Abstract/Summary:

UAV path planning is currently an active research topic. With the rapid development of hardware and artificial intelligence theory, the intelligence of a single UAV has improved greatly, and UAV systems have moved from single-vehicle intelligence toward cooperative intelligence. In practice, UAVs are widely used across social production and daily life, especially in reconnaissance missions. Because of their limited size, UAVs on reconnaissance missions are constrained in energy supply, computing load, mobility, and communication capability: energy is limited, real-time computing capability is insufficient, and coverage has blind areas and an upper bound relative to the mission area. Reasonable planning of the UAV reconnaissance path therefore becomes an important research issue in complex environments with large-scale, sparse target distributions. This thesis derives and analyzes the problems of existing UAV path planning algorithms, and then designs improved algorithms with lower computational complexity and better terrain adaptability. It provides a theoretical and algorithmic reference for solving the path planning problem in practical reconnaissance missions.

First, this thesis improves the modeling method of existing algorithms. To address the limited extensibility of existing algorithms in task modeling, a new map modeling method for UAV reconnaissance strategy is proposed. Traditional path planning algorithms use the location of the UAV as the state space of the algorithm and then generate an irreversible strategy from the iterative results. As a result, the paths may converge to infeasible solutions in complex terrain such as valleys. This thesis proposes a modeling method that maps directional motion on the grid to the state space, and constructs a motion state space on that basis. Simulation results show that the method effectively improves the descriptive power of the algorithm in complex terrain and allows a UAV to revisit the same area.

Secondly, a computational dimension reduction algorithm based on a hierarchical map gridding method is proposed. In traditional algorithms, the precision of the result is positively correlated with the fineness of the map grid division, so a high precision requirement leads to a dimension explosion. In the proposed hierarchical map gridding method, local points of interest are processed by a lower-level algorithm, global planning is carried out by an upper-level algorithm, and the levels are coordinated. This method greatly reduces the computation time. Taking the proposed dynamic programming algorithm based on direction judgment as an example, computation time can be reduced by as much as 90% at the same computational scale.

This thesis then presents a dynamic programming algorithm based on direction judgment and a Q-learning algorithm for UAV reconnaissance strategy planning. The new modeling method is introduced into the dynamic programming algorithm, the core part of the algorithm is modified, and the iterative space is changed from the state space to the action space. Simulation results show that, at the same calculation accuracy, the new algorithm adapts better to the environment and requires less computation time. Because the dynamic programming algorithm relies heavily on prior information about the mission environment, this thesis also proposes a Q-learning UAV reconnaissance strategy planning algorithm based on direction judgment. This algorithm also adopts the new modeling method and the dimension reduction method proposed in this thesis. Simulation results show that it has better terrain adaptability and a clear computation time advantage over the traditional Q-learning algorithm.

Finally, this thesis summarizes its main work and innovations, and points out directions for further research.
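
The idea of a motion state space combined with Q-learning can be illustrated with a minimal sketch. It is not the algorithm from the thesis: it only assumes that a state is a grid cell plus a heading, that actions are turns relative to that heading, and that learning uses the standard tabular Q-learning update. The grid layout, the reward values, and the names `step`, `train`, and `greedy_path` are hypothetical choices made for illustration.

```python
import random

# Headings as (d_row, d_col): north, east, south, west.
HEADINGS = [(-1, 0), (0, 1), (1, 0), (0, -1)]
# Actions relative to the current heading: turn left, keep heading, turn right.
# After the turn, the vehicle tries to move one cell forward.
ACTIONS = [-1, 0, 1]


def step(state, action, grid, target):
    """Apply one action to (row, col, heading); return (next_state, reward, done)."""
    row, col, heading = state
    heading = (heading + action) % 4
    d_row, d_col = HEADINGS[heading]
    new_row, new_col = row + d_row, col + d_col
    inside = 0 <= new_row < len(grid) and 0 <= new_col < len(grid[0])
    if not inside or grid[new_row][new_col] == 1:
        # Blocked by the map edge or an obstacle: stay in the cell, keep the
        # new heading, and pay an (assumed) penalty.
        return (row, col, heading), -5.0, False
    if (new_row, new_col) == target:
        return (new_row, new_col, heading), 10.0, True
    return (new_row, new_col, heading), -1.0, False


def train(grid, start, target, episodes=2000, alpha=0.5, gamma=0.9,
          epsilon=0.1, max_steps=200, seed=0):
    """Tabular Q-learning over the (row, col, heading) state space."""
    rng = random.Random(seed)
    q = {}

    def q_row(s):
        return q.setdefault(s, [0.0] * len(ACTIONS))

    for _ in range(episodes):
        state = start
        for _ in range(max_steps):
            values = q_row(state)
            # Epsilon-greedy action selection.
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                a = values.index(max(values))
            nxt, reward, done = step(state, ACTIONS[a], grid, target)
            best_next = 0.0 if done else max(q_row(nxt))
            # Standard Q-learning update.
            values[a] += alpha * (reward + gamma * best_next - values[a])
            state = nxt
            if done:
                break
    return q


def greedy_path(q, grid, start, target, max_steps=50):
    """Follow the greedy policy from start; return the visited cells."""
    state, path = start, [start[:2]]
    for _ in range(max_steps):
        values = q.get(state, [0.0] * len(ACTIONS))
        a = values.index(max(values))
        state, _, done = step(state, ACTIONS[a], grid, target)
        path.append(state[:2])
        if done:
            break
    return path


if __name__ == "__main__":
    # 0 = free cell, 1 = obstacle (hypothetical 4x4 map).
    demo_grid = [[0, 0, 0, 0],
                 [0, 1, 1, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0]]
    table = train(demo_grid, start=(0, 0, 1), target=(3, 3))
    print(greedy_path(table, demo_grid, start=(0, 0, 1), target=(3, 3)))
```

Because the heading is part of the state, the same cell can appear in several states, so a learned policy can pass through one cell more than once with different headings. A position-only state space cannot represent this.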

Keywords/Search Tags:

Motion state space, Hierarchical map gridding method, Computational complexity, Dynamic programming, Reinforcement learning, Q-learning, Environmental adaptability

Related items

1	Multi-agent Reinforcement Learning With Hierarchical Game Model For Coordinated Control Of Regional Road Network Signals
2	Improvement Of Q-learning Reinforcement Learning Algorithm And Its Application In Unmanned Vehicles Path Planning
3	Research On Decision Making Method Of Maritime Combat Simulation Based On Deep Reinforcement Learning
4	Research On AUV Behavior Replanning Method Based On Reinforcement Learning
5	Research On Automatic Parking Lot Scheduling Method Based On Reinforcement Learning
6	Research On Cellular-connected Unmanned Aerial Vehicle Path Planning Based On Deep Reinforcement Learning
7	Autonomous Behavior-learning And Planning Of AUV Space Motion
8	Research On AUV Path Planning Method Based On Hierarchical Reinforcement Learning
9	Deep Reinforcement Learning For Smart Generation Control Of Power Systems
10	Research On AUV Motion Control Method Based On Reinforcement Learning