Flight Path Planning Algorithm Based On Dynamic Programming And Reinforcement Learning

Posted on: 2019-12-01
Degree: Master
Type: Thesis
Country: China
Candidate: Z X Zhang
GTID: 2392330611993283
Subject: Computer Science and Technology
Abstract/Summary:
UAV path planning is currently an active research topic. With the rapid development of hardware and artificial intelligence theory, the intelligence of a single UAV has improved greatly, and the field has advanced from single-UAV intelligence to cooperative intelligence. In practice, UAVs are widely used in many areas of production and daily life, especially in reconnaissance missions. Because of their limited size, UAVs on reconnaissance missions are constrained in energy supply, computing load, mobility, and communication capability: they have limited energy sources, insufficient real-time computing capability, and blind areas and upper bounds (relative to the mission area). Reasonable planning of the UAV reconnaissance path therefore becomes an important research problem in complex environments with large-scale, sparsely distributed targets. This thesis analyzes the problems of existing UAV path planning algorithms and then designs improved algorithms that perform better in computational complexity and terrain adaptability. It provides a theoretical and algorithmic reference for solving the path planning problem in practical reconnaissance missions.

First, this thesis improves the modeling method of existing algorithms. To address the limited extensibility of existing algorithms in task modeling, a new map modeling method for UAV reconnaissance strategies is proposed. Traditional path planning algorithms use the location of the UAV as the state space and then generate an irreversible strategy from the iterative results. As a result, the paths may converge to infeasible solutions in complex terrain such as valleys. This thesis proposes a modeling method that maps directional motion on the grid to the state space, and constructs a motion state space accordingly. Simulation results show that the method effectively improves the descriptive ability of the algorithm in complex terrain and allows the UAV to revisit a single area.

Second, a dimension reduction method based on hierarchical map gridding is proposed. In traditional algorithms, precision is positively correlated with the fineness of the map grid division, so a high precision requirement causes a dimension explosion. In the proposed hierarchical gridding method, a lower-level algorithm processes the local points of interest and an upper-level algorithm carries out the global planning, with the levels coordinating their computations. This method greatly reduces computation time. Taking the proposed dynamic programming algorithm based on direction judgment as an example, computation time can be reduced by as much as 90% at the same computational scale.

Third, this thesis presents a dynamic programming algorithm based on direction judgment and a Q-learning algorithm for UAV reconnaissance strategy planning. The new modeling method is introduced into the dynamic programming algorithm, the core part of the algorithm is modified, and the iteration space is changed from the state space to the action space. Simulation results show that, at the same calculation accuracy, the new algorithm adapts better to the environment and takes less time to compute. Because the dynamic programming algorithm relies heavily on prior information about the mission environment, this thesis also proposes a Q-learning UAV reconnaissance strategy planning algorithm based on direction judgment. This algorithm likewise adopts the new modeling method and the dimension reduction method. Simulation results show that it has better terrain adaptability and a clear advantage in computation time over the traditional Q-learning algorithm.

Finally, this thesis summarizes the main work and innovations and points out directions for further research.
Keywords/Search Tags:Motion state space, Hierarchical map gridding method, Computational complexity, Dynamic programming, Reinforcement learning, Q-learning, Environmental adaptability
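
The abstract does not give the details of the motion state space, so the following is only a minimal sketch of the general idea, not the thesis's algorithm. It assumes a 4-connected grid, a state defined as (row, column, heading), and made-up turn costs; the names `HEADINGS`, `TURN_COST`, `step`, and `cost_to_go` are hypothetical. Because the heading is part of the state, the same cell can carry different values depending on the direction of travel, which is what lets a plan pass through one cell more than once.

```python
import math

# Headings as (d_row, d_col): 0 = north, 1 = east, 2 = south, 3 = west.
HEADINGS = [(-1, 0), (0, 1), (1, 0), (0, -1)]
# Assumed cost of moving one cell after a turn (illustrative values only):
# 0 = straight, -1 = left, +1 = right, 2 = U-turn.
TURN_COST = {0: 1.0, -1: 2.0, 1: 2.0, 2: 3.0}


def step(grid, state, turn):
    """Turn, then move one cell; return None if the move is blocked."""
    r, c, h = state
    h2 = (h + turn) % 4
    dr, dc = HEADINGS[h2]
    r2, c2 = r + dr, c + dc
    inside = 0 <= r2 < len(grid) and 0 <= c2 < len(grid[0])
    if inside and grid[r2][c2] != "#":
        return (r2, c2, h2)
    return None


def cost_to_go(grid, goal):
    """Dynamic programming over (row, col, heading) states.

    Repeats Bellman updates until no value changes, giving the minimum
    cost from every directional state to the goal cell.
    """
    states = [(r, c, h)
              for r in range(len(grid))
              for c in range(len(grid[0]))
              if grid[r][c] != "#"
              for h in range(4)]
    J = {s: (0.0 if s[:2] == goal else math.inf) for s in states}
    changed = True
    while changed:
        changed = False
        for s in states:
            if s[:2] == goal:
                continue  # goal states are terminal
            best = J[s]
            for turn, cost in TURN_COST.items():
                nxt = step(grid, s, turn)
                if nxt is not None and cost + J[nxt] < best:
                    best = cost + J[nxt]
            if best < J[s]:
                J[s] = best
                changed = True
    return J


# A 1x3 corridor with the goal at the east end.
J = cost_to_go(["..."], goal=(0, 2))
print(J[(0, 0, 1)], J[(0, 0, 3)])  # 2.0 4.0
```

In the corridor example, the cell (0, 0) has cost 2.0 when the vehicle faces east but 4.0 when it faces west, since it must first pay for a U-turn. A position-only state space would have to assign that cell a single value.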