| UAV trajectory planning is a key component of the UAV mission planning system. Its goal is to plan a flight trajectory that satisfies the constraints and improves the mission success rate, given the mission objectives and the simulation environment. In this thesis, two three-dimensional UAV trajectory planning models are established based on the UAV's constraints and kinetic system, combining a swarm intelligence algorithm with a deep reinforcement learning algorithm. The first is a static trajectory planning model based on the improved bat algorithm (IBA). The second is a trajectory planning model in which the UAV avoids multiple dynamic obstacles using the IBA-DDPG-IIFDS algorithm (IDI). The convergence of both algorithms is then proved. The IBA is proposed on the basis of the artificial bee colony algorithm and the bat algorithm; its convergence is proved, and it is analyzed through simulation experiments. This thesis describes the steps of the IBA in detail, analyzes its state transition probability, and then proves that the population generated by the IBA is a finite Markov chain and that the IBA converges in probability to the global optimal solution. The thesis then conducts a UAV trajectory planning simulation experiment in three-dimensional static space. The IBA parameters are evaluated over multiple simulation runs, and the IBA is compared with other heuristic algorithms to verify that it performs better. In addition, the applicability of the IBA is verified on 10 benchmark functions. The results show that the IBA outperforms traditional heuristic algorithms and enables the UAV to plan a safer flight path more quickly. The IDI algorithm is proposed on the basis of the IBA, the deep deterministic policy gradient algorithm (DDPG), and the improved interfered fluid dynamical system (IIFDS). The convergence of the DDPG is proved, and the UAV
trajectory planning model for environments with multiple dynamic obstacles is established for simulation experiments. This thesis describes the steps of the IDI algorithm. Based on the Markov decision process underlying the DDPG and the fixed-point theorem for the Bellman equation, the error propagation and the one-step approximation error of the DDPG are derived. It is then proved that the DDPG converges in probability to the optimal solution. Furthermore, the thesis conducts a simulation experiment in which the UAV plans a trajectory that avoids multiple dynamic obstacles in three-dimensional space, and compares the IDI algorithm with other deep reinforcement learning algorithms. The convergence curves and UAV flight tracks are plotted, and a new simulation environment is constructed to test the applicability of the algorithm. The simulation results show that the IDI algorithm trains a better neural network and improves the ability of the UAV to avoid dynamic obstacles. The research results show that, in the three-dimensional static trajectory planning model, the IBA has strong local search ability: its convergence speed is increased by 50%, and the quality of its optimal solution is increased by 40%. In the trajectory planning model for avoiding multiple dynamic obstacles in three-dimensional space, the IDI algorithm solves the problem better than other deep reinforcement learning algorithms and shows better applicability in new environments. The research in this thesis can therefore provide some guidance for UAV trajectory planning, and can also serve as a reference for research that combines heuristic algorithms with deep reinforcement learning. |