A parafoil system consists of a payload and a flexible parafoil canopy and can deliver loads accurately to a target point. It is widely used in aircraft recovery and in military and civilian airdrop supply. Because the altitude of a parafoil system decreases monotonically, its flight time is limited by the initial flight altitude. Combined with terrain avoidance, landing accuracy, and other constraints, this places high demands on real-time trajectory planning. Most existing trajectory planning methods for parafoil systems rely on traditional numerical methods, which must be recomputed for each initial position and therefore take a long time. To improve real-time performance, we introduce deep reinforcement learning into parafoil system trajectory planning. Deep reinforcement learning builds the planning model in advance, taking the flight environment and mission of the parafoil system into account. Once the model is established, the trajectory does not need to be recomputed for different initial positions, which greatly improves the real-time performance of trajectory planning. The research in this paper covers the following two aspects:

1. Real-time trajectory planning of the parafoil system based on the deep deterministic policy gradient (DDPG) algorithm. To greatly improve the real-time performance of parafoil system trajectory planning, a real-time trajectory planning method based on DDPG is proposed. First, a 3-DOF model of the parafoil system is established by analyzing real flight data. Then, in the optimization process, the complex trajectory constraints are transformed into landing-accuracy, terrain-avoidance, and real-time reward terms in the objective function. Finally, the DDPG-based method is compared with a trajectory planning method based on a genetic algorithm. The simulation results show that the proposed method has better real-time performance.

2. Real-time trajectory planning of the parafoil system based on an improved twin delayed deep deterministic policy gradient (TD3) algorithm. Although the DDPG algorithm greatly improves the real-time performance of trajectory planning, its landing accuracy still falls short of the numerical methods. To solve this problem, this paper proposes an improved TD3 algorithm. By pre-evaluating the value of actions through environmental feedback and dynamically selecting the scale of the added noise, the algorithm mitigates the poor global performance caused by applying fixed noise, with its weak randomness, in all states. It also increases the exploration intensity for low-value policies and maximizes the landing accuracy of the planned trajectory without significantly affecting real-time performance. Finally, detailed simulation results demonstrate the feasibility and correctness of the proposed trajectory planning method. Compared with the DDPG and TD3 algorithms, the landing accuracy and success rate of this method are greatly improved.
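The idea of folding the trajectory constraints into reward terms can be illustrated with a minimal sketch. It is not the model or reward used in this paper: the point-mass kinematics, the constant speeds `v_h` and `v_z`, the weights `w_land`, `w_terrain`, and `w_time`, and the function names are all assumptions chosen for illustration.

```python
import math

def step_3dof(state, turn_rate, dt=1.0, v_h=10.0, v_z=4.0):
    """Advance (x, y, z, psi) by one step of an assumed point-mass 3-DOF model.

    v_h (horizontal speed) and v_z (sink rate) are illustrative constants,
    not values identified from flight data.
    """
    x, y, z, psi = state
    x += v_h * math.cos(psi) * dt   # horizontal motion along the heading
    y += v_h * math.sin(psi) * dt
    z -= v_z * dt                   # altitude decreases monotonically
    psi += turn_rate * dt           # the control input changes the heading
    return (x, y, z, psi)

def flat_terrain(x, y):
    """Hypothetical terrain map: obstacle height above ground at (x, y)."""
    return 0.0

def shaped_reward(state, target=(0.0, 0.0), terrain=flat_terrain,
                  w_land=1.0, w_terrain=100.0, w_time=0.01):
    """Combine the three constraint types into one scalar reward."""
    x, y, z, _ = state
    r = -w_time                              # real-time term: every step costs a little
    if z > 0.0 and z <= terrain(x, y):
        r -= w_terrain                       # terrain-avoidance term: collision penalty
    if z <= 0.0:
        # landing-accuracy term: penalize the miss distance at touchdown
        r -= w_land * math.hypot(x - target[0], y - target[1])
    return r
```

In this sketch an agent such as DDPG would call `step_3dof` with its chosen turn rate and receive `shaped_reward` for the resulting state, so that no separate constraint solver is needed at planning time.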
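The dynamic selection of the noise scale can be sketched as follows. This is only one plausible reading of the idea, under stated assumptions: the linear mapping from value to noise scale, the bounds `sigma_min` and `sigma_max`, and the names `noise_scale` and `explore` are hypothetical, and the value estimate is taken as a given input rather than computed the way the paper computes it.

```python
import random

def noise_scale(value, v_min, v_max, sigma_min=0.05, sigma_max=0.4):
    """Map a pre-evaluated action value to an exploration noise scale.

    Low-value actions get a large scale (explore more); high-value actions
    get a small scale (stay near the current policy). The linear mapping is
    an assumption made for this sketch.
    """
    if v_max <= v_min:
        return sigma_max                      # no value spread yet: explore widely
    quality = (value - v_min) / (v_max - v_min)
    quality = min(1.0, max(0.0, quality))     # clamp to [0, 1]
    return sigma_max - quality * (sigma_max - sigma_min)

def explore(action, value, v_min, v_max, low=-1.0, high=1.0, rng=random):
    """Add Gaussian noise with a value-dependent scale, then clip to the action bounds."""
    sigma = noise_scale(value, v_min, v_max)
    noisy = action + rng.gauss(0.0, sigma)
    return min(high, max(low, noisy))
```

Compared with the fixed noise scale of standard TD3 exploration, a mapping of this kind concentrates randomness on states where the current action looks poor.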