| Designing reasonable public routes for drones supports effective supervision of drone group flights, sensible siting of airports and logistics transfer points, and the safety of both low-altitude airspace and the drones themselves. In practice, however, UAVs flying on public routes are susceptible to various factors during flight missions, which can cause them to deviate from the preset routes. To minimize the adverse effects of UAVs that have left the public route and to strengthen the ability to handle accidental deviations, a deviated UAV must return to the public route from its offset point as quickly as possible. Deep reinforcement learning is currently a key technology and an important development trend for autonomous decision-making and intelligent control of UAVs in autonomous navigation problems, and it has achieved some practical application results in the field of unmanned driving. Its real-time performance, autonomy, and feasibility meet the requirements of searching for an effective way for UAVs to return to public routes. Therefore, based on deep reinforcement learning algorithms, this paper studies methods for returning a UAV to the public route, proposes and designs a return-to-route model that meets the experimental requirements, and obtains training and test results under three algorithms; the work has both theoretical and practical significance. The research objectives are addressed in three parts: (1) Taking wind into account, this paper proposes and designs a model of UAVs deviating from public routes under the influence of wind, completes a simulation study of this deviation, and obtains the offset point data required by the return-to-route model. (2) Based on the determined offset point data, the relative position of the offset point and the public route in the environmental model is clarified. After comparative analysis, the grid method and OpenAI Gym are selected as the model design method and tool, and an environmental model is designed that includes offset points, public routes, and static and dynamic obstacles. (3) Based on the designed return-to-route model, this paper uses the DQN, DDPG, and PPO algorithms to carry out training and test experiments and obtains a UAV return method based on deep reinforcement learning, with a test task completion rate above 95%. The experimental results show that a UAV driven by a deep reinforcement learning algorithm can successfully return to the public route from the offset point, which ensures the ability to handle deviations. The trained agents show good real-time performance and autonomy and can adapt to complex obstacle environments. In this problem model, the PPO agent has the best training performance and reaches the required training goal with the fewest training iterations. |
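
To make the grid-based environment of part (2) more concrete, the following is a minimal sketch of how such a model might be laid out. It is not the model used in this paper. The class name `ReturnToRouteGrid`, the four-direction action set, the grid size, and all reward values are assumptions chosen for illustration. Dynamic obstacles and wind are omitted, and the sketch does not depend on OpenAI Gym, although `reset` and `step` loosely follow that interface.

```python
# Illustrative sketch only: names, action set, and reward values are
# assumptions and are not taken from the paper.
ACTIONS = {0: (-1, 0), 1: (1, 0), 2: (0, -1), 3: (0, 1)}  # up, down, left, right


class ReturnToRouteGrid:
    """Grid world in which a deviated UAV must reach any public-route cell."""

    def __init__(self, rows, cols, route, obstacles, offset_point, max_steps=100):
        self.rows = rows
        self.cols = cols
        self.route = frozenset(route)          # cells on the public route
        self.obstacles = frozenset(obstacles)  # static obstacle cells
        self.offset_point = offset_point       # cell where the UAV starts
        self.max_steps = max_steps
        self.reset()

    def reset(self):
        """Place the UAV back at the offset point and return its position."""
        self.pos = self.offset_point
        self.steps = 0
        return self.pos

    def step(self, action):
        """Apply one move; return (position, reward, done, truncated)."""
        dr, dc = ACTIONS[action]
        r, c = self.pos[0] + dr, self.pos[1] + dc
        self.steps += 1

        inside = 0 <= r < self.rows and 0 <= c < self.cols
        if not inside or (r, c) in self.obstacles:
            reward = -5.0   # assumed penalty; the UAV stays where it is
        else:
            self.pos = (r, c)
            reward = -1.0   # assumed per-step cost to encourage short paths

        done = self.pos in self.route
        if done:
            reward = 10.0   # assumed bonus for rejoining the public route
        truncated = (not done) and self.steps >= self.max_steps
        return self.pos, reward, done, truncated
```

In a sketch like this, the per-step cost is what pushes an agent toward returning to the route quickly, while the obstacle penalty discourages unsafe moves. An agent trained with DQN or PPO would select the integer actions; DDPG would require a continuous action space instead.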