Obstacle avoidance planning is one of the key technologies for autonomous underwater vehicles (AUVs) and is required throughout an AUV's navigation. This paper studies obstacle avoidance in complex, dynamic environments, taking a patrol AUV as the research object. The research background is the dynamic obstacle avoidance problem across the whole engineering application of this type of AUV: the fixed-depth comb-shaped search and inspection task and the AUV recovery process in a dynamic, unknown underwater environment. Deep reinforcement learning (DRL) is applied to AUV obstacle avoidance to improve the AUV's obstacle avoidance decision-making in dynamic underwater environments and to ensure its safety while performing tasks. The main research contents are as follows.

Firstly, this paper proposes an end-to-end AUV obstacle avoidance framework based on DRL, which combines AUV mathematical modeling, the sensor system, the obstacle avoidance method, the deep reinforcement learning system, and the motion execution control system. This framework guides the design of the dynamic obstacle avoidance methods for the specific engineering tasks later in this paper.

Secondly, this paper proposes adding an AUV obstacle avoidance sensing end between the sensor detection data and obstacle avoidance planning. To account for the measurement error in the AUV's underwater sensor observations, the Interacting Multiple Model-Extended Kalman Filter (IMM-EKF) algorithm is used to remove noise and to estimate and predict the motion state of detected obstacles, which improves the reliability of the obstacle avoidance method. Experiments show that, compared with the Extended Kalman Filter (EKF) algorithm, the IMM-EKF obstacle state prediction and estimation algorithm adapts effectively to changes in obstacle motion. It greatly improves the prediction accuracy of the motion state of moving obstacles and meets the requirements
for the prediction accuracy of obstacle avoidance behaviors. It can provide more accurate obstacle perception input for the design of the AUV's autonomous obstacle avoidance methods in the fixed-depth comb-shaped search and inspection task and in autonomous recovery.

Thirdly, for the fixed-depth comb-shaped inspection task of the AUV studied in this paper in a dynamic, unknown underwater environment, and to solve the horizontal obstacle avoidance planning problem under the complex, multi-task, multi-constraint conditions of such scenes, this paper builds on the Deep Q Network (DQN) algorithm and proposes a method for autonomous AUV task decision-making based on multi-behavior network calls. Under the DRL-based end-to-end obstacle avoidance framework, a two-dimensional dynamic obstacle avoidance decision method for the AUV based on the Deep Deterministic Policy Gradient-Proportional Integral Derivative (DDPG-PID) algorithm is also proposed. Simulation results show that this method effectively solves the horizontal obstacle avoidance planning problem, with a single-step obstacle avoidance decision time of less than 0.5 seconds, which improves the safety of the AUV's comb-shaped inspection task.

Finally, during autonomous recovery of the AUV in a three-dimensional unknown underwater environment, traditional obstacle avoidance methods have difficulty handling the surge of perception data caused by the sharp increase in the number of spatial grid cells. Based on the DRL end-to-end obstacle avoidance framework, the horizontal autonomous obstacle avoidance decision-making method is improved, and a three-dimensional obstacle avoidance behavior learning and training system based on deep reinforcement learning is proposed. The core of the system is the 3D autonomous obstacle avoidance decision-making method based on the Sum Tree-Deep Deterministic Policy Gradient (Sumtree-DDPG) algorithm. Compared with
the DDPG algorithm, this method performs better during obstacle avoidance behavior training and obtains an effective obstacle avoidance strategy. Simulation results show that, compared with the improved DDPG algorithm and the traditional artificial potential field algorithm, the 3D autonomous obstacle avoidance decision method based on the Sumtree-DDPG algorithm can effectively guide the AUV to avoid obstacles dynamically in complex scenes with different obstacles and ocean current interference, and can ensure the safety of the AUV autonomous recovery process. Evaluation against performance indices further indicates that it is feasible and valuable for engineering application.
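
The abstract names a sum tree but does not describe how it is used. A common reading, assumed here, is that the sum tree stores the priorities of replayed transitions so that DDPG samples experiences in proportion to priority. The following is a minimal sketch of that data structure only; the class name `SumTree`, its methods, and the priority values are illustrative assumptions and are not taken from the paper.

```python
import random


class SumTree:
    """Binary tree in which each internal node stores the sum of its children.

    Leaves hold transition priorities, so the root holds the total priority.
    Hypothetical helper for illustration; not the paper's implementation.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.tree = [0.0] * (2 * capacity - 1)  # internal nodes, then leaves
        self.data = [None] * capacity           # stored transitions
        self.write = 0                          # next data slot to overwrite

    def total(self):
        return self.tree[0]

    def update(self, tree_idx, priority):
        # Set a leaf priority and propagate the difference up to the root.
        change = priority - self.tree[tree_idx]
        self.tree[tree_idx] = priority
        while tree_idx != 0:
            tree_idx = (tree_idx - 1) // 2
            self.tree[tree_idx] += change

    def add(self, priority, transition):
        # Store a transition, overwriting the oldest one when full.
        tree_idx = self.write + self.capacity - 1
        self.data[self.write] = transition
        self.update(tree_idx, priority)
        self.write = (self.write + 1) % self.capacity

    def get(self, value):
        # Descend from the root to the leaf whose cumulative range contains value.
        idx = 0
        while 2 * idx + 1 < len(self.tree):
            left = 2 * idx + 1
            if value <= self.tree[left]:
                idx = left
            else:
                value -= self.tree[left]
                idx = left + 1
        return idx, self.tree[idx], self.data[idx - self.capacity + 1]

    def sample(self, batch_size):
        # Stratified sampling: one draw from each equal slice of the total.
        segment = self.total() / batch_size
        return [
            self.get(random.uniform(i * segment, (i + 1) * segment))
            for i in range(batch_size)
        ]


tree = SumTree(capacity=4)
for priority, transition in [(1.0, "a"), (2.0, "b"), (3.0, "c"), (4.0, "d")]:
    tree.add(priority, transition)
batch = tree.sample(batch_size=2)  # list of (tree index, priority, transition)
```

Under this assumption, transitions with larger priority (for example, larger temporal-difference error) occupy a wider slice of the cumulative range and are replayed more often, while both sampling and priority updates cost time logarithmic in the buffer size.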