| An Unmanned Surface Vehicle (USV) is an intelligent unmanned surface platform that integrates autonomous navigation, motion control, and environmental perception. Collision avoidance is one of the core capabilities of its autonomous navigation. Research on collision avoidance algorithms for USVs is therefore of great significance for applications such as underwater exploration, marine monitoring, and water search and rescue, and for promoting the development of intelligent technology in unmanned systems. A USV works in a surface environment that is dynamic and complex, so traditional methods for solving the USV collision avoidance problem under wave-current interference have certain limitations. For this reason, this thesis establishes a high-dimensional state space of perception and drive and uses deep reinforcement learning to study USV collision avoidance under wave-current interference. Firstly, to address the lack of a virtual simulation experiment system for USVs that is close to real-use scenarios, a virtual simulation experiment system is developed on Unreal Engine 4 (UE4). It includes a virtual simulation platform and a ground station, and it supports the verification, training, and testing of algorithms in the early stage of USV development. A simple obstacle environment and a complex obstacle environment are designed for training the USV collision avoidance algorithm. In addition, the virtual simulation experiment system developed in this thesis can also be used to study fixed-point maintenance and formation cooperation algorithms for USVs. Secondly, this thesis addresses the poor convergence of deep reinforcement learning collision avoidance algorithms for USVs under wave-current disturbance and a high-dimensional state space. A Random Walk Policy Twin Delayed Deep Deterministic Policy Gradient (RWTD3) algorithm is proposed. The algorithm executes the exploration actions generated by the random walk policy in the environment and stores the resulting experience tuples in the experience pool, which accelerates convergence at the beginning of training and ultimately realizes collision avoidance for the USV. After training, the algorithm achieves collision-free navigation from any start point to a given end point in an environment with wave-current disturbance and complex static obstacles, without offline generation of trajectories or trajectory points. Finally, the simulation results show that the proposed algorithm converges more easily and performs better at collision avoidance in complex obstacle environments, which reflects its feasibility and effectiveness. After sufficient training, the USV can achieve smooth collision avoidance. |
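
The experience-collection step described above can be illustrated with a minimal sketch. The abstract does not specify the form of the random walk policy, so the version below is an assumption: each exploration action is the previous action plus bounded uniform noise, clipped to the action range. `RandomWalkPolicy`, `ToyEnv`, and `prefill_replay_buffer` are hypothetical names introduced only for illustration, the toy environment stands in for the UE4 simulator, and the TD3 network updates are omitted entirely.

```python
import random
from collections import deque


class RandomWalkPolicy:
    """Hypothetical exploration policy (an assumption, not the thesis's definition):
    each action is the previous action plus bounded uniform noise, then clipped."""

    def __init__(self, action_dim, step_size=0.2, low=-1.0, high=1.0, seed=None):
        self.rng = random.Random(seed)
        self.step_size = step_size
        self.low, self.high = low, high
        self.action = [0.0] * action_dim

    def sample(self):
        # Take one random-walk step per action dimension and clip to the bounds.
        self.action = [
            min(self.high, max(self.low,
                               a + self.rng.uniform(-self.step_size, self.step_size)))
            for a in self.action
        ]
        return list(self.action)


class ToyEnv:
    """Stand-in for the simulator: a 1-D position that should reach a goal."""

    def __init__(self, goal=5.0, horizon=50):
        self.goal, self.horizon = goal, horizon
        self.pos, self.t = 0.0, 0

    def reset(self):
        self.pos, self.t = 0.0, 0
        return [self.pos]

    def step(self, action):
        self.pos += action[0]
        self.t += 1
        reward = -abs(self.goal - self.pos)  # closer to the goal is better
        done = self.t >= self.horizon
        return [self.pos], reward, done


def prefill_replay_buffer(env, policy, buffer, n_steps):
    """Run the exploration policy and store (s, a, r, s', done) tuples."""
    state = env.reset()
    for _ in range(n_steps):
        action = policy.sample()
        next_state, reward, done = env.step(action)
        buffer.append((state, action, reward, next_state, done))
        state = env.reset() if done else next_state
    return buffer


# Usage: collect 200 exploratory transitions before any gradient updates.
buffer = deque(maxlen=10_000)
prefill_replay_buffer(ToyEnv(), RandomWalkPolicy(action_dim=1, seed=0), buffer, 200)
```

In a full training loop, these stored transitions would be sampled in mini-batches by the off-policy learner. A random walk yields temporally correlated actions, which is one plausible reason such a policy might explore a vehicle's state space more coherently than independent noise, although the abstract itself does not state this.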