In recent years, unmanned surface vessels (USVs) have developed rapidly and are expected to become an important force on the future battlefield. In an adversarial marine environment, the threats faced are increasing, and incidents of non-cooperative ships infringing on maritime rights and interests occur from time to time. The marine environment is complex, and non-cooperative ships move in a concealed and random manner; designing effective strategies and precise tactical methods under these conditions, so that a USV can recognize the motion intention of a sea target and then track it, is an urgent problem. This paper proposes solutions to the shortcomings of existing techniques and algorithms. The main research contents are as follows. First, this paper designs the process for target motion intention recognition and target tracking. The environment is modeled from the perspectives of the USV, the targets, and the obstacles, and the motion parameters of the USV and the targets are expressed as formulas. Second, this paper proposes a maritime target motion intention recognition algorithm based on deep deterministic policy gradient (DDPG) with prioritized experience replay (PER). Each experience is assigned a sampling priority according to its importance for updating the network parameters. During DDPG training, experiences with high priority are more likely to be sampled, which improves the quality of the experience data used for network training. A neural network is used for function approximation, which addresses the difficulty that traditional tabular reinforcement learning algorithms have with high-dimensional problems. Simulation experiments report the success rate of the USV's motion intention recognition and the curves of the planning process, verifying the effectiveness and superiority of the method for intention recognition. Finally, this paper proposes a marine moving target tracking decision method based on the Soft Actor-Critic (SAC) algorithm. In addition to the cumulative reward, training introduces a maximum entropy term to increase the randomness of the policy, giving the USV a stronger ability to explore the environment. The algorithm is also able to find better solutions under multimodal reward functions, which is the situation in the tracking problem. Simulation results present the tracking trajectory error curve and the cumulative reward curve, which verify the effectiveness of the algorithm in target tracking. This paper also designs a simulation and verification platform for the algorithms. The problem is modularized using object-oriented programming, and a graphical user interface is constructed to display the dynamic process of the USV performing its tasks and to show quantitative status information of the USV and the target.
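
To make the prioritized sampling idea concrete, the following is a minimal sketch of how high-priority experiences can be drawn more often from a replay buffer. The abstract does not state which PER variant is used, so this sketch assumes the commonly used proportional variant, in which the priority of an experience is derived from the magnitude of its temporal-difference (TD) error. The function names and the parameters `alpha`, `beta`, and `eps` are illustrative assumptions, not taken from the paper.

```python
import random


def per_probabilities(td_errors, alpha=0.6, eps=1e-6):
    """Turn TD errors into sampling probabilities (assumed proportional variant).

    priority_i = (|td_error_i| + eps) ** alpha
    P(i)       = priority_i / sum of all priorities
    alpha = 0 gives uniform sampling; larger alpha favors large TD errors.
    """
    priorities = [(abs(d) + eps) ** alpha for d in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]


def sample_indices(td_errors, batch_size, alpha=0.6, rng=None):
    """Draw buffer indices with replacement, weighted by priority."""
    if rng is None:
        rng = random.Random()
    probs = per_probabilities(td_errors, alpha)
    return rng.choices(range(len(td_errors)), weights=probs, k=batch_size)


def importance_weights(probs, indices, beta=0.4):
    """Importance-sampling weights that offset the bias of non-uniform sampling.

    w_i = (N * P(i)) ** (-beta), normalized by the largest weight in the batch.
    """
    n = len(probs)
    weights = [(n * probs[i]) ** (-beta) for i in indices]
    max_w = max(weights)
    return [w / max_w for w in weights]


if __name__ == "__main__":
    # Hypothetical TD errors for five stored experiences.
    td_errors = [0.05, 0.2, 1.5, 0.01, 3.0]
    probs = per_probabilities(td_errors)
    batch = sample_indices(td_errors, batch_size=4, rng=random.Random(0))
    print(probs)
    print(batch, importance_weights(probs, batch))
```

In a DDPG training loop, the sampled indices would select the mini-batch for the critic and actor updates, and the importance weights would scale the critic loss per sample. A production buffer would typically store priorities in a sum-tree for efficient sampling; the list-based version here is only meant to show the sampling rule.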