As a new tool for ocean exploration, research on intelligent applications of the Unmanned Surface Vehicle (USV) has attracted wide attention. Autonomous navigation technology is the key to realizing USV intelligence. The rapid development of artificial intelligence, and of reinforcement learning in particular, offers a new direction for solving the USV path planning problem. In this paper, reinforcement learning is improved and applied to global path planning; by combining it with a local path planning algorithm, the USV can avoid obstacles effectively, reach the target point smoothly, and at the same time follow an optimal planned path. The main research work is as follows:

First, a survey of the development of USVs and of global and local path planning algorithms clarifies the research background and significance of the subject. The mathematical model used in this paper is established, the basic theory of obstacle avoidance is introduced, and the principles of the reinforcement learning methods adopted and proposed in this paper are briefly described.

Secondly, an improved Q-learning path planning algorithm is proposed for the global path planning of the USV. To address the imbalance between exploration and exploitation in the Q-learning algorithm, a method of dynamically adjusting the parameter of the ε-greedy random strategy is proposed. By taking the success rate into account, the algorithm dynamically adjusts the exploration factor ε
according to the current stage of learning, so that exploration and exploitation remain balanced throughout training. Using the idea of reward shaping, a potential field model is established from the known environment information: the potential field value is maximal at the target point, zero at obstacles, and larger for states far from obstacles and close to the target. The potential field difference is used as an additional term in the reward function to accelerate the convergence of the algorithm.

Then, in view of uncertain factors such as the wide sea area and the irregular shapes of obstacles, the table-based Q-learning algorithm suffers from a sharp increase in computation and from the curse of dimensionality. A Deep Q Network (DQN) is therefore applied to USV path planning. Replacing the Q-table with a neural network solves the problem that the table becomes too large and consumes too much memory when there are too many state-action pairs; at the same time, training the neural network gives the algorithm a degree of generalization ability and enhances its adaptability to the environment. Priority-based sampling effectively distinguishes the importance of different samples. A second neural network with the same structure as the Q-value network is used as a target network for computing target Q values, which speeds up the learning process. In emergency obstacle avoidance, choosing avoidance actions based on heuristic knowledge provides more obstacle avoidance data for neural network training and improves learning efficiency. A comparison with the RRT algorithm shows that the DQN is a reasonable approach to the USV path planning problem.

Finally, an improved Dynamic Window Approach is proposed to solve the path
planning problem posed by dynamic obstacles encountered during navigation. Because the weight factors of the evaluation function play a decisive role in obstacle avoidance performance, these weights are adjusted in real time through fuzzy reasoning according to the current situation. Compared with the traditional algorithm, the improved algorithm adapts to a wider range of environments. Combined with the global path planning method, it generates a globally optimal, collision-free path.
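The success-rate-based ε adjustment and the potential-field shaping reward described for the improved Q-learning algorithm can be sketched as follows. This is a minimal illustration, not the thesis's exact formulation: the linear ε schedule, the inverse-distance potential, and all function names are assumptions.

```python
import random

def epsilon_from_success_rate(success_rate, eps_min=0.05, eps_max=0.9):
    # Hypothetical linear schedule: a higher recent success rate means the
    # agent has learned more, so less exploration is needed.
    return eps_max - (eps_max - eps_min) * success_rate

def potential(state, goal, obstacles):
    # Maximal (1.0) at the goal, zero at obstacles, and decaying with
    # Manhattan distance to the goal elsewhere (illustrative form).
    if state in obstacles:
        return 0.0
    d_goal = abs(state[0] - goal[0]) + abs(state[1] - goal[1])
    return 1.0 / (1.0 + d_goal)

def shaped_reward(base_reward, s, s_next, goal, obstacles, gamma=0.95):
    # Potential-difference shaping: F(s, s') = gamma * phi(s') - phi(s),
    # added on top of the environment reward to speed up convergence.
    return (base_reward
            + gamma * potential(s_next, goal, obstacles)
            - potential(s, goal, obstacles))

def choose_action(q_row, epsilon, actions):
    # Standard epsilon-greedy selection over one row of the Q-table.
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda a: q_row.get(a, 0.0))
```

A transition that moves toward the goal receives a larger shaped reward than one that moves away, which is what drives the faster convergence described above.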
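Two of the DQN components mentioned above, priority-based sampling and a separate target network, can be sketched in miniature. The class, the proportional-priority formula, and the hard-update rule are illustrative assumptions rather than the thesis's implementation:

```python
import random

class PrioritizedReplayBuffer:
    """Simplified proportional prioritized sampling of transitions."""

    def __init__(self, capacity=10000, alpha=0.6):
        self.capacity = capacity
        self.alpha = alpha          # how strongly priorities skew sampling
        self.data = []              # transitions: (s, a, r, s_next, done)
        self.priorities = []        # one priority per stored transition

    def add(self, transition, td_error=1.0):
        # Larger TD error -> larger priority -> sampled more often,
        # so "important" samples are replayed preferentially.
        p = (abs(td_error) + 1e-6) ** self.alpha
        if len(self.data) >= self.capacity:
            self.data.pop(0)
            self.priorities.pop(0)
        self.data.append(transition)
        self.priorities.append(p)

    def sample(self, batch_size):
        # Sample indices with probability proportional to priority.
        idxs = random.choices(range(len(self.data)),
                              weights=self.priorities, k=batch_size)
        return [self.data[i] for i in idxs]

def sync_target(online_weights, target_weights):
    # Periodically copy the online Q-network's weights into the
    # same-structure target network that supplies bootstrap Q targets.
    target_weights.clear()
    target_weights.update(online_weights)
```

Keeping the target network fixed between periodic syncs stabilizes the bootstrap targets, which is the speed-up mechanism the text refers to.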
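The fuzzy adjustment of the Dynamic Window Approach's evaluation weights can be illustrated with a single rule of the kind described: near obstacles, clearance dominates; in open water, heading and speed dominate. The membership function, the weight ranges, and the function names are hypothetical choices for this sketch.

```python
def fuzzy_weights(obstacle_dist, d_near=1.0, d_far=3.0):
    # Degree to which the situation is "near an obstacle": a linear
    # membership that is 1 at or below d_near and 0 at or beyond d_far.
    near = max(0.0, min(1.0, (d_far - obstacle_dist) / (d_far - d_near)))
    # Rule: the nearer the obstacle, the more the clearance term dominates
    # the evaluation, at the cost of heading and speed.
    w_heading = 0.8 - 0.4 * near
    w_clear = 0.2 + 0.6 * near
    w_speed = 0.3 - 0.1 * near
    return w_heading, w_clear, w_speed

def evaluate_trajectory(heading_score, clearance, speed, obstacle_dist):
    # DWA-style evaluation G = a*heading + b*clearance + c*speed,
    # with the weights supplied by the fuzzy rule above.
    a, b, c = fuzzy_weights(obstacle_dist)
    return a * heading_score + b * clearance + c * speed
```

With fixed weights, a fast trajectory pointed at the goal always wins; with the fuzzy rule, the same candidate loses to a safer one once an obstacle is close, which is the adaptability gain claimed for the improved algorithm.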