| The rapid development of artificial intelligence has greatly changed the world in recent years. Meanwhile, increasing attention has been paid to the autonomy of vehicles, ships, aircraft, and other means of transportation. Compared with other vehicles, large commercial ships are relatively difficult to navigate autonomously because of their huge inertia, under-actuation, changeable navigation conditions, and complicated rules. Ship navigation is a complex problem that includes environmental perception, path planning, collision avoidance, and robust control, and reproducing this process by machine is a hot topic in artificial intelligence and intelligent navigation. Collision avoidance is the core problem of intelligent navigation: it requires comprehensive consideration of multiple factors such as the external environment, internal constraints, and accumulated experience, and it is a complex decision-making problem driven by human consciousness. Traditional collision avoidance methods have limitations in complex scenarios, so new solutions are urgently needed. Deep reinforcement learning is an effective way to simulate human adaptability, in which intelligence emerges gradually through interaction between the agent and the environment. Therefore, this thesis uses deep reinforcement learning as the basic framework to address autonomous collision avoidance. It also provides new ways for bionic modeling of driving consciousness, which has important theoretical and practical significance. The main achievements are as follows: (1) Research on the consciousness modeling of route optimization based on the artificial potential field (APF). The aim of route optimization was to ensure the safety of navigation on a long-distance scale, and it is important preparatory work for ship collision avoidance. This thesis analyzed the reasons for the formation of the
customary route and used the APF model to describe it. By analyzing ship AIS data with non-linear optimization, the APF parameters of a customary route can be learned. The proposed method was then verified with ferries on the Yanda route. (2) Research on static collision avoidance based on dynamic time warping (DTW) and reinforcement learning (Q-learning). Static obstacles and the distribution of restricted navigation areas were fully considered when planning a path on the basis of a selected route. Q-learning and unsupervised learning were used to build up the self-awareness of this process. Based on the DTW algorithm, an improved Q-learning method was proposed to simulate the balancing of multiple factors and improve the generalization of the results. Finally, this method was compared with A*, rapidly exploring random tree (RRT), and basic Q-learning in a variety of scenarios to verify its effectiveness. (3) Research on dynamic collision avoidance for a single agent based on a proposed APF-based deep reinforcement learning method. Dynamic obstacles are the major threat to ship navigation in practice. First, a dynamic collision avoidance method based on a deep Q-network (DQN) in a discrete action space was proposed. The agent extracts high-dimensional features from the navigation scene image through a convolutional neural network and establishes the connection between these features and steering actions through repeated training. After sufficient training, the agent learned to avoid collisions and thus acquired dynamic collision avoidance intelligence. Subsequently, the Deep Deterministic Policy Gradient (DDPG) model was introduced to address dynamic collision avoidance in a continuous action space. These methods performed well in simulation tests and demonstration applications after hours of training, which suggests they may be practical on real ships. (4) Research on cooperative collision avoidance of multiple agents based on deep reinforcement learning. The collision avoidance among ships
in the real world relies on compromise, negotiation, and balance based on navigation regulations. To simulate such a process, the research chose several typical encounter scenarios, such as head-on, overtaking, and crossing, to discuss how multiple agents can realize full cooperation, full competition, and the transition between cooperation and competition with the help of deep reinforcement learning. Furthermore, a variety of typical multi-ship encounter scenarios were constructed and tested, and the proposed cooperative collision avoidance intelligence was verified and validated in simulation experiments. Some of the approaches proposed in this thesis have been successfully applied in the obstacle avoidance AI system of the Nanjing Banqiao ferry, which performed satisfactorily. |
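
As an illustration of the DTW component mentioned in achievement (2), the following is a minimal sketch of the classic dynamic time warping distance between two planar paths. The abstract does not state how DTW is combined with Q-learning, so this block only shows the standard distance computation; the function name `dtw_distance` and the representation of a path as a list of `(x, y)` tuples are assumptions made for illustration, not the thesis's implementation.

```python
import math


def dtw_distance(path_a, path_b):
    """Classic DTW distance between two paths given as lists of (x, y) points.

    Hypothetical helper for illustration only; not taken from the thesis.
    """
    n, m = len(path_a), len(path_b)
    # cost[i][j] is the minimal accumulated cost of aligning
    # the first i points of path_a with the first j points of path_b.
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Euclidean distance between the two points being matched.
            d = math.dist(path_a[i - 1], path_b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = d + min(
                cost[i - 1][j],      # advance along path_a only
                cost[i][j - 1],      # advance along path_b only
                cost[i - 1][j - 1],  # advance along both paths
            )
    return cost[n][m]


if __name__ == "__main__":
    planned = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    reference = [(0.0, 0.0), (0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    print(dtw_distance(planned, reference))
```

Because DTW allows one point to be matched to several points on the other path, two paths that trace the same geometry at different sampling rates get a distance of zero. A measure of this kind could plausibly be used to compare a candidate path with a reference route, although whether the thesis uses it that way is not stated here.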