
Research On Target Tracking Of Unmanned Rescue Ship Based On Deep Reinforcement Learning

Posted on: 2021-01-14, Degree: Master, Type: Thesis
Country: China, Candidate: S Zheng, Full Text: PDF
GTID: 2392330602990955, Subject: Marine Engineering
Abstract/Summary:
With the accelerated implementation of the ocean power strategy and the rapid development of the marine economy, the maritime industry is increasingly prosperous and maritime activities are becoming more frequent. At the same time, maritime accidents of various types continue to occur. With the development of unmanned ships, unmanned maritime rescue technology has also received extensive attention. This thesis applies the unmanned ship to the maritime rescue scenario. Given that the unmanned ship has obtained the position of the distress target, it studies the driving decision model by which an unmanned rescue ship autonomously tracks and approaches a drifting distress target. According to the number of unmanned ships participating in the rescue operation, a single-ship target tracking driving decision model and a multi-ship cooperative tracking driving decision model are studied separately. Cooperative tracking involves both coordinated task assignment and coordinated collision avoidance. This thesis analyzes these problems from the perspective of reinforcement learning.

The difficulty lies in constructing an environment platform for training the algorithms. Because training algorithms in the real environment is dangerous, this thesis builds a physical simulation platform for maritime rescue based on ROS and Gazebo and simulates the rescue environment on it. Considering that catamarans offer good sailing stability and spacious decks that can carry more rescue equipment, the thesis takes the catamaran as the research object and loads a catamaran robot model in Gazebo. To achieve motion control of the unmanned ships, a ROS-based communication network is created to transmit driving instructions.

In the single-ship rescue scenario, the target tracking process is described as a Markov decision process, and the environment state space, action space, and reward function are defined. The DDPG deep reinforcement learning algorithm with an experience replay mechanism is introduced to train and optimize the driving decision model. The training samples come from Gazebo: they are target-tracking driving data collected through interaction between the unmanned ship and the environment. As training samples accumulate, the reward function guides the algorithm to converge and obtain the optimal driving decision model, so that the unmanned rescue ship is able to track drifting distress targets autonomously.

The trained driving decision model is then applied to a multi-ship rescue scenario. Experiments show that although each ship can track a drifting distress target, the task allocation mechanism is relatively rigid during tracking and the rescue ships cannot avoid collisions. The MADDPG algorithm is therefore proposed to solve the coordination problem in the tracking process: the local environment state space, action space, and global reward function are designed, the algorithm is trained on a two-dimensional planar multi-ship rescue scene, and a collaborative rescue strategy model is obtained. After the algorithm has converged, model tests show that the two rescue ships can coordinate the assignment of targets while tracking them and are able to avoid collisions, which verifies the cooperative effect of the algorithm.

This thesis discusses the problem of target tracking in different rescue scenarios from the perspective of reinforcement learning and verifies the feasibility of the algorithms through experiments. It has certain guiding significance for the study of how ships can autonomously track and approach drifting distress targets in real unmanned rescue projects at sea.
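
As a rough illustration of the single-ship training loop described above, the sketch below shows the two generic ingredients the abstract names: an experience replay buffer and a reward that guides the ship toward the target. The abstract does not give the actual state variables, reward terms, or hyperparameters, so everything here is an assumption made for illustration: the names `ReplayBuffer`, `tracking_reward`, and `soft_update` are hypothetical, the reward is a simple distance-progress shaping, and the capture radius and bonus values are placeholders. The soft target update is the standard DDPG rule, shown on plain lists rather than network weights.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of (state, action, reward, next_state, done) transitions."""

    def __init__(self, capacity):
        # The oldest transitions are dropped automatically once capacity is reached.
        self.buffer = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform random minibatch, returned column-wise:
        # (states, actions, rewards, next_states, dones).
        batch = random.sample(list(self.buffer), batch_size)
        return tuple(zip(*batch))

    def __len__(self):
        return len(self.buffer)


def tracking_reward(prev_dist, dist, capture_radius=2.0, arrival_bonus=10.0):
    """Hypothetical shaped reward (not the thesis's definition).

    Returns a fixed bonus when the ship is within capture_radius of the
    target, otherwise the distance closed during this step.
    """
    if dist <= capture_radius:
        return arrival_bonus
    return prev_dist - dist


def soft_update(target_params, online_params, tau=0.01):
    """Standard DDPG soft target update: theta' <- tau*theta + (1 - tau)*theta'."""
    return [tau * o + (1.0 - tau) * t for t, o in zip(target_params, online_params)]
```

In a full implementation, each simulator step would call `add` with the observed transition, and the actor and critic networks would be updated from `sample` once the buffer holds enough data. Sampling old transitions at random breaks the correlation between consecutive steps, which is the usual motivation for experience replay.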
Keywords/Search Tags: Unmanned ship, Maritime rescue, Deep reinforcement learning, Target tracking, Multiple cooperating ships