| Autonomous underwater vehicles (AUVs) play an important role in military applications, marine resource exploration, hydrological data acquisition, underwater intelligent operations, and other fields. Because a single AUV can hardly meet the growing demand for underwater operations, AUV formation technology has emerged. Targeting dense formations of micro AUVs, this paper studies vision-based relative positioning of adjacent robots in multi-robot formation tasks. When GPS is unavailable and underwater visual texture features are limited, continuously detecting and tracking the pose of adjacent AUVs is very challenging. Methods that simplify the positioning task with fiducial markers or artificially designed features suffer from easy loss of those features, so deep learning pose estimation methods that directly model the appearance of natural features have greater research value. However, data-driven deep learning methods require massive amounts of data, and there is currently no easy-to-use way to collect a large amount of accurate underwater AUV pose data. To address the difficulty of obtaining AUV pose datasets in underwater scenes, which prevents existing deep learning-based pose estimation methods from being applied, this paper proposes an AUV visual localization method based on synthetic data. First, we build a virtual underwater scene in Unity3D and use a virtual camera to obtain rendered images with known poses. Second, unpaired image translation transfers the style of the rendered images to that of the real underwater scene, and combining the translated images with the known poses of the rendered images yields a synthetic underwater pose dataset. Third, ground-truth poses for a small number of real pool images are computed from ArUco markers to construct an indoor pool dataset. Finally, we propose a convolutional neural network pose estimation method based on local-region keypoint projections, in which the network predicts the two-dimensional projections of known reference corners. The relative position and pose are then obtained from the resulting 2D-3D point pairs using a random sample consensus (RANSAC) based Perspective-n-Point (PnP) algorithm. Quantitative and qualitative experiments on the constructed datasets demonstrate the effectiveness of the proposed method. The experimental results show that unpaired image translation can effectively eliminate the gap between rendered images and real underwater images, that the proposed local-region keypoint projection method performs more effective 6D pose estimation, and that the proposed synthetic-data approach offers a novel idea for visual localization in underwater scenes. |
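
The keypoint-projection step described in the abstract relies on knowing where 3D reference corners land in the image for a given pose. As a rough illustration only, the sketch below projects the eight corners of a box around a vehicle into the image with a pinhole camera model, which is one way 2D keypoint labels could be derived from a rendered image with a known pose. The box dimensions, camera intrinsics, pose values, and all function names are hypothetical and not taken from the paper; lens distortion and underwater refraction are ignored.

```python
import math


def rotation_z(yaw_rad):
    """Rotation matrix for a rotation of yaw_rad about the z axis."""
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def box_corners(length, width, height):
    """Eight corners of an axis-aligned box centred on the model origin."""
    hx, hy, hz = length / 2.0, width / 2.0, height / 2.0
    return [(sx * hx, sy * hy, sz * hz)
            for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)]


def project_points(points_3d, R, t, fx, fy, cx, cy):
    """Project model-frame 3D points to pixels with a pinhole camera.

    R and t map model coordinates into the camera frame (z pointing forward).
    """
    pixels = []
    for X in points_3d:
        # Rigid transform into the camera frame: Xc = R * X + t
        xc = sum(R[0][k] * X[k] for k in range(3)) + t[0]
        yc = sum(R[1][k] * X[k] for k in range(3)) + t[1]
        zc = sum(R[2][k] * X[k] for k in range(3)) + t[2]
        if zc <= 0.0:
            raise ValueError("point lies behind the camera")
        # Perspective division followed by the intrinsic mapping
        pixels.append((fx * xc / zc + cx, fy * yc / zc + cy))
    return pixels


# Hypothetical values: a 0.5 m x 0.1 m x 0.1 m vehicle, 2 m in front of the
# camera, yawed by 30 degrees, seen by a 640x480 camera.
corners = box_corners(0.5, 0.1, 0.1)
R = rotation_z(math.radians(30.0))
t = (0.0, 0.0, 2.0)
keypoints = project_points(corners, R, t, 500.0, 500.0, 320.0, 240.0)

for corner, (u, v) in zip(corners, keypoints):
    print(corner, "->", (round(u, 1), round(v, 1)))
```

At inference time the direction is reversed: the network predicts the 2D keypoints, and the pose is recovered from the 2D-3D pairs. A library routine such as OpenCV's `cv2.solvePnPRansac` is one option for that step, although the abstract does not say which implementation the authors used.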