With the development of China's marine economy and the continuous upgrading of its national security strategy, research on underwater target identification technology has become increasingly active. Because the underwater environment is highly complex and underwater targets are diverse, underwater target identification faces great challenges, and its recognition accuracy is still not ideal. The acquisition and annotation of underwater datasets are also restricted by various factors. Research on high-precision underwater target identification is therefore an important topic that meets China's strategic development needs, with theoretical value and practical significance in both military and civilian fields. Self-supervised feature learning methods include generative and contrastive approaches. Generative methods focus only on pixel-space features, while contrastive methods can capture more abstract latent semantic information. This paper therefore studies contrastive self-supervised feature learning in depth to improve the performance of underwater target identification tasks. This paper proposes a self-supervised feature learning method based on a dual-channel self-attention audio encoder (DSAE), which unifies Mel filter-bank (FBank) features, rich in low-frequency information, with Gammatone filter-bank (GBank) features, which focus on high-frequency signals, in contrastive self-supervised feature learning. This enables the encoder to learn high-level semantic features that combine the advantages of FBank and GBank features. To enhance the network's ability to select information, we use a local self-attention mechanism in the underwater target feature extraction module to better extract semantic information from local features. On this basis, to improve the recognition accuracy and robustness of the DSAE self-supervised feature learning method in downstream tasks, we propose an underwater target identification method based on a dual-channel self-attention audio encoder with a dynamic positive sample memory module (DSAE-DMM). We introduce data augmentation and a positive-negative sample balancing strategy. A time-frequency enhancement strategy increases the diversity of data samples and improves robustness. Furthermore, we construct a dynamic positive sample memory module that expands the positive set with historical temporal embedding vectors and updates them dynamically, allowing the model to better learn long-term dependencies in the data and to balance the proportion of positive and negative samples, which improves recognition accuracy. Experiments on underwater target datasets evaluate recognition performance, convergence speed, and noise resistance, and verify that the proposed DSAE-DMM method achieves good recognition accuracy and convergence speed. The proposed method is also robust to real environmental noise and to different levels of artificial noise interference, indicating its potential and value for underwater target identification tasks.
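The idea behind the dynamic positive sample memory module can be illustrated with a minimal sketch. The abstract does not specify the update rule or the loss, so everything below is an assumption made for illustration: the memory is modeled as a fixed-size first-in-first-out queue, the loss is a multi-positive InfoNCE-style contrastive loss over cosine similarities, and the names `PositiveMemory`, `multi_positive_info_nce`, `capacity`, and `temperature` are hypothetical rather than taken from the paper.

```python
import math
from collections import deque


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


class PositiveMemory:
    """Hypothetical FIFO store of past embeddings reused as extra positives.

    Assumption: the oldest entry is evicted once `capacity` is reached.
    The paper's actual update rule is not given in the abstract.
    """

    def __init__(self, capacity=4):
        self.bank = deque(maxlen=capacity)

    def positives(self):
        return list(self.bank)

    def update(self, embedding):
        self.bank.append(list(embedding))


def multi_positive_info_nce(anchor, positives, negatives, temperature=0.1):
    """InfoNCE-style loss averaged over several positives (assumed form)."""
    pos = [math.exp(cosine(anchor, p) / temperature) for p in positives]
    neg = [math.exp(cosine(anchor, n) / temperature) for n in negatives]
    denom = sum(pos) + sum(neg)
    return -sum(math.log(p / denom) for p in pos) / len(pos)


# Toy usage: one augmented view plus stored history act as positives.
memory = PositiveMemory(capacity=2)
anchor = [1.0, 0.0]
augmented_view = [0.9, 0.1]
negatives = [[-1.0, 0.0], [0.0, -1.0]]

for past_embedding in ([0.8, 0.2], [0.95, 0.05], [0.99, 0.01]):
    memory.update(past_embedding)  # the oldest entry is dropped at capacity

positives = [augmented_view] + memory.positives()
loss = multi_positive_info_nce(anchor, positives, negatives)
```

In this sketch, adding stored embeddings to the positive set raises the number of positives relative to negatives, which is one simple way to read the positive-negative balancing described above. A real system would operate on batched encoder outputs rather than Python lists.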