| Underwater target recognition is a prerequisite for marine unmanned detection equipment, such as unmanned aerial vehicles and unmanned boats, to complete their specific tasks effectively. It identifies underwater targets quickly and accurately through preprocessing, representation learning, and target recognition. This technology has important theoretical and practical value in both the detection field and the equipment field. However, the complex marine environment makes it costly to obtain high-quality labeled target samples, and relying on a single class of representation extracted from the target samples is very inefficient in marine scenes. This thesis therefore proposes a representation contrastive enhancement method for underwater target recognition based on the Acoustic-Embedding Memory Unit Modified Space Auto-Encoder (AEMU-SAE). Approaching the problem from the perspective of self-supervised acoustic representation learning and enhancement, it aims to make full use of existing data sets to improve the performance of underwater target recognition tasks and to explore the application of self-supervised learning in this field. First, this thesis proposes a self-supervised acoustic representation learning method based on the Space Auto-Encoder (SAE), which, in the manner of an animal-like vocal and auditory system, combines the voiceprint-distinguishing advantage of the Mel Filter-bank (FBank) with the noise-robustness advantage of the Gammatone Filter-bank (GBank). By performing the spatial conversion and reconstruction from FBank to GBank, the method constructs a Space Auto-Encoder Spectrogram (SAE Spec) that has good voiceprint characteristics and is robust against noise. However, because the SAE algorithm does not learn from negative samples, its representations lack high-level semantic information. To address this, a negative sample mining strategy based on the Acoustic-Embedding Memory Unit (AEMU) is proposed, which uses a dynamic queue dictionary to store and continuously update negative samples. This method not only uses the adversarial loss function to unify positive sample learning and negative sample learning in one self-supervised learning framework, but also takes both negative sample mining and computational overhead into account during self-supervised acoustic representation learning. AEMU-SAE thus ensures that acoustic representations containing high-level semantic information are learned while making maximal use of GPU computing resources. Extensive experimental results show that the AEMU-SAE method studied in this thesis performs well in recognition accuracy, task adaptability, and robustness against noise. These results demonstrate the potential and value, in practical applications, of research on underwater target recognition methods based on self-supervised acoustic representation learning. |
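
The dynamic queue dictionary described in the abstract can be pictured with a small sketch. The abstract gives no implementation details, so everything below is an assumption made for illustration: the class name `NegativeQueue`, the function `info_nce`, the embedding size, the queue capacity, and the temperature are all hypothetical. The loss shown is a generic InfoNCE-style contrastive loss, swapped in as a stand-in; it is not the adversarial loss function that the abstract mentions.

```python
import numpy as np


class NegativeQueue:
    """Fixed-size FIFO store of embeddings reused as negatives (hypothetical helper)."""

    def __init__(self, dim, capacity):
        self.buf = np.zeros((capacity, dim), dtype=np.float64)
        self.capacity = capacity
        self.size = 0  # number of valid rows currently stored
        self.ptr = 0   # next row to overwrite

    def enqueue(self, embeddings):
        # Once the queue is full, the oldest entries are overwritten first.
        for e in embeddings:
            self.buf[self.ptr] = e
            self.ptr = (self.ptr + 1) % self.capacity
            self.size = min(self.size + 1, self.capacity)

    def negatives(self):
        # Only the rows that have actually been written.
        return self.buf[: self.size]


def l2_normalize(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


def info_nce(query, positive, negatives, temperature=0.1):
    """InfoNCE-style loss: one positive per query, shared queued negatives (assumed stand-in)."""
    q, p, n = l2_normalize(query), l2_normalize(positive), l2_normalize(negatives)
    pos = np.sum(q * p, axis=1, keepdims=True)      # (B, 1) cosine similarity to the positive
    neg = q @ n.T                                   # (B, K) cosine similarity to each negative
    logits = np.concatenate([pos, neg], axis=1) / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits[:, 0] - np.log(np.exp(logits).sum(axis=1))
    return float(-log_prob.mean())


# Usage with random vectors standing in for acoustic embeddings.
rng = np.random.default_rng(0)
queue = NegativeQueue(dim=8, capacity=16)
queue.enqueue(rng.normal(size=(20, 8)))             # the 4 oldest entries get overwritten
batch = rng.normal(size=(4, 8))
augmented = batch + 0.01 * rng.normal(size=(4, 8))  # slightly perturbed "positive" views
loss = info_nce(batch, augmented, queue.negatives())
queue.enqueue(batch)                                # current batch becomes future negatives
```

Keeping negatives in a bounded queue decouples the number of negatives from the batch size, which is one plausible reading of how such a design trades negative sample mining against computational overhead.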