| Visual tasks such as scene perception are prerequisites for UAVs and other unmanned equipment to complete difficult tasks effectively. The performance of basic tasks such as object detection directly affects the working efficiency of unmanned equipment. Because the water surface environment is complex, images are often captured under poor lighting, which makes water surface targets difficult to detect. In addition, the camera is likely to introduce noise during capture, and the performance of general-purpose target detection algorithms is limited under such environmental or noise interference. Furthermore, high-quality water surface target samples are difficult to obtain: the datasets currently available for water surface target detection research are small and suffer from uneven sample distribution. High-quality labels are also very difficult to obtain; even a large investment of human and financial resources cannot fully guarantee label quality, and the final detection performance depends on both label accuracy and the feature extraction ability of the network. In terms of hardware, the computing equipment that industrial unmanned ships can carry is limited and cannot handle heavy computational workloads, so effectively reducing the number of parameters and the amount of computation of the neural network is also one of the goals of water surface target detection. To address these problems, this paper aims to first eliminate the impact of abnormal lighting and camera noise on water surface images, then make full use of unlabeled data to improve the feature extraction ability of the water surface target detection network, use a small amount of labeled data to fine-tune the network for better detection 
performance, and use effective model compression to simplify large networks so that they meet the detection and computation requirements of mobile platforms such as unmanned ships. The main research content of this paper is as follows: (1) To address the problem that water surface targets are strongly affected by lighting and by random camera noise, this paper redesigns an image restoration network that combines an autoencoder with a generative adversarial network (GAN). It comprises two stages: reconstruction of the damaged image by the autoencoder, and verification of the generated image by the GAN. The autoencoder uses an "Encoder + Decoder" structure to reconstruct the damaged low-quality image, the "discriminator" of the GAN verifies the reconstructed image, and high-quality images serve as the training targets, so that the training of the entire network reaches a Nash equilibrium in the macro sense. This method restores low-quality images while taking into account features from relatively distant locations in the image. The quality of the restored images is evaluated by comparing the changes in three metrics during restoration: mean squared error (MSE), peak signal-to-noise ratio (PSNR), and structural similarity (SSIM). The improved image quality lays the foundation for the subsequent improvement of the water surface target detection model. (2) To address the problem that water surface target datasets cannot be labeled to a high standard in large quantities, this paper proposes a water surface target detection model based on self-supervised learning. It comprises two stages: self-supervised training on unlabeled data for feature extraction, and supervised training on different amounts of labeled data. Building on the restored water surface images, the focus of the 
self-supervised learning network for the downstream task of water surface target detection is to use a large amount of pseudo-label information to attend to the local features of the foreground, and to use a special negative-sample construction method that helps the network correctly understand the scene information in the image during training. This effectively resolves the entanglement between foreground and background and lays a solid foundation for high-quality bounding-box localization and category prediction of water surface targets. Parameter fine-tuning uses a very small amount of labeled data to train the category prediction and bounding-box regression of the overall network, so that the network detects targets accurately and reaches the detection level of supervised learning while reducing the cost of manual labeling. The final detection performance is: precision 84.39%, recall 91.79%, and mAP@0.5 93.89%, surpassing the supervised network without pre-training on all three metrics. (3) To address the insufficient computing power and limited memory of mobile platforms such as unmanned ships, this study adopts a knowledge distillation model compression algorithm for the two-stage water surface target detection network: the Task Adaptive Regularization Knowledge Distillation algorithm (TARD). It distills knowledge at three positions: the backbone feature extraction network, the category prediction head, and the bounding box regression head. The method reduces the number of parameters and the amount of computation by 51.95% and 22.22% respectively, while the three performance metrics of detection precision, recall, and mAP@0.5 increase by 3.54%, 6.40%, and 5.65% respectively. The inference speed of the trained network increases by 48.16%, which improves the performance and training efficiency of the simple, small water surface target detection network. |
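
Two of the restoration metrics named in item (1), mean squared error and peak signal-to-noise ratio, have standard definitions, and the following is a minimal sketch of how they might be computed. It assumes 8-bit images flattened to lists of pixel values; the function names `mse` and `psnr` and the `max_value` parameter are illustrative choices and are not taken from the paper. Structural similarity is omitted because it needs windowed statistics that do not fit a short example.

```python
import math


def mse(reference, restored):
    """Mean squared error between two equally sized pixel sequences."""
    if len(reference) != len(restored) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((r - s) ** 2 for r, s in zip(reference, restored)) / len(reference)


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB; assumes pixel values lie in [0, max_value]."""
    error = mse(reference, restored)
    if error == 0:
        # Identical images: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / error)


if __name__ == "__main__":
    clean = [52, 60, 61, 200]      # hypothetical reference pixels
    restored = [50, 63, 61, 198]   # hypothetical restored pixels
    print(f"MSE  = {mse(clean, restored):.2f}")
    print(f"PSNR = {psnr(clean, restored):.2f} dB")
```

A lower MSE and a higher PSNR against the high-quality reference indicate a better restoration, which is the direction of change the evaluation in item (1) would look for.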