The rapid development of computer technology and hardware has steadily increased the amount of information that can be mined from digital images, and image processing and its applications have attracted growing attention. In applications such as remote sensing image retrieval, crop seed classification and detection, and road crack detection in particular, accurate analysis of the information contained in an image is of great help to users. With these advances, researchers are trying to teach computers "how to understand the images they see", a skill unique to humans for understanding the world. This gave rise to the concept of computer vision. In computer vision tasks the image must be analyzed, and the most essential step is extracting feature vectors from the samples. This paper starts from two application fields of computer vision, remote sensing image retrieval and pavement crack detection, and improves on existing methods to raise the efficiency and accuracy of vision tasks.

The research content of this paper is as follows:

1. To address the large differences in sample similarity across categories, a deep metric learning method based on Similarity Retention Loss (SRL) is proposed. Existing metric learning methods are improved in terms of sample mining, network model structure, and the metric loss function. For sample mining, hard samples and easy samples are redefined, and appropriate positive and negative samples are mined according to the size of the sample set and the spatial distribution of the classes in the dataset. At the same time, the concept of similarity retention loss is proposed. Different learning coefficients are assigned to the selected hard samples according to the number of hard and easy samples in the category, so that the spatial structure of samples of the same category is learned. Negative samples are given different coefficients according to the spatial distribution of samples within a uniform range in space, so as to preserve the spatial structure of similarity.

2. To address the difficulty of distinguishing samples when inter-class differences are small during feature extraction, a Global-aware Ranking Loss (GRL) model is proposed. It combines the two design ideas for loss functions in deep metric learning (i.e., structural loss and result-based loss) and is a global optimization model based on the feature space and the retrieval candidate list. It is further argued that, for each class in each dataset, the similarity between samples within the class differs, so the samples to be learned and their number should also differ. Accordingly, Intra-class Space Sample Mining (ISSM) is proposed, which selects dislocated samples according to the sample distribution instead of manually setting a boundary threshold for sample mining, as is usually done.

3. For a more difficult computer vision task, namely automatically extracting and distinguishing cracks from complex backgrounds, an Octave U-Net model combined with a residual attention mechanism is proposed, so that the task remains feasible when foreground and background are imbalanced. The model combines Octave convolution and Octave transposed convolution, and residual attention modules are selected according to the characteristics of the network to help the model capture different types of image features. At the same time, this prevents the U-Net from losing underlying semantic information as network depth increases. An improved weighted cross-entropy loss is also proposed, which is stable and addresses the class imbalance problem.
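
The abstract does not give the form of the improved weighted cross-entropy loss. As a point of reference only, the sketch below shows a standard inverse-frequency weighted binary cross-entropy of the kind such losses commonly build on. The function names `class_weights` and `weighted_bce`, the weighting scheme, and the clamping constant `eps` are illustrative assumptions, not the loss proposed in this paper.

```python
import math


def class_weights(labels):
    """Inverse-frequency weights (w_pos, w_neg) for a flat list of 0/1 labels.

    Assumed scheme: weight = n / (2 * class_count), so that both classes
    contribute equally to the total loss regardless of how rare one is.
    """
    n = len(labels)
    n_pos = sum(labels)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        # Only one class present: fall back to uniform weights.
        return 1.0, 1.0
    return n / (2.0 * n_pos), n / (2.0 * n_neg)


def weighted_bce(probs, labels, eps=1e-7):
    """Mean weighted binary cross-entropy over pixels.

    probs  : predicted crack probabilities in [0, 1]
    labels : ground-truth labels, 1 = crack (foreground), 0 = background
    eps    : clamp so that log() never receives 0 (numerical stability)
    """
    w_pos, w_neg = class_weights(labels)
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)
        if y == 1:
            total -= w_pos * math.log(p)
        else:
            total -= w_neg * math.log(1.0 - p)
    return total / len(labels)


# One crack pixel among three background pixels.
labels = [1, 0, 0, 0]
missed_crack = weighted_bce([0.1, 0.1, 0.1, 0.1], labels)
found_crack = weighted_bce([0.9, 0.1, 0.1, 0.1], labels)
print(missed_crack > found_crack)
```

With these weights, the single crack pixel carries as much total weight as the three background pixels together, so a model cannot reach a low loss simply by predicting background everywhere.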