| With the explosive growth of image data on the Internet, accurately and quickly retrieving images of interest to users from massive image collections has become a research hotspot in computer vision. In recent years, deep metric learning methods have achieved good results in image retrieval tasks, but they still suffer from insufficient feature discrimination, difficulty in convergence, and time-consuming training. Based on the framework of deep metric learning for image retrieval, this thesis studies image retrieval algorithms from two perspectives: the feature extraction model and the loss function. Whether the extracted image features have sufficient representation ability directly determines retrieval accuracy. In this thesis, a deep feature extraction network based on the deep residual network is designed, and several improvements are made to the network structure for image retrieval tasks, including reducing the stride of the last convolution layer, concatenating features from different layers, fusing global average pooling and global max pooling, and reducing the feature dimensionality. These changes give the network stronger feature expression ability and thereby improve retrieval accuracy. The pair-based loss functions that are widely used in metric learning need to construct a large number of sample pairs, which results in a large amount of computation and difficulty in convergence during training. In this thesis, a clustering-based triplet loss function is designed. According to the degree of dispersion of the samples within each class, one or two cluster centers are dynamically allocated to that class, and triplets are then constructed in the form (anchor, positive sample, negative cluster center), which reduces the amount of computation in the training phase and improves the feature discrimination ability of the model. In addition, a cross-batch hard example mining strategy that stores cluster centers is designed, so that the model can perceive the global distribution of all categories in the training set in the embedding space, which further improves retrieval accuracy. At the end of this thesis, two retrieval optimization strategies are designed for different application scenarios of image retrieval. For speed-prioritized image retrieval tasks, a retrieval acceleration strategy based on clustering is designed: a clustering algorithm is applied to the image gallery to reduce the number of features in the gallery, thereby speeding up retrieval. For accuracy-prioritized image retrieval tasks, a retrieval optimization strategy based on test-time augmentation is designed: augmentation methods such as horizontal flipping, rotation, and brightness adjustment are used to reduce the difference between the query image and the images in the gallery, thereby improving retrieval accuracy. This thesis conducts experiments on four public image retrieval datasets: CUB200-2011, In-Shop, Consumer-to-Shop, and SOP. Experimental results show that the feature extraction model in this thesis can extract features with good invariance, discrimination, and abstraction from images, which makes it well suited to image retrieval. The clustering-based triplet loss and cross-batch hard example mining lead the feature extraction model to learn better feature embeddings and achieve higher accuracy than existing methods. In addition, the cluster-based retrieval acceleration designed in this thesis can greatly speed up retrieval with only a small loss of retrieval accuracy, and the retrieval optimization based on test-time augmentation can effectively improve retrieval accuracy. |
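
The triplet form (anchor, positive sample, negative cluster center) described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so the details here are assumptions: a hinge-style triplet loss, Euclidean distance, a margin of 0.2, and the nearest other-class center as the negative. The function name `center_triplet_loss` and all of its parameters are hypothetical and not taken from the thesis.

```python
import numpy as np

def center_triplet_loss(anchor, positive, centers, center_labels,
                        anchor_label, margin=0.2):
    """Hinge triplet loss that uses a negative cluster center
    in place of a negative sample (illustrative sketch only)."""
    # Distance from the anchor to its positive sample.
    d_pos = np.linalg.norm(anchor - positive)
    # Keep only the cluster centers that belong to other classes.
    neg_centers = centers[center_labels != anchor_label]
    # Assumption: the closest negative center acts as the hard negative.
    d_neg = np.linalg.norm(neg_centers - anchor, axis=1).min()
    # Penalize the anchor when the positive is not closer than the
    # negative center by at least the margin.
    return max(0.0, d_pos - d_neg + margin)

anchor = np.array([0.0, 0.0])
positive = np.array([0.3, 0.4])      # 0.5 away from the anchor
centers = np.array([[0.1, 0.0],      # class 0 center (same class, ignored)
                    [0.6, 0.0],      # class 1 center, 0.6 away
                    [3.0, 4.0]])     # class 2 center, 5.0 away
center_labels = np.array([0, 1, 2])

loss = center_triplet_loss(anchor, positive, centers, center_labels,
                           anchor_label=0)
print(loss)  # approximately 0.1 (0.5 - 0.6 + 0.2)
```

Because each anchor is compared against a handful of cluster centers rather than against every negative sample, the number of distance computations per anchor grows with the number of centers instead of the number of samples, which is consistent with the reduced training cost that the abstract reports.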