This paper studies a classic problem in image retrieval, which has important applications in computer vision research and in the retrieval and recommendation systems of Internet companies. We compare recent domestic and international work and propose an improved deep-learning-based method with two contributions. First, we propose a new loss for image retrieval. Second, we build a new network architecture for extracting image descriptors. The principal idea behind the first contribution is to incorporate query expansion into network training, which yields our new loss. Query expansion was first applied successfully in text retrieval, and later work applied it to image retrieval and also obtained better results. Traditional methods adopt the triplet loss or the contrastive loss, both of which depend heavily on how the training tuples are selected; our goal is to design a loss that requires no tuples. In the second contribution, we use a multi-GeM-pooling structure, since GeM pooling can significantly improve retrieval accuracy. Most previous methods employ a linear structure, so image details are increasingly lost as the network gets deeper. In contrast, we extract feature maps from different layers of the network, convert each feature map into an image descriptor by GeM pooling, and fuse these descriptors into one, thereby preserving image details. Following the experiments of Chum et al. (2018), we adopt retrieval-SfM-120k as the training set and Oxford5k and Paris6k as the test sets. Equipped with the improved triplet loss and the fused feature vectors, our method outperforms most state-of-the-art methods.
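The multi-layer GeM pooling idea can be illustrated with a minimal sketch in plain Python. GeM (generalized-mean) pooling reduces each channel of a feature map to one value, ((1/N) * sum of x^p)^(1/p), which equals average pooling at p = 1 and approaches max pooling as p grows. The function names, the default p = 3, the clamping constant, and the choice to fuse by concatenating L2-normalized descriptors are all assumptions made for illustration; the abstract does not specify how the descriptors are fused.

```python
import math


def gem_pool(feature_map, p=3.0, eps=1e-6):
    """Generalized-mean pooling over one feature map.

    feature_map: list of channels, each a 2-D list (H x W) of activations.
    Returns one pooled value per channel.
    """
    descriptor = []
    for channel in feature_map:
        # Clamp to a small positive value so x ** p is well defined (assumption).
        values = [max(v, eps) for row in channel for v in row]
        mean_pow = sum(v ** p for v in values) / len(values)
        descriptor.append(mean_pow ** (1.0 / p))
    return descriptor


def l2_normalize(vec, eps=1e-12):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / (norm + eps) for v in vec]


def fuse_descriptors(feature_maps, p=3.0):
    """Pool feature maps taken from several layers and fuse them.

    Fusion here is concatenation of per-layer L2-normalized descriptors,
    followed by a final L2 normalization (a hypothetical choice).
    """
    fused = []
    for fmap in feature_maps:
        fused.extend(l2_normalize(gem_pool(fmap, p)))
    return l2_normalize(fused)


# Toy feature maps standing in for a shallow layer (2 channels, 2x2)
# and a deep layer (3 channels, 1x1).
shallow = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0], [0.0, 8.0]]]
deep = [[[2.0]], [[1.0]], [[2.0]]]
descriptor = fuse_descriptors([shallow, deep])
```

The fused descriptor has one entry per channel across all selected layers, so shallow-layer detail contributes directly to the final vector instead of being available only through the deepest layer.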