| With the rapid growth of digital information, the databases of network platforms and social software now store tens of billions of images. Quickly and accurately retrieving the images a user needs from such massive collections is the central challenge of image retrieval. In recent years, content-based image retrieval has become a research hotspot in this field, and feature matching is its most important component. Traditional methods based on low-level image features such as color, texture, and shape have shown clear limitations as deep learning has developed. Deep-learning-based methods extract richer feature information, but highly similar descriptors from different local regions cause mismatches that reduce retrieval accuracy. Obtaining retrieval results both accurately and quickly is therefore the key problem this paper addresses. The main work of this paper is as follows: (1) The visual feature map of each convolutional layer is obtained through a semantic segmentation network. Because the last convolutional layer carries the strongest spatial and semantic information, the weight of each channel in its feature map is computed from gradient scores; the channels are then linearly fused by a weighted sum and normalized along the channel dimension, and the final heat map is obtained by bilinear interpolation. (2) An attention-based deep local feature extraction model is proposed. The model uses a residual network to extract dense local features, generates an attention score for each feature through an attention network, selects the feature points relevant to the retrieval object, and obtains the deep feature descriptors by principal component analysis (PCA). (3) Using the category information and heat values obtained from the semantic segmentation network, a multi-dimensional composite heat feature descriptor is constructed, and a k-dimensional tree (k-d tree) structure for this kind of descriptor is given. Based on this structure, feature matching is performed by combining best bin first (BBF) search with the random sample consensus (RANSAC) algorithm. Experiments are carried out on the Oxford5k and Paris6k datasets. The results show that the method better avoids the mismatches caused by highly similar descriptors in different regions. Compared with deep local features (DELF) and the D2-Net algorithm, both precision and time efficiency are improved. Compared with fine-tuned CNN, Dame web, and other methods, retrieval accuracy is improved by nearly 2%. The experimental results verify the effectiveness of the method. The paper contains 37 figures, 9 tables, and 70 references. |
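
The heat-map construction in contribution (1) can be illustrated with a minimal sketch. The abstract does not give the formulas, so the following are assumptions in the style of Grad-CAM: each channel weight is the spatial mean of that channel's gradient, negative responses are clipped before normalization, normalization is min-max scaling, and interpolation uses an align-corners mapping. The function names are hypothetical, and plain Python lists stand in for the tensors of a real segmentation network.

```python
def channel_weights(grads):
    """Assumed weighting: mean gradient score of each channel over all positions."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in grads]


def coarse_heat_map(feats, grads):
    """Weighted sum of channels (linear fusion), clipped at 0, min-max normalized."""
    weights = channel_weights(grads)
    h, w = len(feats[0]), len(feats[0][0])
    fused = [
        [max(0.0, sum(wt * ch[y][x] for wt, ch in zip(weights, feats)))
         for x in range(w)]
        for y in range(h)
    ]
    lo = min(min(row) for row in fused)
    hi = max(max(row) for row in fused)
    if hi == lo:  # flat map: nothing to highlight
        return [[0.0] * w for _ in range(h)]
    return [[(v - lo) / (hi - lo) for v in row] for row in fused]


def bilinear_resize(img, out_h, out_w):
    """Upsample a 2-D map to (out_h, out_w) by bilinear interpolation."""
    in_h, in_w = len(img), len(img[0])
    out = []
    for i in range(out_h):
        # Map the output row index back to a fractional input row.
        y = i * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(y)
        y1 = min(y0 + 1, in_h - 1)
        fy = y - y0
        row = []
        for j in range(out_w):
            x = j * (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(x)
            x1 = min(x0 + 1, in_w - 1)
            fx = x - x0
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


# Toy input: 2 channels of size 2x2 (hypothetical values, not from the paper).
feats = [[[1.0, 2.0], [3.0, 4.0]], [[4.0, 3.0], [2.0, 1.0]]]
grads = [[[1.0, 1.0], [1.0, 1.0]], [[0.0, 0.0], [0.0, 0.0]]]
heat = bilinear_resize(coarse_heat_map(feats, grads), 3, 3)
```

In this toy input only the first channel receives a nonzero gradient, so the fused map follows that channel alone. In practice the feature maps and gradients would come from the last convolutional layer of the segmentation network, and the output size would be the input image resolution.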