| Along with the wide application of Internet technology and electronic devices, the number of images grows exponentially. It has become increasingly important that find the desired image for users from the mass of images quickly and accurately. For a long time, by the impact of text retrieval, most of the technology to solve this problem focuses on image annotation and according the match degree of key words and image annotation to retrieval the image, which is called text-based image retrieval. With the explosive growth of the number of images, the limitations of this approach are increasingly obvious. Annotating images by hand not only costs so much human labors, but also has subjectivity, which leads to the inaccuracy of the retrieval results.Text-based image retrieval has become increasingly unable to meet people’s needs, so it is proposed content-based image retrieval. The idea of this method is to characterize the image by extracting the image features, and use the feature to calculate the similarity between the images. There are too many factors affect the final appearance of an image, such as color, light, shadow, the people’s camera angle and so on. So in order to characterize an image accurately, people have designed a variety of feature extraction algorithms to consider different factors. However, the representation power of the hand-crafted features is limited and the effect is not satisfactory.With the proposed learning algorithm, people have started to focus on how to learn the feature directly from the original image. Further, convolutional neural networks has made great achievements in some problems including image classification because of its strong feature extraction capabilities, so we focus on how to apply the convolutional neural networks to the field of image retrieval and an image-ranking algorithm based on convolutional neural networks and spatial pyramid matching is proposed. In this method, the similarity between the image patches combines with the spatial information between the image patches as the measurement of the similarity between the images. Firstly, a similarity model based on convolutional neural networks needs to be trained to measure similarity between the image patches. Then using this convergence model, we can obtain a set of features that extracted from the image patches of an image. Finally, the matching degree between the feature sets of the images is calculated, which is used as a measurement of the similarity between the images. In this process, we adopt fast nearest neighbor search algorithm to ensure search accuracy and efficiency. Meanwhile, in order to consider the influence of the spatial information of the matched image patches, we design an algorithm, which is similar to the spatial pyramid matching, to combine the spatial position information of the matched image patches into the measurement of similarity between the images. The experimental results show that the algorithm we proposed has higher precision than the deep-classification model and the deep-ranking model, so the proposed method has a certain research and application value. |