Font Size: a A A

Fast Similar Image Search In Large Scale Image Database

Posted on:2016-05-16Degree:DoctorType:Dissertation
Country:ChinaCandidate:Y XiaFull Text:PDF
GTID:1108330473461625Subject:Control Science and Engineering
Abstract/Summary:PDF Full Text Request
Similar image search in large scale image database is a classic problem in computer vision, and it is also widely used in real-world applications. Given a query image, a sim-ilar image search system should retrieve images that have the same semantic meaning as the query. This is a changeling task especially when the image database is in large scale. This is because for large scale database, one has to face the problems coming from storage, efficiency, and accuracy. Among these challenges, the querying efficiency is always the key concern in a large scale image search system.In this thesis, we study the topic of fast similar image search in large scale image database. We first introduce some popular features for image representation, and then introduce our novel works from three perspectives.First, we propose an algorithm to build large scale dataset automatically. Large scale datasets are pre-requirements for the learning of popular deep learning features. We find that the reconstruction error in an auto-encoder is a good indicator for iden-tifying outliers. Based on this observation, we further make the reconstruction error more discriminative. Using our proposed algorithm, we can automatically remove the outliers in an unlabeled image set, and then the results can benefit the learning of deep neural networks.Second, we investigate the inverted index. We propose an algorithm named "joint inverted index", by which the image feature space is divided for multiple times and each division is complementary to the others. Using the joint inverted index, one can retrieve similar images from billion-scale database in several milliseconds, with higher accuracy than using existing methods.Third, we accelerate the re-ranking procedure of image features. When ranking image features, the state-of-the-art methods compact the feature vectors into hashing codes, with the consideration that hashing codes can provide fast hamming distance estimation and small storage. However, these method introduce a dense orthogonal matrix to project feature vectors, and this projection step could be very computational expensive for high-dimensional vectors. In this thesis, we propose to use a sparse pro-jection matrix and provide the algorithm to optimize it. With many zero elements in the projection matrix, the projection cost can be greatly reduced. Meanwhile, the sparse projections reduce the effective number of parameters involved in optimization, and thus could alleviate over-fitting.We run various experiments on a number of challenging datasets. The results verify the effectiveness of our algorithms. In addition, we implement a real similar image search system on a large scale image database. Qualitative experiments on this system validate our algorithms.
Keywords/Search Tags:similar image search, automatic outlier removal, joint inverted index, bi- nary encoding, sparse projections
PDF Full Text Request
Related items