Local Visual Information Based Large-Scale Image Retrieval

Posted on: 2016-07-01
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z Liu
GTID: 1228330470958013
Subject: Signal and Information Processing
Abstract/Summary:
With the rapid development of computer science, electronic communication, and multimedia technology, people can easily obtain information and share it with other users on the Internet. The proportion of image data on the Internet is growing continuously, which has attracted considerable attention from both academia and industry. Extracting useful information from this massive volume of multimedia data requires advanced information retrieval techniques. One important problem is how to find duplicate or partial-duplicate images among such a huge number of web images. Solving it requires both algorithms that represent visual information well and efficient indexing techniques.

In a content-based image retrieval system, given a query image, the system returns a ranked list of the images in the database. The ranking is computed from the similarity between the query and each database image, which depends on the number of matched local features. Matched features are similar image patches that share the same visual content. The most direct way to find them is to compute the Euclidean distance between the descriptors of two features. However, this is impractical for a large-scale database because the descriptors are usually high-dimensional, for example the 128-D SIFT descriptor. In this thesis, we focus on large-scale image retrieval based on local visual information, covering a contextual descriptor, flexible SIFT coding, cross-indexing, local visual information fusion, and compact image representation. We introduce each briefly below.

Firstly, we propose a novel algorithm that describes the contextual information of local features based on the spatial relationships between the local features in a single image. There are two kinds of spatial relationships: the multimode relationship and the co-occurrence relationship. The multimode relationship means that several local features are located at the same spatial position but have different scales or orientations. The co-occurrence relationship means that local features occur in the same image but at different spatial positions. We design a binary descriptor that encodes these two kinds of context to enhance the discriminability of local features.

Secondly, we propose a flexible SIFT binarization algorithm that encodes a SIFT feature into a binary code, and a cross-indexing algorithm that improves the recall of query features. In a traditional retrieval system, visual features are quantized with a visual codebook that is trained offline by an unsupervised clustering method. However, because the size of a codebook trained by clustering is limited, the visual features are quantized coarsely. To avoid this drawback, we propose a SIFT binarization algorithm that quantizes a SIFT feature into a distance-preserving binary code. Moreover, to improve the recall of binary query features, we design a novel indexing structure, the cross-indexing structure.

Thirdly, considering the complexity of traditional retrieval systems, we propose to process visual features in a batch mode, which we call uniting keypoints. Since thousands of local features can be extracted from a single high-resolution image, the number of local features in a large database is huge. If the local features are processed independently, the retrieval system incurs very high computational complexity. To address this problem, we reorganize the local features of an image into a dozen feature groups and index these groups. To make the retrieval system even faster, we also propose a dichotomizing search approach.

Fourthly, we propose a method to build a global image representation from local visual features. How to represent the visual content of an image as a vector is one of the fundamental problems in multimedia and computer vision. A global image representation constructed from local visual features inherits their robustness to occlusion, translation, scaling, and rotation. We elaborate on how the residual vectors between local features and visual words affect the constructed global image representation.

In a nutshell, this thesis explores and discusses techniques for near-duplicate image retrieval. We propose several novel algorithms to tackle existing problems in image retrieval, including a contextual descriptor, flexible SIFT binarization, cross-indexing, and compact image representation. Comprehensive experiments demonstrate the effectiveness and efficiency of the proposed algorithms.
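
To make the notion of residual vectors between local features and visual words more concrete, the following is a minimal sketch of one common way to pool such residuals into a single image vector. It uses a generic VLAD-style aggregation as a stand-in; the abstract does not specify the thesis's actual construction, so this should not be read as the author's method. The function names (`nearest_word`, `aggregate_residuals`) and the toy codebook are hypothetical and chosen only for illustration.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def nearest_word(descriptor: Vector, codebook: Sequence[Vector]) -> int:
    """Return the index of the visual word closest to the descriptor
    (squared Euclidean distance, brute force over the codebook)."""
    best_index, best_dist = 0, math.inf
    for index, word in enumerate(codebook):
        dist = sum((a - b) ** 2 for a, b in zip(descriptor, word))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def aggregate_residuals(descriptors: Sequence[Vector],
                        codebook: Sequence[Vector]) -> List[float]:
    """Assumed VLAD-style pooling, not the thesis's algorithm:
    sum the residual (descriptor - word) per visual word, concatenate
    the per-word sums, and L2-normalize the result."""
    dim = len(codebook[0])
    pooled = [[0.0] * dim for _ in codebook]
    for descriptor in descriptors:
        index = nearest_word(descriptor, codebook)
        word = codebook[index]
        for j in range(dim):
            pooled[index][j] += descriptor[j] - word[j]
    # Concatenate the per-word residual sums into one global vector.
    flat = [value for row in pooled for value in row]
    norm = math.sqrt(sum(v * v for v in flat))
    # Leave an all-zero vector untouched to avoid dividing by zero.
    return [v / norm for v in flat] if norm > 0 else flat
```

For instance, with a toy two-word codebook `[[0.0, 0.0], [10.0, 10.0]]` and descriptors `[[1.0, 0.0], [0.0, 1.0], [10.0, 12.0]]`, the first two descriptors fall on the first word and the third on the second, so the pooled residuals before normalization are `[1, 1, 0, 2]`. Real descriptors such as 128-D SIFT and a trained codebook would be handled the same way, only with larger dimensions.
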
Keywords/Search Tags: Image Retrieval, Contextual Descriptor, Visual Features, Binarization, Inverted Indexing, Cross-indexing, Uniting Keypoints, Dichotomizing Search, Compact Image Representation