
Compact Aggregated Descriptors For Mobile Visual Search

Posted on: 2015-01-08  Degree: Doctor  Type: Dissertation
Country: China  Candidate: J Lin  Full Text: PDF
GTID: 1488304322950739  Subject: Computer application technology
Abstract/Summary:
Thanks to the rise of high-resolution cameras, powerful CPUs, and 3G connectivity, smartphones and tablet PCs have shown great potential for mobile visual search (MVS). MVS poses a unique set of challenges: poor retrieval performance over large-scale datasets, query delivery latency over slow and unstable wireless networks, and limited computation and memory resources on mobile clients for feature extraction. To address these issues, the Moving Picture Experts Group (MPEG) started an international standard, still ongoing, named Compact Descriptors for Visual Search (CDVS). Specifically, compact descriptors are required to offer compactness, discriminability, scalability, and low complexity.

Based on the evaluation framework of MPEG CDVS, this thesis focuses on locally aggregated image-level descriptors and makes three contributions:

1. We propose a selective aggregated image-level descriptor based on interest point selection. This method employs a likelihood ratio test to predict the reliability of local features from the statistics of interest point characteristics (such as scale and orientation), then selects the local features with high reliability scores for subsequent aggregation. Selective aggregation (SA) removes noisy local features and enhances the discriminative power of the aggregated descriptors. The results show that our approach not only significantly improves retrieval accuracy but is also robust to JPEG compression artifacts.

2. We propose a zero-order locally aggregated compact Bag-of-Words (BoW) descriptor based on multi-codebook learning. The main idea is to reduce both the quantization error and the memory footprint by learning multiple small codebooks (i.e., multiple partitions of the feature space), then using the learned multi-codebook to generate an image-specific codebook for each image. This leads to scalable yet compact BoW descriptors. The results show that our approach not only yields low bit rate scalable descriptors and a low memory cost for the codebooks, but also achieves better retrieval accuracy than the state of the art.

3. We propose a rate-distortion optimized high-order locally aggregated compact descriptor, namely Rate-adaptive Compact Fisher Codes (RCFC). RCFC produces a bit rate scalable vector representation that adapts to bandwidth fluctuation in wireless environments. In particular, RCFC supports fast Hamming distance computation between variable-length descriptors while keeping a low memory footprint. Extensive evaluation over benchmark databases shows that RCFC significantly outperforms the state of the art. It is worth noting that our method has been adopted as a key technology in the Committee Draft of the ongoing MPEG CDVS standard.
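
The abstract does not spell out how Hamming distances are computed between variable-length descriptors. The Python sketch below is one illustration of the general idea, not the RCFC procedure itself. It assumes that each descriptor is a byte string, that a shorter descriptor uses the same bit layout as the leading part of a longer one, and that comparison is restricted to the shared prefix. The function names hamming_prefix_distance and normalized_distance are hypothetical.

```python
from typing import Tuple


def hamming_prefix_distance(a: bytes, b: bytes) -> Tuple[int, int]:
    """Count differing bits over the shared prefix of two binary codes.

    Assumption (not from the thesis): both codes use the same bit layout,
    so the first min(len(a), len(b)) bytes are directly comparable.
    Returns (differing_bits, compared_bits).
    """
    n = min(len(a), len(b))
    # XOR each byte pair and count the set bits in the result.
    differing = sum(bin(x ^ y).count("1") for x, y in zip(a[:n], b[:n]))
    return differing, n * 8


def normalized_distance(a: bytes, b: bytes) -> float:
    """Divide by the number of compared bits so scores from descriptors
    of different lengths stay on the same scale."""
    differing, compared = hamming_prefix_distance(a, b)
    return differing / compared if compared else 0.0


# A 2-byte query code compared against a 3-byte reference code.
query = bytes([0b10110000, 0b11110000])
reference = bytes([0b10100000, 0b11111111, 0b00001111])
print(hamming_prefix_distance(query, reference))
print(normalized_distance(query, reference))
```

Normalizing by the number of compared bits is a design choice of this sketch: it lets a short query sent over a slow link be ranked against longer database descriptors without penalizing the length mismatch.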
Keywords/Search Tags: Mobile Visual Search, SIFT, Aggregated Descriptors, MPEG, Compact Descriptors, Selective Aggregation, Bag-of-Words, Visual Vocabulary, Fisher Vector, Scalar Quantization