
Indexing, searching, and mining large-scale visual data via structured vector quantization

Posted on: 2015-03-14
Degree: Ph.D.
Type: Dissertation
University: The Florida State University
Candidate: Yuan, Jiangbo
Full Text: PDF
GTID: 1478390020951380
Subject: Computer Science
Abstract/Summary:
This dissertation is centered on indexing, searching, and mining methods for large-scale, high-dimensional visual data. Processing such data has long been acknowledged to be difficult, and the problem becomes more serious with "big data", which has shifted the focus of many problems in computational science. New approaches are urgently needed for processing huge collections of visual information, e.g., images and videos on the Internet.

The study first investigates the difficulties of similarity search in high-dimensional spaces and presents a new model of local intrinsic dimensionality that better fits similarity search problems such as nearest neighbor search. It then discusses the advantages and the problems of applying various structured vector quantization models to large-scale visual data processing. Although many structured vector quantization models exist, this study focuses on three families: product quantization (PQ), residual quantization (RQ), and tree-structured vector quantization (TSVQ).

The main contributions of this work fall into the following areas: 1) two novel methods are proposed to tackle a problem that has existed in RQ for decades, and they are used to improve residual k-means trees for scalable clustering and to optimize two of the most advanced approximate nearest neighbor (ANN) search systems; 2) a new inverted index is proposed for fast ANN search; 3) a systematic framework is proposed for repetition mining in long video streams; 4) a tree-embedding PQ model is proposed to improve the quality of PQ codes for ANN search. The experimental results show that the proposed methods are substantially better than existing solutions in terms of the trade-off among speed, memory usage, and accuracy.
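For readers unfamiliar with residual quantization, the following is a minimal sketch of the textbook idea only, not the methods proposed in the dissertation. It assumes plain k-means codebooks trained stage by stage on the residuals left by the previous stage. All function names (`kmeans`, `train_rq`, `rq_encode`, `rq_decode`) and the parameters (8 codewords, 2 stages, synthetic Gaussian data) are hypothetical choices made for illustration.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(codebook, v):
    """Index of the codeword in `codebook` closest to `v`."""
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i], v))

def kmeans(points, k, iters=15, seed=0):
    """Plain Lloyd's k-means; an assumed stand-in for any codebook trainer."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(centroids, p)].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster went empty
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

def train_rq(points, k, stages):
    """Train one codebook per stage, each on the residuals of the previous stage."""
    codebooks = []
    residuals = [list(p) for p in points]
    for s in range(stages):
        cb = kmeans(residuals, k, seed=s)
        codebooks.append(cb)
        residuals = [[x - c for x, c in zip(r, cb[nearest(cb, r)])]
                     for r in residuals]
    return codebooks

def rq_encode(codebooks, v):
    """Encode `v` as one codeword index per stage."""
    codes, residual = [], list(v)
    for cb in codebooks:
        i = nearest(cb, residual)
        codes.append(i)
        residual = [x - c for x, c in zip(residual, cb[i])]
    return codes

def rq_decode(codebooks, codes):
    """Reconstruct a vector as the sum of the selected codewords."""
    out = [0.0] * len(codebooks[0][0])
    for cb, i in zip(codebooks, codes):
        out = [o + c for o, c in zip(out, cb[i])]
    return out

def total_error(codebooks, points):
    """Sum of squared reconstruction errors over all points."""
    return sum(sq_dist(p, rq_decode(codebooks, rq_encode(codebooks, p)))
               for p in points)

# Synthetic 4-dimensional data, used only to exercise the sketch.
rng = random.Random(42)
data = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(200)]
codebooks = train_rq(data, k=8, stages=2)
err_one_stage = total_error(codebooks[:1], data)
err_two_stage = total_error(codebooks, data)
```

In this sketch each vector is stored as a short list of small integers (one per stage), and the second stage cannot increase the total reconstruction error on the training data because its centroids are the means of the first-stage residuals assigned to them.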
Keywords/Search Tags:Visual data, Search, Structured vector, Vector quantization, Large-scale, Mining, Methods, Proposed