
Research On Indexing And Retrieval Techniques In Large-Scale Image Database

Posted on: 2004-05-29
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H J Ye
Full Text: PDF
GTID: 1118360122967315
Subject: Computer application technology
Abstract/Summary:
Relevance feedback techniques and high-dimensional indexing schemes are two significant research issues for content-based image retrieval in large-scale image databases. Relevance feedback is an important means of narrowing the semantic gap between high-level concepts and low-level features in image retrieval, while efficient indexing schemes for high-dimensional data are required for real-time retrieval in large-scale image databases.

Accurate estimation of the data distribution and efficient partitioning of the data space are the key problems in high-dimensional indexing. In this dissertation, an indexing scheme based on vector quantization (VQ) is proposed for exact nearest neighbor (NN) search. The VQ-based approach assumes a Gaussian mixture distribution, which fits real-world image data reasonably well. After estimating this distribution with the EM method, the approach trains optimized vector quantizers to partition the data space. Experiments on a large real-world dataset demonstrate a remarkable reduction in the number of vectors accessed during exact NN search compared with existing indexing schemes.

Building on the VQ-based indexing method, a hierarchical indexing scheme is proposed for higher performance. This approach integrates VQ-based indexing structures with approximate NN search and performs probabilistic approximate NN searches on approximate vectors. Experiments show that the hierarchical indexing scheme outperforms both the original VQ-based indexing method and probabilistic approximate NN search. Both proposed approaches support the quadratic-form distance metric and can be integrated with relevance feedback techniques in practical large-scale image retrieval systems.

Extracting features suited to representing users' query concepts from small and asymmetric sets of feedback samples is a key problem for relevance feedback techniques. In this dissertation, a relevance feedback method using feature subspace analysis (FSA) is proposed for content-based image retrieval. This approach treats relevance feedback as a two-class classification problem. Both the distance in feature subspace and the distance from feature subspace are employed, and a two-stage discriminant analysis method is adopted to form the decision rule for image retrieval. The feature criteria employed account for the asymmetry between positive and negative samples and represent users' query concepts reasonably well. Experiments on a large database of real-world images demonstrate that the FSA approach achieves more stable and precise retrieval results than existing linear approaches.

Linear approaches do not fit the data distribution of real image databases very well. A novel approach, the binary component discriminant function (BCDF), is proposed by generalizing the original quadratic-form distance metric. The BCDF approach extracts features nonlinearly using a scatter criterion and a distance criterion. Experiments show that the BCDF approach outperforms linear approaches and is computationally efficient.

The kernel function method is a general approach to handling nonlinearity, but existing kernel-based relevance feedback methods do not handle the asymmetry between positive and negative samples very well. A novel approach named hyper-sphere support vector classification (HS-SVC) is proposed to address this sample asymmetry. The method learns a classifier from positive and negative samples while taking the asymmetry between them into account. Experiments show a notable precision improvement over other kernel-based methods.
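
To illustrate how partitioning the data space can reduce the number of vectors accessed in an exact NN search, here is a minimal sketch. It is not the dissertation's method: it swaps in plain k-means for the EM-estimated Gaussian mixture and the trained vector quantizers, uses Euclidean distance instead of the quadratic-form metric, and prunes cells with a simple triangle-inequality bound. The names `build_index` and `exact_nn` and the synthetic data are hypothetical.

```python
import numpy as np

def build_index(data, k, iters=20, seed=0):
    """Partition data into cells with plain k-means (assumed stand-in for VQ)."""
    rng = np.random.default_rng(seed)
    centroids = data[rng.choice(len(data), size=k, replace=False)].copy()
    for _ in range(iters):
        # Assign every vector to its nearest centroid, then recompute centroids.
        dists = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = data[labels == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
    # Final assignment against the final centroids, so the radii are consistent.
    dists = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    cells = []
    for j in range(k):
        ids = np.flatnonzero(labels == j)
        if len(ids):
            radius = dists[ids, j].max()  # farthest member from the centroid
            cells.append((centroids[j], radius, ids))
    return cells

def exact_nn(query, data, cells):
    """Exact NN search that skips cells which cannot contain a closer vector."""
    # Triangle inequality: every member x of a cell satisfies
    # d(query, x) >= d(query, centroid) - radius.
    bounds = [max(0.0, np.linalg.norm(query - c) - r) for c, r, _ in cells]
    best_id, best_dist, accessed = -1, np.inf, 0
    for i in np.argsort(bounds):
        if bounds[i] >= best_dist:
            break  # all remaining cells have bounds at least this large
        ids = cells[i][2]
        d = np.linalg.norm(data[ids] - query, axis=1)
        accessed += len(ids)
        j = d.argmin()
        if d[j] < best_dist:
            best_id, best_dist = int(ids[j]), float(d[j])
    return best_id, best_dist, accessed

# Synthetic clustered data (an assumption made purely for illustration).
rng = np.random.default_rng(1)
centers = rng.normal(scale=10.0, size=(8, 16))
data = np.vstack([c + rng.normal(size=(250, 16)) for c in centers])
cells = build_index(data, k=8)
query = data[0] + 0.1 * rng.normal(size=16)

nn_id, nn_dist, accessed = exact_nn(query, data, cells)
brute_dist = float(np.linalg.norm(data - query, axis=1).min())
print(nn_dist, brute_dist, accessed, len(data))
```

The search stays exact because a cell is skipped only when its lower bound already reaches the best distance found so far. How many vectors are accessed depends on how well the cells match the data distribution, which is why the abstract emphasizes accurate distribution estimation.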
Keywords/Search Tags: content-based image retrieval, relevance feedback, indexing scheme, machine learning, vector quantization