Scene Image Invariant Feature Extraction And Classification Methods

Posted on: 2014-10-08
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Q Li
GTID: 1268330401463076
Subject: Signal and Information Processing
Abstract/Summary:
Image classification is a fundamental problem in computer vision and has attracted a lot of attention in recent years. Current research converges on the bag-of-words (BoW) representation combined with spatial pyramid matching (SPM). This scheme provides an effective way of capturing image statistics for natural scene classification and yields state-of-the-art performance. The BoW model is a simplifying assumption used in natural language processing and information retrieval. In this model, a text (such as a sentence or a document) is represented as an unordered collection of words, disregarding grammar and even word order. Computer vision researchers use a similar idea for image representation (here an image may depict a particular object, such as a car). For example, an image can be treated as a document, and the features extracted from the image are treated as the "words". The BoW representation serves as the basic element for further processing, such as object categorization. The key idea is to quantize each extracted keypoint into one of the visual words, which together form the visual codebook, and then represent each image by a histogram of the visual words. For this purpose, a clustering algorithm (e.g., K-means) is generally used to generate the visual words. A number of studies have shown encouraging results for the BoW representation in object categorization. Based on the BoW model, this thesis presents research on the representation of invariant image features and on scene image categorization methods.

Currently, codebooks are typically created from a set of training images using a clustering algorithm. However, these codebooks are often functionally limited due to redundancy. We use the recently proposed statistic of word activation forces (WAFs) to reduce the redundancy in the codebook used in the BoW model. The experimental results show that WAFs can remove the redundancy efficiently.
In this way, the representation of image features is improved.

In addition, we propose a method that uses inverse document frequency (IDF) to optimize BoW-based image features, which we call Soft-IDF. Given the visual words and the dataset, each visual word appears in a different number of images and a different number of times within each image. Some visual words occur rarely while others occur frequently; the proposed method balances this disparity. Experiments show encouraging results in scene categorization with the proposed approach.

A reference-based algorithm for scene image categorization is presented in this thesis. In addition to using a reference-set for image representation, we also associate the reference-set with the training data in sparse codes during the dictionary learning process. The reference-set is combined with the reconstruction error to form a unified objective function. The optimal solution is efficiently obtained using the K-SVD algorithm. After the dictionaries are constructed, Locality-constrained Linear Coding (LLC) features of the images are extracted. Then, we represent each image feature vector using the similarities between the image and the reference-set, leading to a significant reduction of the dimensionality of the feature space. Experimental results demonstrate that the reference-based algorithm achieves outstanding performance.

The reference-based image classification approach introduces a reference-set for both image representation and dictionary learning. It significantly reduces the dimensionality of the represented images and shows outstanding performance even with randomly selected reference images and a simple distance measure. In this thesis, we improve upon existing work with two major contributions. First, we show that a more representative reference-set contributes to better classification accuracy. To this end, we carefully adapt the K-means clustering algorithm in the feature space to select a distinguished reference-set. Second, in the image classification process, we propose to represent each image by measuring its betweenness centrality in a social network composed of the representative reference-set in each class, leading to a more coherent distance measure that considers the overall connectivity between the probe image and the reference-set. Extensive experimental results demonstrate that our proposed scheme achieves better performance than existing methods.
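
The BoW pipeline summarized in the abstract (quantize local descriptors against a codebook, build a per-image histogram, then reweight by IDF) can be illustrated with a minimal sketch. It is not the thesis implementation: the codebook is assumed to be given (for example, from K-means), the toy 2-D descriptors and all function names are hypothetical, and the weighting shown is the standard IDF, log(N / n_k), rather than the Soft-IDF variant, whose definition is not given here.

```python
import math


def nearest_word(descriptor, codebook):
    """Index of the codebook entry closest to the descriptor (squared Euclidean)."""
    best_index, best_dist = 0, float("inf")
    for index, word in enumerate(codebook):
        dist = sum((d - w) ** 2 for d, w in zip(descriptor, word))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def bow_histogram(descriptors, codebook):
    """Count how many descriptors are assigned to each visual word."""
    hist = [0] * len(codebook)
    for descriptor in descriptors:
        hist[nearest_word(descriptor, codebook)] += 1
    return hist


def idf_weights(histograms):
    """Standard IDF per visual word: log(N / n_k), where n_k is the number
    of images containing word k. Words that appear in no image get weight 0."""
    n_images = len(histograms)
    weights = []
    for k in range(len(histograms[0])):
        doc_freq = sum(1 for hist in histograms if hist[k] > 0)
        weights.append(math.log(n_images / doc_freq) if doc_freq else 0.0)
    return weights


def weighted_histogram(hist, weights):
    """Normalize counts to term frequencies and scale each bin by its IDF weight."""
    total = sum(hist) or 1
    return [(count / total) * w for count, w in zip(hist, weights)]


# Hypothetical toy data: a 3-word codebook and two "images" of 2-D descriptors.
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
images = [
    [(0.1, 0.0), (0.9, 1.1), (1.2, 0.8)],
    [(0.0, 0.2), (4.8, 5.1)],
]

histograms = [bow_histogram(descriptors, codebook) for descriptors in images]
weights = idf_weights(histograms)
features = [weighted_histogram(hist, weights) for hist in histograms]
```

In this toy example the first visual word occurs in both images, so its IDF weight is zero and it contributes nothing to either feature vector, while the words that occur in only one image are emphasized. This is the imbalance between frequent and rare visual words that the abstract refers to.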
Keywords/Search Tags: Scene image categorization, feature extraction, WAFs, IDF, BoW, Reference-based