Font Size: a A A

Research On High-order Feature Coding Methods In Image Classification

Posted on:2017-01-27Degree:MasterType:Thesis
Country:ChinaCandidate:X X LuFull Text:PDF
GTID:2348330488959728Subject:Information and Communication Engineering
Abstract/Summary:PDF Full Text Request
Bag of visual words (BoW) is one of the most classical models in computer vision, which has widespread applications in image classification, retrieval and action recognition, among others. Feature coding is an important ingredient of BoW model, greatly affecting the final classification performance. The main challenge of image coding lies in how to design efficient coding approaches by using distribution of local features. Most existing coding methods are low-order based methods, while this thesis focuses on how to mine higher order information in image representation and and proposes two novel coding methods.The locality-constrained linear coding (LLC) uses a dictionary composed of visual words, which discards the geometric structure surrounding them and only provides piece-wise approximation of the feature space. To address this problem, this thesis proposes a new feature coding method called locality-constrained affine subspace coding (LASC). The LASC method uses an ensemble of lower dimensional affine subspace as a dictionary; each of the affine subspaces has its origin and basis to model local geometry in feature space. This new dictionary provides a piecewise linear approximation of the data space, having more powerful representation capability than traditional dictionaries. LASC adopts local coding strategy where each feature is linearly decomposed in each top-k nearest subspaces and then weighted to form first-order code. Meanwhile, this paper derives second-order LASC based on Fisher information metric to further boost the performance.This thesis presents Fisher vectors with regard to a single full covariance matrix dictionary (FV-COV) for large-scale classification problem based on high-dimensional deep convolutional neural network (CNN) features. High dimensional covariance dictionary directly models correlations between different dimensions. Compared to diagonal Gaussian Mixture Model used by traditional Fisher vector method, the proposed dictionary makes a full use of second-order statistical information in a more appropriate way and can be estimated easily. It avoids the numerical stability and computational inefficiency in high-dimensional dictionary learning. The performance of FV-COV is better than Fisher vector, facilitating usage of high-dimensional CNN features in large scale image classification task.The thesis makes extensive experiments by using both the traditional, hand-crafted features and CNN features. The proposed LASC is readily applicable to image classification and image retrieval tasks, achieving very competitive performance in a wide variety of standard benchmarks. Our FV-COV achieves state-of-the-art results on many large scale datasets, enjoying both efficiency and better performance compared to its competitors.
Keywords/Search Tags:Image Classification, Feature Coding, High-order Coding, Convolutional Neural Networks
PDF Full Text Request
Related items