Sparse Feature Learning For Image Retrieval And Classification

Posted on: 2014-01-12
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W M Xu
Full Text: PDF
GTID: 1268330425473841
Subject: Control theory and control engineering
Abstract/Summary:
High dimensionality and sparsity are important research topics in the field of machine learning. With the rapid development of information science and the dramatic increase in the amount of image data, sparse image representation has become a focus of research on image representation. Sparsity can effectively address the problems that high-dimensional data such as images pose for storage capacity, processing speed, and interpretability in practical applications. To meet the needs of large-scale content-based image retrieval and image classification, this thesis studies unsupervised feature learning methods based on image local features, in order to obtain a holistic, overcomplete sparse image representation that has the advantages of both global and local features. The main work in this thesis can be summarized as follows:

The technologies of image local feature extraction are reviewed. Several representative local feature extraction algorithms (including feature detectors and feature descriptors) are introduced, and the application scenarios for local features are pointed out so as to establish the research motivation of this thesis.

To overcome the limitations of representing an image as a set of local features in large-scale image retrieval and classification, the image sparse representation method based on the Bag of Visual Words (BoVW) Model is studied. It takes feature quantization as its core idea, and its key technologies include visual dictionary construction based on clustering analysis and feature encoding based on Vector Quantization.

To reduce the large feature quantization error in the BoVW Model, the image sparse representation method based on the Sparse Coding (SC) Model is studied. It regards sparse reconstruction as the basic criterion, and its key technologies include overcomplete visual dictionary learning, sparse decomposition, and code pooling.

To solve the problems of non-smooth regularization functions and high computational complexity in the SC Model, the image sparse representation method based on the Locality Coding (LC) Model is studied. With manifold learning as its theoretical principle, it substitutes a locality constraint for the sparsity constraint of generic sparse coding, achieving efficient feature encoding by means of locality reconstruction and obtaining the image sparse representation.

A unified framework for image sparse feature learning based on the above three image sparse representation models is established and generalized. The specific forms and implementation methods of the various models are discussed under this unified framework, and a sparse learning method based on heterogeneous image local features is then proposed to obtain a multiview holistic sparse representation.

Additionally, the sparse feature learning methods discussed in this thesis are verified and evaluated on several benchmark image databases, such as ZuBud, UKBench, Caltech-101, and Scene15, in the application tasks of Content-Based Image Retrieval and Image Classification.

The main contributions of this thesis are as follows.

Firstly, a visual dictionary construction method via Spectral Clustering is proposed. The Spectral Clustering algorithm can converge to the global optimum in sample spaces with any data distribution, and thus avoids the problems of the K-Means and HKM clustering algorithms, which are sensitive to the initial cluster centroids and converge to a local optimum.

Secondly, a strategy to enhance the performance of image sparse feature learning is proposed, in which a non-negativity constraint is added to the optimization functions. Accordingly, a non-negative sparse coding (NNSC) Model is established to improve the standard Sparse Coding (SC) Model, and a non-negative LLC (NNLLC) method is put forward to improve the original LLC method used in the Locality Coding (LC) Model.

Thirdly, an improved local coding method named Local Differential Coding (LDC) is proposed, which is based on vector difference operations. It uses only the nearest visual words of a local feature to construct new differential base vectors so as to maintain local smoothness, and thereby encodes the relationships between visual words in the final feature codes.

Fourthly, an improved strategy for feature pooling named kMaxSum Pooling is proposed. For every local feature found in an image, this strategy considers its k largest responses relative to any visual word in the visual dictionary and calculates the sum of these responses. With an appropriate value of k, kMaxSum Pooling performs better than both Max Pooling and Sum Pooling in the image retrieval experiments.

Finally, a unified framework for image sparse feature learning is established from the relations among the above three sparse representation models. It is generalized by introducing the concept of heterogeneous image local features and presenting a sparse learning method based on them, so as to obtain a multiview image sparse representation that improved the performance of image retrieval and image classification in our experiments.
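The kMaxSum Pooling idea described above can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the code below rests on an assumed reading: for each visual word, the k largest responses over all local features of an image are summed. Under that assumption, k = 1 reduces to Max Pooling and k equal to the number of local features reduces to Sum Pooling. The function name `kmaxsum_pool` and the toy code matrix are hypothetical and are not taken from the thesis.

```python
import numpy as np


def kmaxsum_pool(codes, k):
    """Pool an (n_features, n_words) code matrix into one n_words vector.

    Assumed reading (not confirmed by the abstract): for each visual word
    (column), sum the k largest responses over all local features (rows).
    """
    codes = np.asarray(codes, dtype=float)
    n_features = codes.shape[0]
    if not 1 <= k <= n_features:
        raise ValueError("k must be between 1 and the number of local features")
    # Sort each column in ascending order and keep the last k rows,
    # which hold the k largest responses for every visual word.
    top_k = np.sort(codes, axis=0)[n_features - k:, :]
    return top_k.sum(axis=0)


# Toy codes: 4 local features encoded over a 3-word dictionary (made-up values).
codes = np.array([
    [0.9, 0.0, 0.1],
    [0.2, 0.5, 0.0],
    [0.4, 0.3, 0.0],
    [0.0, 0.1, 0.6],
])

max_pooled = kmaxsum_pool(codes, k=1)  # coincides with codes.max(axis=0)
sum_pooled = kmaxsum_pool(codes, k=4)  # coincides with codes.sum(axis=0)
mid_pooled = kmaxsum_pool(codes, k=2)  # in between the two extremes
```

In this reading, k acts as a single knob between the two classical pooling operators, which fits the abstract's statement that a suitable k can outperform both; the value of k would have to be chosen empirically for a given dataset.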
Keywords/Search Tags: Sparse Feature Learning, Visual Dictionary, Feature Encoding, Feature Pooling, Non-negative Constraints, Heterogeneous Local Features