The Application And Research Of Image Classification Based On Matrix Factorization

Posted on: 2015-12-24
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X Z Long
Full Text: PDF
GTID: 1228330452466654
Subject: Computer software and theory
Abstract/Summary:
Image classification is one of the most important and challenging tasks in computer vision and image processing, and it has recently attracted considerable research attention. In this thesis, image classification covers natural image classification and face recognition; improving classification accuracy while reducing computational cost is a pressing problem for both. Improvements to natural image classification based on discriminative models focus on four stages: feature extraction, dictionary learning, feature encoding and spatial pooling. Face recognition based on dimensionality reduction seeks a mapping function that transforms high-dimensional face data into a low-dimensional space while exploiting the intrinsic geometrical structure of the data.

This thesis concentrates on image classification based on matrix factorization. We propose algorithms for natural image classification and face recognition from the perspectives of feature encoding and dimensionality reduction, respectively. We also give a fast algorithm, based on the alternating direction method, for the recovery of jointly sparse vectors. The main innovations of this thesis are as follows:

1. In natural image classification based on discriminative models, obtaining the best classification rate requires an optimal dictionary learning method and an optimal feature encoding strategy to be used together. However, researchers have recently found that feature encoding is more important than dictionary learning: when a sparse coding scheme is used to encode features, satisfactory classification results are obtained even with a random dictionary. Building on this finding, this thesis proposes an image classification framework based on nearest-neighbor basis vectors. In the dictionary learning phase, the dictionary is generated in one of two ways: by k-means clustering or by randomly sampling the SIFT matrix of the images. We then encode the features with our soft inner product coding method, after which each image descriptor is linearly represented by several of its nearest-neighbor basis vectors. Combining this encoding with the spatial pyramid matching model and max pooling gives the final representation of each natural image. Experimental results on the 15Scenes and UIUC Sports Event datasets show that our scheme outperforms some classical algorithms in both classification rate and computation speed.

2. Face recognition methods based on traditional non-negative matrix factorization models do not consider the geometrical structure and the label information of the data simultaneously. This thesis uses manifold learning to construct a graph Laplacian matrix that describes the relationships between training samples, and generates a class indicator matrix from the label information of the samples. We then add the graph Laplacian matrix and the class indicator matrix to the objective function as two regularization terms, which yields a graph regularized discriminative non-negative matrix factorization algorithm. We provide the corresponding multiplicative update solutions for this optimization framework, together with a convergence proof. The projection matrix learned by our algorithm is used to reduce the dimensionality of face images. A series of experiments on four benchmark face datasets demonstrates the efficacy of the proposed method.

3. Combining the spatial pyramid matching model used in natural image classification with the non-negativity of SIFT, this thesis presents a sparse non-negative matrix factorization algorithm based on spatial pyramid matching. We give the corresponding multiplicative update rules and a convergence proof for sparse non-negative matrix factorization; the update rules are used both to learn a dictionary and to encode features. From the coefficient matrix obtained by feature encoding, we apply a three-level spatial pyramid matching model and max spatial pooling to obtain the final image representation. Experimental results on several benchmark face datasets show that the proposed scheme outperforms PCA, LDA, LPP and NMF in classification efficiency.

4. Standard compressive sensing (CS) aims to recover a sparse signal from a single measurement vector, which is known as the SMV model. By contrast, recovering sparse signals from multiple measurement vectors is called the MMV model. In this thesis, we consider the recovery of jointly sparse signals in the MMV model, where the multiple signal measurements are represented as a matrix and the sparsity of the signal occurs row-wise (only some rows of the signal matrix are nonzero). The sparse MMV model can be formulated as a matrix (2,1)-norm minimization problem, which is much more difficult to solve than the L1-norm minimization in standard CS. We propose a very fast algorithm, called MMV-ADM, that solves the jointly sparse signal recovery problem in MMV settings using the alternating direction method (ADM). MMV-ADM alternately updates the recovered signal matrix, the Lagrangian multiplier and the residue. All update rules involve only matrix or vector multiplications and summations, so the algorithm is simple, easy to implement and much faster than the state-of-the-art method MMVprox. Numerical simulations show that MMV-ADM is at least dozens of times faster than MMVprox with comparable recovery accuracy.
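
Innovations 1 and 3 both end with a spatial pyramid matching model followed by max pooling. The sketch below shows one common way to carry out that step for a three-level pyramid (1x1, 2x2 and 4x4 grids, 21 cells in total). It is not the thesis's implementation: the function name `spm_max_pool`, the normalized descriptor positions `xy`, and the choice to fill empty cells with zeros are all assumptions made for illustration.

```python
import numpy as np

def spm_max_pool(codes, xy, levels=3):
    """Max-pool feature codes over a spatial pyramid (illustrative sketch).

    codes : (n_desc, K) array of non-negative codes, one row per descriptor.
    xy    : (n_desc, 2) array of descriptor positions, normalized to [0, 1].
    Returns a vector of length K * (1 + 4 + ... + 4**(levels - 1)).
    """
    n_desc, K = codes.shape
    pooled = []
    for level in range(levels):
        cells = 2 ** level  # grid is cells x cells at this level
        # Grid cell index of each descriptor along x and y.
        ix = np.minimum((xy[:, 0] * cells).astype(int), cells - 1)
        iy = np.minimum((xy[:, 1] * cells).astype(int), cells - 1)
        for cy in range(cells):
            for cx in range(cells):
                mask = (ix == cx) & (iy == cy)
                # Assumption: an empty cell contributes a zero vector.
                cell = codes[mask].max(axis=0) if mask.any() else np.zeros(K)
                pooled.append(cell)
    return np.concatenate(pooled)

# Hypothetical usage: 50 descriptors encoded over a dictionary of size 8.
rng = np.random.default_rng(0)
codes = rng.random((50, 8))
xy = rng.random((50, 2))
image_vector = spm_max_pool(codes, xy)
```

With a dictionary of size K, the resulting image representation has 21K entries, and its first K entries are the max over all descriptors in the image.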
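
Innovations 2 and 3 rely on multiplicative update rules, but this abstract does not state the rules for the graph regularized discriminative or sparse variants. As a point of reference only, the sketch below implements the standard unregularized multiplicative updates for the Frobenius-norm NMF objective (the classical Lee and Seung rules), not the thesis's algorithms. The function name, the parameter defaults and the small constant `eps` are assumptions.

```python
import numpy as np

def nmf_multiplicative(X, rank, n_iter=200, eps=1e-10, seed=0):
    """Plain NMF, X ~ W @ H with W, H >= 0, via multiplicative updates.

    This is the standard unregularized baseline, not the graph regularized
    or sparse variants proposed in the thesis.
    """
    rng = np.random.default_rng(seed)
    n_features, n_samples = X.shape
    # Strictly positive random initialization keeps the factors non-negative.
    W = rng.random((n_features, rank)) + eps
    H = rng.random((rank, n_samples)) + eps
    for _ in range(n_iter):
        # Each factor is rescaled element-wise by a non-negative ratio,
        # so no projection step is needed to maintain non-negativity.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Hypothetical usage on a small non-negative matrix of rank 3.
rng = np.random.default_rng(1)
X = rng.random((20, 3)) @ rng.random((3, 30))
W, H = nmf_multiplicative(X, rank=3)
```

Regularized variants such as the ones described above typically add extra terms to the numerator and denominator of these ratios; the exact form depends on the objective and is not reproduced here.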
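
Innovation 4 formulates jointly sparse recovery as a matrix (2,1)-norm minimization. The sketch below illustrates only that norm (the sum of the Euclidean norms of the rows) and its row-wise shrinkage operator, which is a common building block in alternating direction methods for such problems. The MMV-ADM update rules themselves are not given in this abstract, so whether the algorithm uses exactly this operator is an assumption; the function names are hypothetical.

```python
import numpy as np

def l21_norm(X):
    """Matrix (2,1)-norm: sum of the Euclidean norms of the rows of X."""
    return float(np.sum(np.linalg.norm(X, axis=1)))

def row_shrink(X, tau):
    """Row-wise soft-thresholding, the proximal operator of tau * ||.||_{2,1}.

    Rows whose norm is at most tau are set to zero; the remaining rows are
    scaled toward zero. This is what encourages row sparsity.
    """
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    # Guard against division by zero for all-zero rows.
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
    return scale * X

# Hypothetical usage: one strong row, one zero row, one weak row.
X = np.array([[3.0, 4.0],
              [0.0, 0.0],
              [0.1, 0.0]])
print(l21_norm(X))
print(row_shrink(X, tau=1.0))
```

In the usage example the weak third row falls below the threshold and is zeroed out, while the first row keeps its direction and loses `tau` from its norm.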
Keywords/Search Tags: Spatial Pyramid Matching, Feature Encoding, Non-Negative Matrix Factorization, Manifold Learning, Dimensionality Reduction, Alternating Direction Method