
Sparse Representation for Dimensionality Reduction

Posted on: 2012-10-17
Degree: Master
Type: Thesis
Country: China
Candidate: S L Huang
Full Text: PDF
GTID: 2218330344951314
Subject: Computer software and theory
Abstract/Summary:
Many real-world applications generate high-dimensional data, such as bioinformatics data, image data, and remote-sensing data in aerospace. Processing such data raises two kinds of issues: (1) the growth in dimensionality poses great challenges to tasks such as pattern recognition and rule mining; for example, processing high-dimensional data is time-consuming, which makes many algorithms impractical for real-world problems; (2) high-dimensional data contain a wealth of information and diverse structure, and the growth of dimensions makes it possible to exploit new approaches to solve problems. Dimensionality reduction has therefore long been an active research topic in the machine learning and data mining communities. In recent years, sparse representation has been shown to offer new research directions for signal processing and machine learning; its theory has matured over time, and researchers are paying increasing attention to its applications. In this thesis, we study the application of sparse representation and propose a new unsupervised dimensionality reduction algorithm. The proposed algorithm aims to address the low discriminative power exhibited by many dimensionality reduction algorithms when the training set is under-sampled or of limited size. Based on this algorithm, we further propose a new framework for handling conventional classification problems.

We first provide a brief literature survey of dimensionality reduction and discuss problems in current research, such as under-sampled training sets, low discriminative power after dimensionality reduction, and the challenges of classifying or clustering high-dimensional data. We also compare common linear dimensionality reduction algorithms and identify their limitations and practical applications. We then explore and analyze sparse representation approaches, review current research on sparse representation, and summarize the main ideas of common sparse representation methods, such as Basis Matching Pursuit (BMP), Order Recursive Matching Pursuit (ORMP), Modified Matching Pursuit (MMP), Orthogonal Matching Pursuit (OMP), evolutionary multi-resolution matching pursuit, and the reverse-deletion algorithm, and we analyze their respective advantages and disadvantages.

The main contribution of this thesis is a new unsupervised nonlinear dimensionality reduction algorithm, Sparse Reconstruction Embedding (SRE), developed by analyzing the strengths and weaknesses of common dimensionality reduction algorithms. The algorithm maps high-dimensional data to a low-dimensional space by applying the theory of sparse representation, without supervised learning. The resulting low-dimensional data exhibit good discriminative power and can improve the efficiency of tasks such as classification and clustering. SRE proceeds in two stages: sparse reconstruction and low-dimensional embedding; we give the full modeling and solution process for both stages. We also explain why SRE seeks to preserve the sparse linear reconstruction property and why the ℓ1 norm is used to ensure that the solution is sparse. Experimental results on synthetic and real-world data show that our approach is considerably more discriminative and less sensitive to under-sampled and intersecting data.
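The abstract does not include code, but a minimal sketch may help illustrate the two-stage idea behind SRE. The Python example below is an illustration under stated assumptions, not the author's implementation: it uses scikit-learn's Lasso solver for the ℓ1-penalized sparse reconstruction step and an LLE-style eigendecomposition for the embedding step; the penalty weight alpha and the function name sparse_reconstruction_embedding are hypothetical.

import numpy as np
from sklearn.linear_model import Lasso

def sparse_reconstruction_embedding(X, n_components=2, alpha=0.05):
    # X: (n_samples, n_features) array of high-dimensional data.
    # Returns an (n_samples, n_components) low-dimensional embedding.
    n = X.shape[0]
    W = np.zeros((n, n))

    # Stage 1: sparse reconstruction.
    # Reconstruct each sample from all the other samples under an l1 penalty,
    # so that only a few reconstruction weights are non-zero.
    for i in range(n):
        others = np.delete(np.arange(n), i)
        lasso = Lasso(alpha=alpha, fit_intercept=False, max_iter=5000)
        lasso.fit(X[others].T, X[i])   # columns = the remaining samples
        W[i, others] = lasso.coef_

    # Stage 2: low-dimensional embedding.
    # Find coordinates Y that preserve the sparse reconstruction weights,
    # i.e. minimise sum_i ||y_i - sum_j W_ij y_j||^2; as in LLE, this
    # reduces to the bottom eigenvectors of M = (I - W)^T (I - W).
    I = np.eye(n)
    M = (I - W).T @ (I - W)
    eigvals, eigvecs = np.linalg.eigh(M)   # eigenvalues in ascending order
    # Drop the constant eigenvector belonging to the smallest eigenvalue.
    return eigvecs[:, 1:n_components + 1]

# Example usage on a toy dataset (hypothetical):
# X = np.random.RandomState(0).randn(100, 50)
# Y = sparse_reconstruction_embedding(X, n_components=2)

The design mirrors locally linear embedding, except that the reconstruction weights come from an ℓ1-regularized problem rather than a fixed neighborhood, which is what makes the weights sparse without choosing neighbors by hand.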
We also apply the proposed algorithm to real-world problems such as image recognition, text classification, and visualization. Finally, we briefly explore how supervised information could be incorporated into the algorithm.
Keywords/Search Tags: Dimensionality Reduction, Sparse Representation, Linear, Nonlinear, Image Recognition