
Research On Manifold Learning Algorithms And A Few Applications

Posted on: 2010-11-24
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Q G Wang
Full Text: PDF
GTID: 1118360275974187
Subject: Instrument Science and Technology
Abstract/Summary:
With the rapid advancement and widespread application of information technology, high-dimensional data with complex structure is being produced at an ever-increasing rate. High dimensionality not only makes data hard to understand but also renders traditional machine learning and data mining techniques less effective. How to reduce high-dimensional data to a low-dimensional space and discover its intrinsic structure has therefore become a pivotal problem in high-dimensional information processing. The main purpose of manifold learning algorithms is to detect the intrinsic structure embedded in high-dimensional data, and they have become a focus of research in machine learning and pattern recognition. This dissertation studies several key issues in manifold learning; the main contributions are summarized as follows.

Based on an analysis of PCA and MVU, we propose a new nonlinear dimensionality reduction method called distinguishing variance embedding (DVE). By constructing a neighborhood graph and a non-neighborhood graph, DVE treats the sample variance distinguishingly: it maximizes the global variance while simultaneously preserving the local variance. DVE can be viewed both as a nonlinear counterpart of PCA and as a variant of MVU that relaxes MVU's strict distance-preserving constraints. As a global algorithm for nonlinear dimensionality reduction, DVE can detect the global geometric structure of a data set in the high-dimensional space. Compared with MVU and ISOMAP, the computational and storage demands of DVE are drastically reduced. DVE can also handle conformal data sets effectively, where ISOMAP and MVU fail because of their isometric assumptions.

Although the computational complexity of DVE is greatly reduced compared with ISOMAP and MVU, the eigendecomposition of a dense matrix in DVE prevents it from meeting real-time data processing requirements. To address this problem, a landmark version of DVE is proposed.
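The two-graph idea behind DVE can be sketched in a few lines. This is only an illustrative formulation inferred from the abstract, not the dissertation's exact algorithm: the function name `dve_sketch`, the complement-graph construction, and the use of a regularized generalized eigenproblem are all assumptions.

```python
import numpy as np
from scipy.linalg import eigh

def dve_sketch(X, n_components=2, k=10, reg=1e-6):
    """Toy sketch of a DVE-style embedding (hypothetical formulation):
    maximize non-neighbor (global) variance while keeping
    neighbor (local) variance small."""
    n = len(X)
    # Pairwise squared Euclidean distances
    D = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    # k-nearest-neighbor adjacency, symmetrized (row 0 of argsort is the
    # point itself, so columns 1..k are its k nearest neighbors)
    idx = np.argsort(D, axis=1)[:, 1:k + 1]
    W = np.zeros((n, n))
    W[np.arange(n)[:, None], idx] = 1.0
    W = np.maximum(W, W.T)                 # neighborhood graph
    W_bar = 1.0 - W - np.eye(n)            # non-neighborhood graph
    # Graph Laplacians: y^T L y = (1/2) * sum_ij W_ij (y_i - y_j)^2
    L = np.diag(W.sum(1)) - W              # local-variance term
    L_bar = np.diag(W_bar.sum(1)) - W_bar  # global-variance term
    # Maximize non-neighbor scatter relative to neighbor scatter via the
    # generalized eigenproblem  L_bar v = lambda (L + reg*I) v
    vals, vecs = eigh(L_bar, L + reg * np.eye(n))
    return vecs[:, -n_components:]         # top eigenvectors = embedding
```

Note that, as in the abstract, the eigendecomposition here is over a dense n-by-n matrix, which is exactly the scalability bottleneck that motivates the landmark variant.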
Subject to the constraint that the total sum of distances between neighboring points remains unchanged, landmark DVE unfolds the data manifold in the low-dimensional space by pulling randomly selected landmark points as far apart as possible. The main optimization of landmark DVE involves an eigendecomposition of a sparse matrix, so its computational and storage demands are reduced still further compared with DVE.

Like other manifold learning algorithms, DVE has no straightforward extension to out-of-sample examples because it does not produce an explicit mapping function. To solve this problem, a linear approximation of DVE, called distinguishing variance projection (DVP), is introduced. Similar to DVE, DVP detects the global structure of a high-dimensional data set while, in a certain sense, preserving its local neighborhood information. DVP can be viewed as an effective complement to classical PCA and LPP.

As an unsupervised dimensionality reduction algorithm, DVP cannot ensure that data from different categories are well separated in the low-dimensional subspace. Since DVP handles data points in a pairwise manner, the algorithm can be made supervised by taking label information into account, yielding supervised distinguishing variance projection (SDVP). By constructing an intra-class neighborhood graph and an inter-class graph, SDVP seeks the low-dimensional subspace in which the intra-class local scatter of the data is minimized while the inter-class scatter is maximized. SDVP can be viewed as a local variant of LDA, and MFA can in turn be viewed as a local variant of SDVP. SDVP is well suited to classification tasks on multi-modal and manifold-structured data sets. Experiments on UCI machine learning databases and standard face databases demonstrate the effectiveness of the algorithm.
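The supervised variant described above can likewise be sketched as a linear projection learned from two label-aware graphs. Again this is a minimal illustration based only on the abstract: the function name `sdvp_sketch`, the choice of k, the all-pairs inter-class graph, and the regularizer are assumptions, not the dissertation's exact construction.

```python
import numpy as np
from scipy.linalg import eigh

def sdvp_sketch(X, y, n_components=2, k=5, reg=1e-6):
    """Illustrative sketch of an SDVP-style projection (hypothetical):
    minimize intra-class local scatter while maximizing
    inter-class scatter, yielding a linear map usable on new samples."""
    n, d = X.shape
    D = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    same = (y[:, None] == y[None, :])
    # Intra-class k-NN graph: nearest neighbors within the same class
    W_w = np.zeros((n, n))
    for i in range(n):
        cand = np.where(same[i] & (np.arange(n) != i))[0]
        nn = cand[np.argsort(D[i, cand])[:k]]
        W_w[i, nn] = 1.0
    W_w = np.maximum(W_w, W_w.T)
    # Inter-class graph: connect every pair with different labels
    W_b = (~same).astype(float)
    L_w = np.diag(W_w.sum(1)) - W_w   # intra-class scatter Laplacian
    L_b = np.diag(W_b.sum(1)) - W_b   # inter-class scatter Laplacian
    # Generalized eigenproblem in input space:
    # (X^T L_b X) a = lambda (X^T L_w X + reg*I) a
    S_b = X.T @ L_b @ X
    S_w = X.T @ L_w @ X + reg * np.eye(d)
    _, vecs = eigh(S_b, S_w)
    return vecs[:, -n_components:]    # columns of the projection matrix
```

Because the result is a projection matrix A rather than per-sample coordinates, an unseen point x maps to x @ A directly, which is exactly the out-of-sample property that motivates the linear (DVP/SDVP) variants.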
Keywords/Search Tags:manifold learning, dimensionality reduction, variance analysis, face recognition, data visualization