The main goal of machine learning and data analysis is to discover the intrinsic laws of high-dimensional data sets. Traditional analysis methods assume that the structure of a data set is linear, e.g. that the hidden variables of the data set are independent and uncorrelated. However, traditional methods can hardly find the true structure of a data set because of its large volume, high dimensionality, rapid growth rate, and nonlinear characteristics. Thus, manifold learning has come into researchers' view.

Manifold learning research involves topology, graph theory, machine learning, pattern recognition, signal processing, computer vision, etc. Manifold learning methods can effectively find the intrinsic geometric structure of high-dimensional data, and uncover its feature information and inherent laws. As a novel tool for machine learning and high-dimensional data analysis, manifold learning has become a hot research topic and is gradually being applied in biometrics, high-dimensional data analysis, and other fields.

This thesis conducts a deep theoretical investigation of manifold learning methods, including Locally Linear Embedding, Isomap, and Laplacian Eigenmaps. To overcome several limitations of manifold learning, we present improved versions of these algorithms. In addition, applications to face recognition and high-dimensional data are proposed based on the presented algorithms. The main contributions are as follows:

1. Presenting Growing Locally Linear Embedding. Estimating the intrinsic dimension of the data is a key issue in manifold learning. Owing to the Denseness Hypothesis, manifold learning algorithms must bear a high computational complexity, which seriously reduces their applicability. This thesis introduces the Growing Neural Gas model into Locally Linear Embedding (LLE), constructs a sparse graph covering the manifold with the competitive Hebbian learning rule, and presents Growing Locally Linear Embedding (GLLE).
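For reference, the two steps of the original LLE on which GLLE builds (local reconstruction weights, then a global eigendecomposition) can be sketched in plain NumPy. This is a minimal illustration of standard LLE only, not the thesis's GLLE; the Growing Neural Gas graph construction and the adaptive dimension estimation are omitted:

```python
import numpy as np

def lle(X, n_neighbors=8, n_components=2, reg=1e-3):
    """Minimal Locally Linear Embedding sketch.

    X: (n_samples, n_features) data matrix.
    Returns an (n_samples, n_components) embedding.
    """
    n = X.shape[0]
    # Pairwise squared distances -> k nearest neighbors (excluding self).
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)
    knn = np.argsort(d2, axis=1)[:, :n_neighbors]

    # Step 1: weights that best reconstruct each point from its neighbors.
    W = np.zeros((n, n))
    for i in range(n):
        Z = X[knn[i]] - X[i]                 # neighborhood centered at x_i
        G = Z @ Z.T                          # local Gram matrix
        G += reg * np.trace(G) * np.eye(n_neighbors)  # regularization
        w = np.linalg.solve(G, np.ones(n_neighbors))
        W[i, knn[i]] = w / w.sum()           # enforce sum-to-one constraint

    # Step 2: embedding from the bottom eigenvectors of M = (I-W)^T (I-W),
    # discarding the trivial constant eigenvector.
    I = np.eye(n)
    M = (I - W).T @ (I - W)
    vals, vecs = np.linalg.eigh(M)
    return vecs[:, 1:n_components + 1]
```

The sum-to-one weight constraint is what makes the embedding invariant to translations of each local neighborhood; GLLE replaces the fixed-size neighborhood search here with a graph grown by competitive Hebbian learning.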
The novel algorithm overcomes three shortcomings of the original LLE: GLLE can estimate the intrinsic dimension adaptively, select the neighborhood size dynamically, and reduce the computational complexity markedly. Simulation results have shown the effectiveness of GLLE in manifold unfolding, visualization of high-dimensional data, and biometrics.

2. Solving noise perturbation in manifold learning. Manifold learning algorithms find the intrinsic information of high-dimensional data by preserving local geometric structure. This characteristic makes manifold learning methods sensitive to noise, because noise perturbation may change the local structure of the data. This thesis presents a novel manifold learning algorithm for noisy manifolds, namely Neighborhood Smoothing Embedding (NSE), based on the Local Linear Surface Estimator (LLSE). The algorithm offers an effective solution for noisy manifolds and a new perspective on robust manifold learning.

3. Generalizing a common framework for manifold learning. Based on the out-of-sample framework and the graph embedding framework, the thesis presents a more general common framework for manifold learning. The new framework encompasses graph construction, tensor learning, incremental learning, and supervised learning. Furthermore, the thesis surveys the characteristics of manifold learning algorithms and indicates future research directions.

4. Supervised manifold learning. Manifold learning algorithms are mostly unsupervised; they can effectively perform dimensionality reduction and data visualization. However, how to extract data features and classify using label information is a key point in manifold learning applications. The thesis embeds supervised manifold learning into the common framework: the supervised variant constructs the graph according to the label information, partitions the training data space, and classifies the test data.
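As an illustration of label-aware graph construction, a kNN graph that admits only same-class edges can be built as follows. This is a hedged sketch of the general idea only; the thesis's actual supervised framework may differ, and the function assumes each class has more than `n_neighbors` samples:

```python
import numpy as np

def supervised_knn_graph(X, y, n_neighbors=5):
    """Adjacency matrix of a label-aware kNN graph.

    Each point is linked only to its nearest neighbors within the same
    class, so the graph partitions the training space by label.
    Assumes every class contains more than n_neighbors samples.
    X: (n, d) data matrix; y: (n,) integer label array.
    """
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)
    d2[y[:, None] != y[None, :]] = np.inf    # forbid cross-class edges
    A = np.zeros((n, n), dtype=bool)
    for i in range(n):
        A[i, np.argsort(d2[i])[:n_neighbors]] = True
    return A | A.T                            # symmetrize the graph
```

Feeding such a graph into a spectral embedding pulls same-class samples together while leaving different classes unconnected, which is what enables classification of test data in the embedded space.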
The experiments on facial expression recognition have shown the effectiveness of supervised manifold learning.

5. Incremental manifold learning. Manifold learning requires many data samples, according to the smoothing continuity hypothesis and the Denseness Hypothesis. Moreover, the Locality Preservation Hypothesis makes manifold learning algorithms incapable of representing new samples: the algorithms have to recompute the constructive (geometric or topological) matrix, which seriously reduces their applicability. In this thesis we introduce differential and sub-manifold analysis into Laplacian Eigenmaps and present an incremental manifold learning method that has an optimal solution.

6. A metric of manifold structure. Neighborhood searching in manifold learning is based on the Euclidean distance, because a Riemannian space is equivalent to a Euclidean space within an infinitesimal range; in practice, however, the sampling density cannot meet this requirement. This thesis presents a new distance metric, the diffusion distance, based on the locality preservation characteristic and the Nitric Oxide (NO) dynamic diffusion model. The diffusion distance increases the robustness of manifold learning and can map circular manifolds.

7. Building an Asian face database. In the field of face recognition, most face data sets are captured from foreign volunteers, which makes recognition methods ill-suited to oriental facial features. Thus it is important and necessary to build a large-scale Asian face data set. We have designed a single-camera face acquisition system; based on this system we have built a small-scale Chinese variant-PIE face data set, which will be expanded in future work.
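For reference, the baseline Laplacian Eigenmaps algorithm that contribution 5 extends can be sketched in NumPy. This is a minimal illustration under common default choices (heat-kernel weights, symmetrized kNN graph); the incremental, differential, and sub-manifold machinery of the thesis is omitted:

```python
import numpy as np

def laplacian_eigenmaps(X, n_neighbors=8, n_components=2, sigma=1.0):
    """Minimal Laplacian Eigenmaps sketch.

    Builds heat-kernel weights on a symmetrized kNN graph and embeds the
    data with the bottom eigenvectors of the normalized graph Laplacian
    (the trivial constant eigenvector is discarded).
    """
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[:n_neighbors]
        W[i, nbrs] = np.exp(-d2[i, nbrs] / (2 * sigma ** 2))  # heat kernel
    W = np.maximum(W, W.T)                       # symmetrize the graph
    deg = W.sum(1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
    L_sym = np.eye(n) - D_inv_sqrt @ W @ D_inv_sqrt  # normalized Laplacian
    vals, vecs = np.linalg.eigh(L_sym)
    # Recover generalized eigenvectors of L v = lambda D v.
    return D_inv_sqrt @ vecs[:, 1:n_components + 1]
```

An incremental variant must update this eigendecomposition when new samples arrive instead of recomputing the Laplacian from scratch, which is precisely the recomputation cost the thesis aims to avoid.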