Many pattern recognition and data mining problems, such as face recognition, digital image recognition, and data visualization, involve data in very high-dimensional spaces. High feature dimensionality not only increases the computational cost of algorithms but also introduces redundancy and obscures the intrinsic structure of the data. Dimensionality reduction is an effective tool for dealing with this problem: it helps reveal the essential structure of the input data and allows the desired learning tasks to be accomplished at low computational cost. As a result, research on dimensionality reduction has long been important in the related scientific fields. This thesis focuses on the theories and methods of dimensionality reduction for high-dimensional data, as well as related applications in face recognition. The main contents and achievements are as follows:

1. The characteristics and advantages of existing dimensionality reduction algorithms are summarized from the global statistics-based and local geometry-based perspectives. The internal relations among the various algorithms are also analyzed.

2. The classical PCA and KPCA algorithms, both formulated in the least-mean-squared-error sense, are unstable when the input data are contaminated by outliers; even a small number of outliers noticeably degrades their performance. To deal with this problem, we propose a new robust nonlinear principal component analysis technique called IRobust KPCA. The algorithm effectively eliminates the effect of outliers and produces an accurate nonlinear subspace. In addition, IRobust KPCA is computed iteratively and shows potential for extension to an incremental learning version.
Comparative experiments against standard KPCA demonstrate the effectiveness and robustness of IRobust KPCA.

3. Focusing on dimensionality reduction for manifold learning and pattern classification on high-dimensional data, we propose a new supervised dimensionality reduction algorithm. The classical LDA method considers only the global statistical information of the samples and tends to fail on nonlinearly distributed data, whereas manifold learning algorithms have shown great power in discovering the intrinsic structure of high-dimensional data. We therefore adopt the locality-preserving idea and develop a new algorithm called Sub-manifold Discriminant Analysis (SMDA). SMDA finds low-dimensional embeddings of the input data by maximizing the sub-manifold margin while maintaining the neighborhood relations of the samples. In addition, an optimized process of intrinsic structure discovery is adopted to avoid the limitations of existing locality-preserving methods. Experimental results on the Yale and UMIST face databases demonstrate the effectiveness of SMDA and its superiority to the popular PCA, LDA, LPP, and MFA algorithms.

4. Within the semi-supervised learning framework based on manifold regularization, we propose a method called MLapRLS. In MLapRLS, a nearest-neighbor graph is first constructed to model the intrinsic geometrical structure of the data space, and the graph structure is then incorporated into the objective function of multivariate linear regression as a regularization term. Aiming to extract effective features for the semi-supervised multi-class problem, MLapRLS makes use of both the limited labeled samples and the large number of unlabeled samples. Experimental results on the Extended YaleB and PIE face databases demonstrate the effectiveness of MLapRLS.
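
To make the idea in item 4 concrete, the following is a minimal sketch of a graph-regularized multivariate linear regression. It is not the thesis's MLapRLS formulation, which the abstract does not spell out. It assumes the common linear Laplacian regularized least squares objective, minimizing ||X_l A - Y_l||_F^2 + gamma_a ||A||_F^2 + gamma_i tr(A^T X^T L X A), with a binary, symmetrized k-nearest-neighbor graph, one-hot class targets, and no bias term. The function names and the parameters `k`, `gamma_a`, and `gamma_i` are hypothetical and chosen only for illustration.

```python
import numpy as np


def knn_graph_laplacian(X, k=5):
    """Unnormalized Laplacian L = D - W of a symmetrized binary kNN graph."""
    n = X.shape[0]
    # Pairwise squared Euclidean distances between all samples.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d2, np.inf)  # a sample is not its own neighbor
    # Indices of the k nearest neighbors of every sample.
    nbrs = np.argsort(d2, axis=1)[:, :k]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    W = np.maximum(W, W.T)  # connect i and j if either is a neighbor of the other
    return np.diag(W.sum(axis=1)) - W


def fit_laplacian_rls(X, Y_labeled, labeled_idx, k=5, gamma_a=1e-2, gamma_i=1e-1):
    """Solve the assumed objective in closed form.

    X           : (n, d) all samples, labeled and unlabeled
    Y_labeled   : (n_l, c) one-hot targets of the labeled samples
    labeled_idx : (n_l,) row indices of the labeled samples in X
    Returns A   : (d, c) projection matrix
    """
    d = X.shape[1]
    L = knn_graph_laplacian(X, k)  # built from labeled AND unlabeled data
    Xl = X[labeled_idx]
    # Setting the gradient to zero gives the normal equations
    # (Xl^T Xl + gamma_a I + gamma_i X^T L X) A = Xl^T Y_l.
    lhs = Xl.T @ Xl + gamma_a * np.eye(d) + gamma_i * (X.T @ L @ X)
    rhs = Xl.T @ Y_labeled
    return np.linalg.solve(lhs, rhs)


# Toy data (synthetic, for illustration only): two Gaussian clusters,
# three labeled samples per class, the rest unlabeled.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-3.0, 1.0, size=(30, 4)),
               rng.normal(3.0, 1.0, size=(30, 4))])
labeled_idx = np.array([0, 1, 2, 30, 31, 32])
Y_labeled = np.eye(2)[[0, 0, 0, 1, 1, 1]]

A = fit_laplacian_rls(X, Y_labeled, labeled_idx, k=5)
Z = X @ A  # extracted low-dimensional features for every sample
```

In this sketch the unlabeled samples influence the solution only through the Laplacian term, which penalizes projections that separate neighboring samples; setting `gamma_i = 0` reduces it to ordinary ridge regression on the labeled samples alone.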