With the progress of science, information technology has developed rapidly and is widely used in fields such as medical image processing, computational biology, and global climate modeling. Data dimensionality keeps growing, and high-dimensional data has emerged as a result. However, high-dimensional data is difficult for current machine learning and data mining algorithms to process effectively, which makes dimensionality reduction one of the most important tools for handling it. As a family of nonlinear dimensionality reduction methods, manifold learning has found a wide range of applications. Among dimensionality reduction algorithms, principal component analysis (PCA) is based on the assumption that the data set is globally linear. As data sets grow larger, processing speed has increasingly become a focus of attention, yet we do not want to reduce time complexity at the expense of accuracy, because the reduced or classified data would then fail to faithfully reflect the original information.

The main work of this paper is as follows:

1. We give a general overview of dimensionality reduction algorithms, focusing on two of them, ISOMAP and LLE, and point out ISOMAP's time-consuming step: replacing the Euclidean distance with the geodesic distance when computing distances between the neighbors of each point.
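The geodesic-distance step identified as costly can be sketched as follows. This is a minimal illustration, not the thesis's implementation; the toy data, the neighborhood size `k`, and the use of Dijkstra's algorithm over a k-nearest-neighbor graph are all assumptions made here for concreteness:

```python
# Sketch of ISOMAP's geodesic-distance computation: build a kNN graph
# weighted by Euclidean distances, then approximate geodesic distances
# by all-pairs shortest paths. This all-pairs step dominates the running
# time for large n, which is the bottleneck the text refers to.
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import shortest_path
from scipy.spatial.distance import cdist

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))          # toy data set (hypothetical)
k = 10                                 # neighborhood size (assumed)

D = cdist(X, X)                        # pairwise Euclidean distances
idx = np.argsort(D, axis=1)[:, 1:k + 1]  # k nearest neighbors, skipping self
rows = np.repeat(np.arange(len(X)), k)
cols = idx.ravel()
W = csr_matrix((D[rows, cols], (rows, cols)), shape=D.shape)

# All-pairs shortest paths over the undirected kNN graph approximate
# the geodesic distances on the underlying manifold.
G = shortest_path(W, method="D", directed=False)
print(G.shape)
```

The shortest-path matrix `G` is what ISOMAP feeds into classical MDS in place of the Euclidean distance matrix, which is why this step cannot simply be skipped to save time.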
We also briefly compare the performance of LLE for different choices of the number of neighbors K, and give a brief introduction to an anisotropic variant of the algorithm.

2. For large-scale data sets, especially when the numbers of rows and columns both exceed three thousand, we identify the most time-consuming step of the PCA algorithm and show the important role that three kinds of random matrices and a greedy algorithm play in reducing its time complexity. For the case where high accuracy is not strictly required, we introduce a method that accelerates the algorithm with an estimated deviation of less than 5%, and compare the computation time and the deviation of the low-dimensional embedding, measured by the eigenvectors, between the two modes and standard PCA.
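The random-matrix acceleration of PCA can be sketched with a Gaussian random test matrix, one common choice (the thesis compares three kinds of random matrices, which are not specified in this section). The synthetic data, sizes, and oversampling parameter below are assumptions for illustration; the deviation is measured on the principal eigenvectors, as in the comparison described above:

```python
# Sketch of randomized PCA: project X onto a small random sketch of its
# range, then do an SVD of the resulting small matrix instead of the
# full data. The deviation from exact PCA is measured on the leading
# eigenvector subspace.
import numpy as np

rng = np.random.default_rng(0)
n, d, r = 500, 100, 5                  # samples, features, target rank (assumed)
# synthetic data with approximately rank-r structure plus small noise
X = rng.normal(size=(n, r)) @ rng.normal(size=(r, d)) \
    + 0.01 * rng.normal(size=(n, d))
X -= X.mean(axis=0)                    # center, as PCA requires

# exact PCA: full SVD of the centered data, the costly step for large n, d
_, _, Vt = np.linalg.svd(X, full_matrices=False)
V_exact = Vt[:r].T

# randomized PCA: Gaussian random matrix sketches the range of X
p = 10                                 # oversampling (assumed)
Omega = rng.normal(size=(d, r + p))    # Gaussian random test matrix
Q, _ = np.linalg.qr(X @ Omega)         # orthonormal basis for the range
B = Q.T @ X                            # small (r+p) x d matrix
_, _, Vt_small = np.linalg.svd(B, full_matrices=False)
V_approx = Vt_small[:r].T

# deviation between exact and approximate principal directions,
# compared as subspaces so the result is invariant to sign flips
dev = np.linalg.norm(V_exact @ V_exact.T - V_approx @ V_approx.T)
print(dev)
```

Only the small matrix `B` is decomposed exactly, which is where the time saving comes from; the price is the deviation `dev`, which stays small when the data has rapidly decaying spectrum.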