The high dimensionality of data is one of the main challenges faced by the current pattern recognition techniques. Dimensionality reduction (DR) has become an important tool to handle high-dimensional data and overcome the"curse of dimensionality". Recent researches showed that most of the DR algorithms can generally reduce to graph construction and its embedding manners. However, many existing graph-based DR methods rely on artificially pre-defined neighborhood graph, e.g., locality preserving projections algorithm and its variants. Despits their success in many practical applications, those algorithms usually suffer from some limitations such as neighborhood parameter selection, sensitivity to noices, insufficient discriminative power and the difficulty of incorporating prior knowledge. This paper is a research on graph construction and optimation, especially for dimensionality reduction task. The main contributions include:(1) The revisit about globality and locality preserving strategies. In particular, we compare several popular locality-oriented DR algorithms with classical globality-oriented DR ones, and obtain a series of new insights (especially some undesirable characteristics involved in locality preserving strategy). Furthermore, we analyze the reasons for the empirical results from the viewpoint of graph construction, and provide specific tricks and suggestions to address such issues. On one hand, these studies clarify the current misunderstanding involved in locality-oriented methods; on the other hand, these show that there exists large space to improve the current DR methods, and become the important motivations of our following study works.(2) The first attempt to construct graph by sparse representation and the design of Sparsity Preserving Projections (SPP) algorithm. By virtue of global strategy for graph construction, SPP alleviates to a certain extent the difficulty of neighborhood parameter involved in locality preserving methods; SPP captures the"neighbors"by l 1-minimization problem potentially and automatically, instead of artificially predefinition which assigns the same neighborhood size for each sample. In addition, SPP benefits from the natural discriminative power of sparse respresentation, and thus achieves better performance than some popular locality preserving DR algorithms for face recognition application.(3) Proposing Sparsity Preserving Discriminant Analysis (SPDA) method and applying it to single labeled training image face recognition problem. Not only does SPDA extend SPP to semi-supervised version, but also unifies sparse graph construction under Bayesian learning framework which facilitates the incorporation of prior knowledge. In addition, we attempt to speed up SPDA by ensemble strategy and design ensemble SPDA algorithm. The experiments on publicly available data sets show that the proposed algorithms achieve better performance than their competitors such as SDA, and tend to work well just resoring to very few extra unlabeled samples.(4) Presenting Soft Locality Preserving Projections (SLPP) method. It is well known that graph plays an important role in typical locality-based DR techniques. However, the graph construction involved in these methods relies on artificial predefinition and is independent of subsequent DR step. To address this issue, we design SLPP algorithm based on LPP, which integrates graph construction with projection learning into one single objective function. With alternating iterative optimization, we obtain a principle way of graph construction. The feasibility and effectiveness of SLPP are verified on several standard data sets with promising results.(5) A unified framework for simultaneous dimesionality reduction and graph learning. Motivated by SLPP, we establish a simultaneous dimensionality reduction and graph updating framework, which is very general and applicable to most of the current graph-based DR techniques. To verify the feasibility and effectiveness of such framework, we further extend the typical LPP and develop Self-dependent LPP (SdLPP) algorithm. The effectiveness of the proposed algorithm is validated by the experiments including data visulization, clustering and face recognition on publicly avaible data sets. |