Research On Unsupervised Dimensionality Reduction Based On Principal Component Analysis And Graph Embedding

Posted on: 2021-12-28
Degree: Master
Type: Thesis
Country: China
Candidate: S Z Luo
Full Text: PDF
GTID: 2480306020467264
Subject: Systems Engineering
Abstract/Summary:
With the rapid development of computer science, the amount of data acquired and stored in many scientific fields, such as image data, biometric data, and web-page data, has grown exponentially. Dimensionality reduction has therefore become one of the most important tasks in data mining and machine learning research. Because labeling data requires considerable manpower and material resources, unsupervised dimensionality reduction algorithms are more widely used in practice. Principal component analysis (PCA) and graph-based dimensionality reduction are the two most commonly used families of unsupervised dimensionality reduction algorithms. However, PCA cannot effectively handle outliers in small-sample problems, which makes the covariance matrix an unstable estimate. In addition, graph-based dimensionality reduction reduces the original data to the structure of a graph and its embedding, so the quality of the graph directly determines the quality of the dimensionality reduction. Traditional graph-based techniques perform dimensionality reduction by learning from a fixed initial graph, and therefore suffer from problems such as neighbourhood-parameter selection, poor robustness, and insufficient discriminability. This thesis focuses on principal component analysis and graph-based dimensionality reduction techniques. The main results are:

(1) The concept of entropy from information theory is introduced into PCA, and Robust Principal Component Analysis based on Discriminant Information (RPCA-DI) is designed. RPCA-DI builds a sample-description model based on entropy regularization, reveals the membership relationship between sample points and the principal and non-principal subspaces, uses this membership to describe the subspace structure of the data, and exploits the resulting discriminative information to effectively suppress the influence of noise and abnormal sample points.

(2) Separability-based Adaptive Weighting Principal Component Analysis (SAWPCA) is proposed. Fuzzy c-means (FCM) is used to separate reliable samples from noise points. By introducing fuzzy factors to analyze the degree to which data belong to different subspaces and to highlight the potential separability of the data across those subspaces, an adaptive weighting model is constructed that provides a fuzzy description of this separability and supports iterative learning of the principal components and their joint optimization.

(3) To address the problem that a fixed graph matrix cannot reflect the true data structure, two adjacency graphs are constructed to represent the original structure in terms of data similarity and diversity, and a rank constraint is imposed on the corresponding Laplacian matrix. The result is a novel adaptive graph-learning technique, the Locally Sensitive Discriminant Unsupervised Dimension Reduction (LSDUDR) algorithm.

(4) Adaptive graph learning and feature learning are integrated into a unified framework, and Kernel Alignment Unsupervised Discriminative Dimensionality Reduction (KaUDDR) is proposed. First, two kernels are defined: a projected-data kernel and a similarity kernel. The essential structural characteristics of the data are captured by measuring the consistency between these two kernels. Second, graph learning and dimensionality reduction are performed simultaneously, which ensures the optimality of the learned graph in the proposed algorithm.
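For reference, the following is a minimal Python sketch of the two baseline families the abstract starts from: classical PCA and a fixed k-nearest-neighbour graph embedding in the style of Laplacian eigenmaps. It is illustrative only and does not implement the proposed RPCA-DI, SAWPCA, LSDUDR, or KaUDDR algorithms; the function names, the heat-kernel weighting, and the parameter choices are all assumptions made for the example.

# Baseline sketch only: textbook PCA and a fixed k-NN Laplacian embedding,
# not the thesis's proposed algorithms. All names here are illustrative.
import numpy as np

def pca(X, d):
    """Project n samples (rows of X) onto the top-d principal components."""
    Xc = X - X.mean(axis=0)                    # center the data
    cov = Xc.T @ Xc / (X.shape[0] - 1)         # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigenvalues in ascending order
    W = eigvecs[:, -d:]                        # top-d eigenvectors
    return Xc @ W                              # low-dimensional embedding

def laplacian_embedding(X, d, k=5):
    """Embed samples using a fixed k-NN similarity graph with heat-kernel weights."""
    n = X.shape[0]
    dist2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
    sigma2 = dist2.mean()                                    # assumed bandwidth choice
    S = np.zeros((n, n))
    for i in range(n):
        idx = np.argsort(dist2[i])[1:k + 1]                  # k nearest neighbours, skip self
        S[i, idx] = np.exp(-dist2[i, idx] / sigma2)
    S = (S + S.T) / 2                                        # symmetrize the fixed graph
    D = np.diag(S.sum(axis=1))
    L = D - S                                                # graph Laplacian
    eigvals, eigvecs = np.linalg.eigh(L)
    return eigvecs[:, 1:d + 1]                               # drop the trivial constant eigenvector

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 20))
    print(pca(X, 2).shape, laplacian_embedding(X, 2).shape)  # (200, 2) (200, 2)

Note that in this baseline the graph is built once and then kept fixed, which is exactly the limitation (neighbourhood-parameter selection, sensitivity to noise) that the adaptive graph-learning methods in contributions (3) and (4) are designed to remove.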
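The rank constraint mentioned in contribution (3) rests on a standard spectral-graph fact: the Laplacian of a graph with exactly c connected components has exactly c zero eigenvalues, i.e. rank(L) = n - c, so enforcing this rank during graph learning forces the learned graph into c clusters. The toy example below only verifies this property on a hand-built graph; it is not the LSDUDR algorithm, and the graph and component count are arbitrary choices for illustration.

# Illustration of rank(L) = n - c for a graph with c connected components.
import numpy as np

# A toy block-diagonal similarity graph with c = 2 connected components
# (two disjoint triangles over n = 6 vertices).
S = np.array([
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
], dtype=float)

D = np.diag(S.sum(axis=1))
L = D - S                                   # graph Laplacian
eigvals = np.linalg.eigvalsh(L)
n, c = S.shape[0], 2
print(np.sum(np.isclose(eigvals, 0.0)))     # -> 2 zero eigenvalues (= c components)
print(np.linalg.matrix_rank(L), n - c)      # -> 4 4, i.e. rank(L) = n - c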
Keywords/Search Tags: Unsupervised dimension reduction, Principal component analysis, Graph embedding