Driven by advances in science and technology, the means of acquiring and storing information have developed greatly, and as a result high-dimensional data inevitably arise in many fields. High-dimensional data contain a large amount of information, but not all of it is valuable, and processing such data directly causes several problems: the computation is complex, more storage space is required, and recognition accuracy suffers. The main task of dimensionality reduction is to map high-dimensional data into a low-dimensional space that preserves the inherent structure of the data, which effectively alleviates the above problems and has attracted a great deal of attention from researchers. In practical applications, obtaining enough labeled samples is expensive and difficult, whereas large amounts of unlabeled data are easy to obtain. If a supervised dimensionality reduction method is applied without sufficient labeled data, the model may overfit; on the other hand, an unsupervised method ignores the value of the labeled data. Therefore, semi-supervised dimensionality reduction methods have been widely studied and applied. Among them, graph-based semi-supervised dimensionality reduction methods are simple and easy to understand, and have received particular attention. Traditional graph-based dimensionality reduction methods require a graph structure to be defined beforehand, and the subsequent dimensionality reduction depends on this pre-defined graph; that is, the dimensionality reduction process is separated from the learning of the graph structure. Consequently, the learned graph may not be optimal, and the final result may be unsatisfactory. Aiming at this problem in traditional graph-based semi-supervised algorithms, this thesis carries out corresponding research and improvement. The main work of this thesis is as follows:

(1) An algorithm based on an adaptive structured optimal graph. The algorithm uses the class information of the labeled data to find the nearest neighbors of each labeled sample directly, mining the local structure of the data to suppress the influence of noise and outliers. A regularization term representing the structural information between samples is then constructed over all the training data, extending the supervised method to the semi-supervised setting: for the whole sample set, an adaptive learning strategy is adopted to adjust the neighbors of each sample. At the same time, we want the learned graph to be sparse and clearly structured, that is, the number of connected components in the graph should equal the number of clusters/classes in the data. Such a structured graph benefits many tasks because it carries more accurate information about the data, so a structural constraint is imposed on the graph. Experimental results on synthetic and real data sets verify the performance of the proposed algorithm.

(2) A new adaptive semi-supervised dimensionality reduction method based on orthogonal least squares discriminant analysis, called adaptive flexible discriminant analysis. The method obtains strong inter-class discriminative power by pulling the data points of each class toward that class's sample mean. In addition, it retains the adaptive-neighbor idea of the previous method to learn the graph structure. In general, however, adaptive-neighbor graph learning uses a linear projection to relate the original training samples to their low-dimensional representations, which is a shortcoming when the data are nonlinear. The linear projection constraint can therefore be relaxed, by adding a regularization term, to estimate nonlinear manifolds that are close to a linear embedding. This method not only finds the nonlinear embedding but also estimates a linear projection that can be applied directly to new samples. The experimental results demonstrate the effectiveness of introducing the flexible manifold embedding idea on the basis of adaptive neighbor learning.
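To make the two recurring ingredients concrete, the sketch below illustrates one common formulation of adaptive neighbor assignment (for each sample, minimize the neighbor-weighted squared distances plus a quadratic regularizer over the probability simplex, which has a closed-form solution with exactly k nonzero weights per row) and the connected-component check behind the structural constraint (the number of zero eigenvalues of the graph Laplacian equals the number of connected components). This is a minimal NumPy sketch under those assumptions; the function names and parameters are illustrative, not the thesis's actual implementation.

```python
import numpy as np

def adaptive_neighbors(X, k=5):
    """Sparse adaptive neighbor weights: for each sample i, solve
    min_{s_i}  sum_j ||x_i - x_j||^2 s_ij + gamma_i * s_ij^2
    s.t.       s_ij >= 0,  sum_j s_ij = 1,
    whose closed-form solution keeps exactly k nonzero weights per row."""
    n = X.shape[0]
    sq = (X ** 2).sum(axis=1)
    D = sq[:, None] + sq[None, :] - 2.0 * X @ X.T   # squared Euclidean distances
    np.fill_diagonal(D, np.inf)                     # exclude self-connections
    S = np.zeros((n, n))
    for i in range(n):
        idx = np.argsort(D[i])[:k + 1]              # k nearest plus the (k+1)-th
        d = D[i, idx]
        denom = k * d[k] - d[:k].sum()              # gamma_i chosen to give k-sparsity
        S[i, idx[:k]] = (d[k] - d[:k]) / max(denom, 1e-12)
    return S

def num_components(S, tol=1e-9):
    """Number of connected components of the graph, computed as the number
    of (numerically) zero eigenvalues of the graph Laplacian L = D - W."""
    W = (S + S.T) / 2.0                             # symmetrize the learned graph
    L = np.diag(W.sum(axis=1)) - W
    return int((np.linalg.eigvalsh(L) < tol).sum())
```

With two well-separated clusters and a small k, every sample's neighbors fall inside its own cluster, so the learned graph already splits into one component per cluster; the structural constraint in contribution (1) enforces exactly this property even when the clusters are not cleanly separated.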