| With the rapid development of information technology, large amounts of high-dimensional, multi-source, and heterogeneous data are being generated in everyday life. How to extract useful information from data with such complex structure is an important problem of great scientific and social value. Representation learning is a major technique for analyzing high-dimensional data; it aims to extract key information to guide data mining, and it has been successfully applied in many fields, such as computer vision, biometrics, and information retrieval. The output of representation learning is an intermediate representation, which acquires practical value only when it is used in a specific downstream task. In this paper, we study the theory and application of representation learning for subspace clustering analysis. In state-of-the-art subspace clustering methods, structure information that expresses the affinity between data points has attracted considerable attention and has been incorporated effectively into representation learning. However, existing representation learning methods for subspace clustering have the following weaknesses: (1) high sensitivity to the selected data; (2) unstable clustering results; and (3) high time cost due to iterative computation. To address these issues, this paper proposes a novel structure-aware representation learning model that learns data representations with good discriminability. We further extend subspace representation learning to dimensionality reduction, obtaining the corresponding low-dimensional embedding through structure preservation, which effectively alleviates the curse of dimensionality in subspace clustering of high-dimensional data. The main contributions of this paper are summarized as follows: (1) To reduce the sensitivity of the algorithm to different data and to address the unstable clustering results caused by a relatively weak grouping
effect, this paper proposes a structure-aware subspace representation learning model. Using a new joint grouping-measure approach, the model obtains a structure graph that comprehensively expresses the affinity relationships among data points. The approach leverages the consistency between different structures to help perceive the intrinsic structure. The learned structure graph is then used for subspace clustering analysis. Experimental results on biological information, handwritten digit, object image, and speech signal data show that the proposed model achieves better clustering performance than most state-of-the-art methods. The improvement is most significant on the biological information data, where the best clustering accuracy and normalized mutual information are improved by 8.24% and 5.18%, respectively. These results verify that the proposed model effectively learns subspace representations with good discriminability and better solves the subspace clustering problem for high-dimensional data with complex structure. (2) To tackle the curse of dimensionality in subspace clustering of high-dimensional data, this paper presents a structure-preserving data embedding method that builds on structure awareness. Unlike conventional manifold learning methods, which use Euclidean distance to measure the similarity of data, the proposed method uses the structure representation to describe the consistency between data points. The method integrates representation learning and dimensionality reduction in a single optimization framework, which efficiently finds the optimal direction for data embedding. Experimental results show that the proposed method improves clustering accuracy by 12.51% on face images with multiple variations and consistently achieves the best performance, which demonstrates its effectiveness for representation learning in dimensionality reduction. The proposed structure-aware model
is a general-purpose representation learning model that requires no iterative computation, which lowers the computational cost. In global structure awareness, a simple scatter optimization improves the efficiency of the new algorithm and reduces the complexity of the model. Furthermore, the key feature of the model is that it constructs a structure graph capturing the all-around relationships among samples, and this generality allows it to extend naturally to other areas of representation learning. It can thus provide a new perspective for future research on graph-based representation learning. |