| With continuing advances in data collection technology, ever more high-dimensional data are being acquired. Such data provide richer and more comprehensive information, but they also bring problems such as the "curse of dimensionality", which makes it difficult for traditional data processing methods to reveal the information inherent in the data. Designing efficient methods for analyzing and processing high-dimensional data has therefore become an important research topic in machine learning, image processing, data mining, and computer vision. To this end, this paper investigates dimensionality reduction and clustering methods for high-dimensional data based on distance metric learning. The specific research work is as follows. First, we propose a semi-supervised dimensionality reduction method based on 7)2,-norm distance. The method addresses a shortcoming of existing dimensionality reduction methods based on distance metric learning: the low-dimensional representation provides no reasonable feedback to the construction of the distance metric matrix. The method couples the low-dimensional representation and the distance metric through interaction terms, so that the dimensionality reduction matrix and the distance metric matrix are learned jointly. The proposed algorithm is further extended with kernels to cope with more complex nonlinear data. Experimental results show that the proposed method is effective and robust in KNN classification, with significant improvements over other classical distance metric learning and dimensionality reduction methods. In addition, to address the problem that traditional distance-based clustering algorithms become less effective because distance metrics degrade in high-dimensional space, we propose a clustering algorithm based on local affine hull distance, which aims to improve how distances are measured for clustering in high-dimensional
space. Specifically, we divide the high-dimensional sample space into multiple local affine hulls and use the distance between an unknown sample and each affine hull to obtain the similarity between samples. The concept of uncorrelated subspaces is also introduced to incorporate the idea of discriminant analysis into the clustering framework. In this framework, clustering generates class labels for the affine model, while the affine model provides subspaces for clustering. To test the effectiveness of the proposed method, we conduct comparative experiments against several existing clustering methods. The experimental results show that the clustering algorithm based on local affine hull distance clusters better in high-dimensional space and better handles the distance metric problem on high-dimensional datasets. To further optimize the clustering algorithm based on local affine hull distance, we propose a clustering algorithm based on hyperdisk distance. The algorithm uses the hyperdisk generated by the affine hull and a hypersphere to strictly constrain the position of samples in the subspace, achieving a more compact approximation of class regions. Specifically, we use the hyperdisk as a local approximation of the samples and redefine the distance metric accordingly, so that the clustering task is carried out more efficiently in the subspace. Finally, the proposed algorithm is compared experimentally with several existing clustering methods. The experimental results show that the clustering algorithm based on hyperdisk distance achieves high clustering accuracy. |
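
The two geometric distances described above can be illustrated with a minimal sketch. It is not the formulation used in this work. The function names are hypothetical, the affine hull is assumed to be spanned by the centered samples of one local group, and the hypersphere is assumed, for simplicity, to be centered at the sample mean with radius equal to the largest sample-to-mean distance (a minimum enclosing ball would be the tighter choice).

```python
import numpy as np


def affine_hull_basis(samples, tol=1e-10):
    """Return the mean and an orthonormal basis of the affine hull of `samples` (n x d)."""
    mu = samples.mean(axis=0)
    centered = samples - mu
    # Right singular vectors with non-negligible singular values span the hull directions.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    rank = int(np.sum(s > tol))
    return mu, vt[:rank].T  # basis has shape (d, rank)


def affine_hull_distance(q, mu, basis):
    """Distance from point q to the affine hull {mu + basis @ v}."""
    diff = q - mu
    residual = diff - basis @ (basis.T @ diff)  # component orthogonal to the hull
    return float(np.linalg.norm(residual))


def hyperdisk_distance(q, mu, basis, radius):
    """Distance from q to the hyperdisk: the affine hull intersected with a
    hypersphere of the given radius centered at mu (assumed center, see lead-in)."""
    diff = q - mu
    in_plane = basis @ (basis.T @ diff)
    off_plane = diff - in_plane
    # Inside the hull, only the part of the projection beyond the sphere counts.
    overshoot = max(0.0, float(np.linalg.norm(in_plane)) - radius)
    return float(np.hypot(np.linalg.norm(off_plane), overshoot))


# Toy group: four points in the plane z = 0 of a 3-D space.
group = np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [2.0, 2.0, 0.0]])
mu, basis = affine_hull_basis(group)
radius = float(np.max(np.linalg.norm(group - mu, axis=1)))

far_in_plane = np.array([5.0, 1.0, 0.0])
print(affine_hull_distance(far_in_plane, mu, basis))
print(hyperdisk_distance(far_in_plane, mu, basis, radius))
```

In this toy case the query lies in the plane of the group but far from its samples, so the affine hull distance is essentially zero while the hyperdisk distance is not. This illustrates why bounding the hull with a hypersphere gives a more compact approximation of a class region.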