| Clustering is a fundamental task in machine learning, pattern recognition, and data mining. With the rise of the Internet and big data, the same object can be captured through multiple channels or described from multiple perspectives, forming so-called multi-view data. In general, multi-view data exhibits consistency, complementarity, and diversity among views. Multi-view clustering was developed to make full use of the information such data contains. In recent years, various Multi-view Clustering (MVC) algorithms have been proposed, among which self-representation-based multi-view subspace clustering is an important family. However, existing algorithms seldom consider the flexibility of self-representation learning, the noise and outliers in the data, or the effect of the chosen metric on the similarity matrix, all of which degrade their performance. To address these problems, this thesis proposes several improved MVC algorithms. The main contributions are as follows: (1) A novel Double Structure Scaled Simplex Representation for multi-view subspace clustering (DSSSR) is proposed. When the data is noisy, the similarity matrix learned by existing MVC methods is neither clean nor accurate, and it lacks flexibility, which leads to unsatisfactory clustering performance. DSSSR is proposed to address this. First, the multi-view data is vertically concatenated into a single matrix (the cascade matrix). Then, the Scaled Simplex Representation (SSR) algorithm is applied to the concatenated data to obtain a similarity matrix. Because of noise in the data and the differing statistical properties of the views, this similarity matrix is still not clean or accurate, so SSR is applied a second time to obtain a cleaner and more accurate one. In addition, the two SSR steps are integrated into a unified optimization framework, and the
sum of each column vector of the similarity matrix is constrained to equal s (0 < s ≤ 1). Adjusting s to obtain the best clustering performance makes the algorithm more flexible. Finally, an alternating optimization algorithm based on the Augmented Lagrangian Method (ALM) is designed to solve the objective function. Experimental results on several datasets show that DSSSR achieves relatively better clustering performance than some existing algorithms. (2) Matrix factorization and consensus scaled learning for multi-view clustering (MCMVC) is proposed. Most existing self-representation-based multi-view subspace clustering algorithms learn the consensus similarity matrix by directly minimizing the reconstruction error, without fully considering the differences among views and their distinct statistical characteristics. The result is suboptimal and cannot fully represent the intrinsic clustering structure of multi-view data. In addition, the similarity matrices obtained by existing multi-view algorithms are inflexible, which further limits clustering performance. A novel MCMVC algorithm is therefore proposed. First, the reconstructed sample matrix is factorized to obtain a better representation matrix, which is used to learn the consensus matrix. Then, a scaled projection is applied to the consensus matrix, and the parameter s can be adjusted to obtain the best clustering performance. Finally, an augmented Lagrangian method with alternating updates is designed for optimization. Experimental results on several datasets show that the algorithm outperforms some existing algorithms. (3) Multi-view Clustering based on a Multi-metric Matrices Fusion method (MVC3MF) is proposed. Existing MVC methods usually use only a single metric to learn the graph matrix, which cannot fully reveal the true structure among complex samples and leads to unsatisfactory clustering performance. Specifically, MVC3MF first vertically concatenates the multi-view data
into a single matrix. Then, several distance metric matrices are learned from the concatenated matrix. Next, MVC3MF uses an adaptive weighting method to fuse these metric matrices into an optimal metric matrix. Finally, the optimal metric matrix is used to learn the similarity matrix, and a rank constraint is imposed on the similarity matrix to obtain an optimal block-diagonal structure. Experiments on several datasets verify the effectiveness of MVC3MF. |
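Two building blocks mentioned in the abstract can be illustrated with a short sketch: vertically concatenating the views into one matrix, and constraining each column of a coefficient matrix to be nonnegative with entries summing to s. The code below is only an illustration under stated assumptions, not the thesis's implementation. It assumes each view is stored as a features-by-samples array, and it enforces the scaled simplex constraint with the standard sort-based Euclidean projection. The function names `project_scaled_simplex` and `project_columns` are hypothetical, and the random matrices stand in for real data and for a learned representation.

```python
import numpy as np


def project_scaled_simplex(v, s=1.0):
    """Euclidean projection of a vector v onto {c : c >= 0, sum(c) = s}, s > 0."""
    n = v.shape[0]
    u = np.sort(v)[::-1]                  # entries in descending order
    css = np.cumsum(u) - s                # cumulative sums shifted by the target sum
    k = np.arange(1, n + 1)
    # Largest index whose entry stays positive after shifting by the threshold.
    rho = np.nonzero(u - css / k > 0)[0][-1]
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)


def project_columns(C, s=1.0):
    """Apply the scaled simplex projection to every column of C (hypothetical helper)."""
    cols = [project_scaled_simplex(C[:, j], s) for j in range(C.shape[1])]
    return np.column_stack(cols)


# Illustrative data: two views of the same 6 samples, with 4 and 3 features.
rng = np.random.default_rng(0)
views = [rng.standard_normal((4, 6)), rng.standard_normal((3, 6))]

# Vertical concatenation: rows are stacked, the sample (column) axis is shared.
X = np.vstack(views)

# A random stand-in for a learned coefficient matrix, projected with s = 0.5.
C = project_columns(rng.standard_normal((6, 6)), s=0.5)
```

In an actual solver, a projection of this kind would typically be one step inside the alternating updates, and s would be tuned per dataset as the abstract describes.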