Research And Implementation On Clustering Algorithm Based On Internal Constrained Multi-view K-means

Posted on: 2019-11-08
Degree: Master
Type: Thesis
Country: China
Candidate: C H Ji
GTID: 2428330593450435
Subject: Computer technology
Abstract/Summary:
As more and more multi-view data are collected, how to apply traditional clustering algorithms to multi-view data has been widely studied. Among these algorithms, K-means is frequently extended because of its efficiency on large-scale datasets. Building on K-means and targeting multi-view data for which no domain knowledge is available, this paper presents a clustering algorithm based on internal constrained multi-view K-means (ICMK). The algorithm forms a multi-view interactive structure automatically and then achieves higher-quality clustering results without any domain knowledge.

Firstly, based on the multi-view datasets and on the "view centroids" and "god centroids" proposed in this paper, the ICMK algorithm forms a multi-view interactive structure automatically. The basic idea of this step is as follows: when multi-view datasets are clustered with an unsupervised algorithm, the different views are equivalent to "observing" or "describing" the samples from different perspectives. It can be assumed that most views are right, i.e. "the minority is subordinate to the majority". If most views are irrelevant or even opposite to the target of clustering, however, this algorithm may not be applicable. In this step, the views are ranked according to the characteristics of the multi-view datasets by using the K-means clustering algorithm, and the view that ranks first is treated as the main view. The effectiveness of this step is verified by evaluating it on standard datasets and comparing the result with the actual view relationships of those datasets.

Secondly, by labelling the confident samples and modifying the seed samples, the traditional unsupervised K-means clustering algorithm is improved into two algorithms: an improved unsupervised K-means clustering algorithm and an improved semi-supervised K-means clustering algorithm. The effectiveness of both is verified by evaluating them on standard datasets and comparing the results with those of the traditional unsupervised K-means clustering algorithm and with the expected results of the improved algorithms.

Finally, based on the multi-view interactive structure and the two improved K-means algorithms, the confidence of each sample is updated within a view and passed between views. The confidences of all samples form the confidence matrix, which represents the clustering result for the multi-view dataset. To verify the effectiveness of the algorithm, this paper evaluates the proposed method on three standard datasets and compares it with several baseline methods. The experimental results show that ICMK produces higher-quality clustering results, and that it performs particularly well on the standard WTP dataset. In addition, based on the characteristics of the parameters and the clustering results on the standard datasets, further experiments are conducted to shed light on how to select the optimal parameters for the ICMK algorithm.
Keywords/Search Tags: Multi-view, clustering, internal constrained, K-means
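
The view-ranking step described in the abstract can be illustrated with a small sketch. The abstract does not state the actual ranking criterion, nor how "view centroids" and "god centroids" are computed, so the code below is only an assumption-laden stand-in: it clusters each view with plain K-means and ranks views by their mean Rand-index agreement with the other views, so that a view agreeing with the majority ranks first. The function names (`kmeans`, `rand_index`, `rank_views`), the deterministic seeding, and the toy data are all hypothetical and not taken from the thesis.

```python
from itertools import combinations


def kmeans(points, k, iters=20):
    """Plain Lloyd's K-means. Seeds are the first k points (an assumption
    made here only to keep the sketch deterministic)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: mean of each cluster; an empty cluster keeps its centroid.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def rand_index(a, b):
    """Fraction of sample pairs on which two labelings agree
    (same cluster in both, or different clusters in both)."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def rank_views(views, k):
    """Rank views by mean agreement with all other views (assumed criterion).
    Returns (view indices from best to worst, per-view scores)."""
    labelings = [kmeans(v, k) for v in views]
    n = len(views)
    scores = [
        sum(rand_index(labelings[i], labelings[j]) for j in range(n) if j != i)
        / (n - 1)
        for i in range(n)
    ]
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    return order, scores


# Toy data: six samples described by three views. Views 0 and 1 group the
# samples the same way; view 2 groups them differently.
views = [
    [[0.0], [5.0], [0.2], [5.1], [0.1], [4.9]],
    [[1, 1], [9, 9], [1, 2], [8, 9], [2, 1], [9, 8]],
    [[0.0], [0.1], [0.2], [7.0], [7.1], [7.2]],
]
order, scores = rank_views(views, k=2)
main_view = order[0]
```

On this toy data the two mutually consistent views tie for first place and the dissenting third view ranks last, which mirrors the majority assumption stated in the abstract. The sketch also shows the stated limitation: if most views were unrelated to the target clustering, the same criterion would promote the wrong view.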