Research And Implementation Of Subspace Multi-Clustering Model Based On Granular Computing

Posted on: 2013-07-24
Degree: Master
Type: Thesis
Country: China
Candidate: L L Huang
GTID: 2248330371496856
Subject: Computer technology
Abstract/Summary:
As the volume of information grows, data are becoming more complex, and the demands placed on clustering algorithms are growing with them in applications such as gene expression analysis, sensor surveillance, text analysis and customer segmentation. Traditional clustering algorithms generally look for a single clustering solution of the data, which can no longer meet these application requirements well. Attention has therefore turned to multiple clustering solutions. Informally, multiple clustering provides more insight than a single solution, because each solution contributes additional knowledge. Users may choose one or several of these solutions.

Three different models of subspace multiple clustering are proposed in this thesis. The first two models adopt different strategies. The first model uses the ENCLUS algorithm to find significant subspaces and then applies the K-Means algorithm in each of these subspaces to find clusters. The concepts of the cost of a clustering and the similarity between clusterings are then proposed; these are the main innovations of this model. Using these concepts, the model reduces the set of clusterings obtained earlier. In this step, choosing the similarity threshold yields result sets of different sizes; in other words, the parameter controls the coarseness of the final result set. The second model runs K-Means on the data set in each single-attribute space and then marks every object according to the clustering results. Comparing the marks of the objects reveals the similarity between objects and, from that, some special clusters. Drawing on the ideas of granular computing, the thesis then builds a third model: granular computing based subspace multiple clustering.
This third model summarizes and extends the two preceding models at a higher level of abstraction.

Experimental results on a simulated customer data set show that, with a similarity threshold of 2, model one achieves a reduction rate of 65%, giving a well-reduced result set that makes the clustering results easier to interpret. Model two finds some special clusters that are hard to find with other multiple clustering algorithms. The third model, based on granular computing, can provide better guidance for subsequent work.
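The second model is described only in outline, so the following is a minimal Python sketch of one possible reading of it, not the thesis's implementation. It clusters each attribute on its own with a simple one-dimensional K-Means, marks every object with the tuple of its per-attribute cluster labels, and compares marks. The function names (`kmeans_1d`, `mark_objects`, `similarity`, `group_by_marks`), the deterministic initialisation, the use of one `k` for all attributes, and the choice to count matching labels as the similarity are all assumptions made for illustration.

```python
from collections import defaultdict


def kmeans_1d(values, k, iters=50):
    """Plain Lloyd-style k-means on one attribute; returns one label per value."""
    ordered = sorted(values)
    # Assumed initialisation: spread the k centres evenly over the sorted values.
    centres = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assign every value to its nearest centre (ties go to the lower index).
        labels = [min(range(k), key=lambda c: abs(v - centres[c])) for v in values]
        new_centres = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            # An empty cluster keeps its previous centre.
            new_centres.append(sum(members) / len(members) if members else centres[c])
        if new_centres == centres:
            break
        centres = new_centres
    return labels


def mark_objects(rows, k):
    """Cluster each attribute separately and mark each object with its label tuple."""
    columns = list(zip(*rows))
    per_attribute = [kmeans_1d(list(col), k) for col in columns]
    return [tuple(labels) for labels in zip(*per_attribute)]


def similarity(mark_a, mark_b):
    """Assumed similarity: number of attributes on which two objects share a cluster."""
    return sum(a == b for a, b in zip(mark_a, mark_b))


def group_by_marks(marks):
    """Objects with identical marks are collected into one candidate cluster."""
    groups = defaultdict(list)
    for index, mark in enumerate(marks):
        groups[mark].append(index)
    return dict(groups)


# Hypothetical customer records: (attribute 1, attribute 2).
rows = [(1.0, 10.0), (1.2, 50.0), (0.9, 11.0), (5.0, 52.0), (5.1, 9.5), (4.8, 51.0)]
marks = mark_objects(rows, k=2)
print(group_by_marks(marks))
```

In this sketch, objects that agree on every attribute end up in the same group, while `similarity` gives a graded comparison for objects that agree on only some attributes. How the thesis actually derives its "special clusters" from the marks is not stated in the abstract.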
Keywords/Search Tags: Multi-Clustering, Granular Computing, Subspace Clustering, Cost of a Clustering