Research On The Evaluation Methods Of Cluster Analysis Results

Posted on: 2015-10-27  Degree: Master  Type: Thesis
Country: China  Candidate: Y Hu  Full Text: PDF
GTID: 2298330422490111  Subject: Computer application technology
Abstract/Summary:
Data mining is currently a prominent technology in big data analysis, and cluster analysis is one of its most active areas. Cluster analysis is an unsupervised technique that finds structure in disordered data through detailed analysis, and it is widely used. Many clustering algorithms exist for different data types, and the field has attracted attention from both computer scientists and statisticians. Although these algorithms are widely applied, no single algorithm suits all data types, so quality evaluation indices are very important for judging whether a clustering result is good. Clustering and its evaluation therefore remain relatively difficult in computational terms.

In this paper, different types of data sets are clustered with the K-means algorithm in PolyAnalyst. The clustering results are visualized, and the K-means experiments are analyzed in depth with the support of a large number of scatter plots. Based on the clear and distinctive results shown in the example figures, this paper proposes a novel combined-index evaluation method for verifying the quality of clustering results. First, it introduces the concept of combination: unlike evaluation with a traditional index alone, the method combines traditional indices with new and improved ones to evaluate clustering quality. Second, it introduces the concept of graphic and color share percentage, in which the color percentages of the classes are compared against a threshold to evaluate compliance. Third, it introduces the concept of dispersion, which is calculated for both overall and local evaluation. Finally, as required, the clustering results for the different types of data sets are compared using scatter plots.

Experimental results on different types of data sets show that the proposed combined method for evaluating clustering quality is valid and usable. The experiments also show that the K-means model assessment method and the combined method of this paper are effective and highly usable for clustering real data sets, and that the clustering results are readily interpretable.
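As a rough illustration of the overall and local dispersion idea mentioned in the abstract, the following minimal sketch runs a plain K-means on a toy data set and reports one dispersion value for the whole clustering and one per cluster. The abstract does not give the thesis's actual formulas, so several things here are assumptions: dispersion is taken to be the mean squared distance of points to their cluster centroid, the function names `kmeans` and `dispersion` are hypothetical, the first k points are used as initial centroids for reproducibility, and the toy points are invented for the example. It does not use PolyAnalyst.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=20):
    """Plain K-means (Lloyd's algorithm).

    Assumption: the first k points seed the centroids so the run is
    deterministic; real tools usually use random or smarter seeding.
    """
    centroids = [tuple(float(x) for x in p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(xs) / len(members) for xs in zip(*members))
                )
            else:
                # Keep an empty cluster's centroid where it was.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


def dispersion(points, labels, centroids):
    """Return (overall, local) dispersion.

    Assumed definition: mean squared distance to the assigned centroid,
    over all points (overall) and within each cluster (local).
    """
    sums = [0.0] * len(centroids)
    counts = [0] * len(centroids)
    for p, lab in zip(points, labels):
        sums[lab] += sq_dist(p, centroids[lab])
        counts[lab] += 1
    overall = sum(sums) / len(points)
    local = [s / n if n else 0.0 for s, n in zip(sums, counts)]
    return overall, local


# Invented toy data: two well-separated groups of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
labels, centroids = kmeans(points, k=2)
overall, local = dispersion(points, labels, centroids)
print("labels:", labels)
print("overall dispersion:", overall)
print("local dispersion per cluster:", local)
```

A low overall value with one noticeably higher local value would point to a single loose cluster, which is the kind of distinction a combined overall-and-local evaluation is meant to expose.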
Keywords/Search Tags: cluster analysis, combination assessment, K-means algorithm, PolyAnalyst software