
The Research And Comparative Analysis Of Cluster Validity Index

Posted on: 2017-06-14
Degree: Master
Type: Thesis
Country: China
Candidate: S S Hou
Full Text: PDF
GTID: 2348330566957465
Subject: Computer technology
Abstract/Summary:
Clustering is one of the most important research methods in data mining. The quality of a clustering result depends heavily on determining the number of clusters in advance, and a cluster validity index can be used to measure the validity of a clustering and to determine the best number of clusters for a dataset. In this thesis, a new cluster validity index is proposed on the basis of a comparison and analysis of existing validity indices, and it is verified by a series of experiments. First, the thesis studies the theoretical basis of cluster analysis: it reviews several of the most popular clustering algorithms, compares their advantages and shortcomings, and examines the theory, procedure and performance of the fuzzy c-means (FCM) clustering algorithm, because FCM is the basis of most validity indices. Second, it studies the theoretical basis of cluster validity indices, demonstrates experimentally the importance of knowing the correct number of clusters before clustering, and compares several popular validity indices in preparation for the proposed index. To address the problem that fuzzy clustering needs to know the best number of clusters, a new validity index for fuzzy clustering based on belong proportion, named Vnew, is proposed with reference to existing cluster validity indices. The index is built in three steps. First, a basic validity index is constructed that takes full account of the partition matrix and geometric structure of the given dataset by redefining the separation distance between clusters. Second, the concept of belong proportion is defined to amplify the computed value. Finally, the number of clusters is introduced to restrain the index, because belong proportion may have an excessive influence. The experiments show Vnew to be more reliable and more accurate than the classical indices, because it still makes the right decision even when the clusters in the given dataset overlap, so the new index has some value for wider application.
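
As a rough illustration of how a validity index is used to choose the number of clusters, the sketch below runs a plain FCM and scores each candidate cluster number with two classical indices, the partition coefficient and the Xie-Beni index. It is not the Vnew index, whose formula is not given in this abstract. The function names, the fuzzifier m = 2, the random initialisation and the stopping tolerance are all assumptions made for the example.

```python
import numpy as np


def _centers(X, U, m):
    """Cluster centers as membership-weighted means of the data points."""
    W = U ** m
    return (W.T @ X) / W.sum(axis=0)[:, None]


def fcm(X, c, m=2.0, max_iter=200, tol=1e-6, seed=0):
    """Plain fuzzy c-means; returns (centers, U) with U of shape (n, c)."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)            # each row sums to 1
    for _ in range(max_iter):
        centers = _centers(X, U, m)
        # squared point-to-center distances, shape (n, c)
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)               # guard against division by zero
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return _centers(X, U, m), U


def partition_coefficient(U):
    """Mean squared membership; lies in [1/c, 1], larger means a crisper partition."""
    return float((U ** 2).sum() / U.shape[0])


def xie_beni(X, centers, U, m=2.0):
    """Compactness divided by minimum center separation; smaller is better."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    compactness = ((U ** m) * d2).sum()
    cd2 = ((centers[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(cd2, np.inf)                # ignore a center's distance to itself
    return float(compactness / (X.shape[0] * cd2.min()))


def scan_cluster_numbers(X, candidates=range(2, 7)):
    """Score every candidate cluster number with both indices."""
    scores = {}
    for c in candidates:
        centers, U = fcm(X, c)
        scores[c] = (partition_coefficient(U), xie_beni(X, centers, U))
    return scores
```

Calling `scan_cluster_numbers(X)` on a data matrix `X` of shape (n, d) returns one (partition coefficient, Xie-Beni) pair per candidate. The usual reading is to prefer the cluster number with the largest partition coefficient or the smallest Xie-Beni value. Classical indices of this kind can mislead when clusters overlap, which is the case the abstract says Vnew is designed to handle.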
Keywords/Search Tags: Fuzzy clustering, fuzzy c-means (FCM), cluster validity index, separation distance, partition matrix