
Effectiveness Index Of Fuzzy Clustering And Its Application In Clustering Number Determination

Posted on: 2022-05-16
Degree: Master
Type: Thesis
Country: China
Candidate: J J Huang
GTID: 2518306560955259
Subject: Computer system architecture
Abstract/Summary:
In the age of big data, data accumulates continuously in every part of daily life, and the intrinsic value of these data drives much of the research on clustering algorithms. A clustering effectiveness (validity) index evaluates clustering results and plays a key role in finding the correct number of clusters. Existing indexes, however, have several problems: they struggle to find the correct cluster number when cluster centers lie very close together, their separation terms are too simple, and they perform poorly on data sets that contain noise. To address this, we propose two new clustering effectiveness indexes for fuzzy clustering: Triple Center Relation (TCR) and the Tang-Huang Index (THI). The main work and contributions are as follows:

(1) The fuzzy clustering algorithm is briefly introduced. Because a clustering effectiveness index depends on a clustering algorithm, the data set is first partitioned into clusters, and the index function is then applied to the result to judge whether the clustering is effective. The Fuzzy C-means (FCM) clustering algorithm is used in this thesis.

(2) The Triple Center Relation (TCR) index is proposed. The index is described in terms of the two traditional aspects, compactness and separation. For compactness, TCR defines a new fuzzy cardinality based on the maximum membership degree and combines it with the within-class weighted sum of squared errors to form a new compactness formula. For separation, it starts from the minimum distance between cluster centers and adds the mean distance between cluster centers and the sample variance of the cluster centers; the three are multiplied together to give a more multi-faceted separation measure. Experiments are carried out on UCI data sets and artificial data sets, and the advantages of the new index are explained from both theoretical and experimental perspectives.

(3) The THI index is proposed. It describes compactness and separation from a new direction, summarizes and improves on earlier indexes, is more convenient to compute, and gives better results even on complex data sets. Because any index, good or poor, must be tested on real data, THI is evaluated on two types of data sets, real and artificial. The experimental results indicate that the newly proposed index gives better evaluation results.
Keywords: fuzzy clustering, validity index of fuzzy clustering, compactness, separability, Fuzzy C-means (FCM)
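
The workflow described in the abstract (cluster with FCM, score the result with a validity index, pick the cluster number with the best score) can be illustrated with a minimal Python sketch. The abstract does not give the TCR or THI formulas, so this example substitutes the classical Xie-Beni index, which has the same compactness-over-separation structure and is minimized at the preferred cluster number. The function names (`fcm`, `xie_beni`, `choose_cluster_number`), the fuzzifier `m = 2`, the restart count, and the synthetic three-blob data are all assumptions made for illustration; none of them come from the thesis.

```python
import numpy as np


def sq_dists(X, centers):
    """Squared Euclidean distances, shape (c, n): row i is distance to center i."""
    return ((X[None, :, :] - centers[:, None, :]) ** 2).sum(axis=2)


def fcm(X, c, m=2.0, max_iter=300, tol=1e-6, seed=0):
    """Standard fuzzy C-means. Returns (centers, U, objective); U has shape (c, n)."""
    rng = np.random.default_rng(seed)
    U = rng.random((c, X.shape[0]))
    U /= U.sum(axis=0, keepdims=True)          # each sample's memberships sum to 1
    for _ in range(max_iter):
        Um = U ** m
        centers = (Um @ X) / Um.sum(axis=1, keepdims=True)
        d2 = np.fmax(sq_dists(X, centers), 1e-12)   # guard against division by zero
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=0, keepdims=True)
        shift = np.abs(U_new - U).max()
        U = U_new
        if shift < tol:
            break
    objective = float(((U ** m) * d2).sum())
    return centers, U, objective


def xie_beni(X, centers, U, m=2.0):
    """Xie-Beni index (stand-in for TCR/THI): compactness / (n * min center gap^2)."""
    compactness = ((U ** m) * sq_dists(X, centers)).sum()
    gaps = sq_dists(centers, centers)
    np.fill_diagonal(gaps, np.inf)             # ignore each center's distance to itself
    return float(compactness / (X.shape[0] * gaps.min()))


def choose_cluster_number(X, c_values, m=2.0, n_init=5):
    """Run FCM for each candidate c (best of n_init restarts) and score it."""
    scores = {}
    for c in c_values:
        runs = [fcm(X, c, m=m, seed=s) for s in range(n_init)]
        centers, U, _ = min(runs, key=lambda r: r[2])   # keep lowest FCM objective
        scores[c] = xie_beni(X, centers, U, m=m)
    best_c = min(scores, key=scores.get)       # lower Xie-Beni is better
    return best_c, scores


# Illustrative data only: three well-separated Gaussian blobs.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(loc=mu, scale=0.5, size=(60, 2))
               for mu in [(0.0, 0.0), (10.0, 0.0), (5.0, 9.0)]])

best_c, scores = choose_cluster_number(X, range(2, 7))
print("selected cluster number:", best_c)
for c, score in scores.items():
    print(f"c={c}: Xie-Beni = {score:.4f}")
```

Swapping in a different index only requires replacing `xie_beni` with another function of the data, the centers, and the membership matrix. Note that some indexes are maximized rather than minimized, in which case the selection step would use `max` instead of `min`.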