
Fuzzy Cluster Validity Research

Posted on: 2020-10-31
Degree: Master
Type: Thesis
Country: China
Candidate: J Y Geng
Full Text: PDF
GTID: 2370330578464004
Subject: Computer Science and Technology
Abstract/Summary:
Clustering is an important research topic in pattern recognition, image processing and machine learning, and cluster analysis has attracted extensive attention from scholars at home and abroad. Within it, fuzzy clustering has become an indispensable part of cluster analysis: by introducing the concept of fuzzy sets, it can deal effectively with the fuzziness found in real-world data. Fuzzy C-Means (FCM) is one of the most commonly used fuzzy clustering algorithms. It is simple in design, efficient to run, and can process large data sets effectively. It nevertheless has shortcomings: the optimal number of clusters must be given in advance, and different values of the fuzzy degree m lead to different numbers of clusters. The usual remedy is to judge the quality of the clustering results with a clustering validity analysis, which proposes clustering validity indexes as the basis for that judgment. Most existing clustering validity indexes, however, can only handle data sets with good separability, and cannot make a correct judgment for data sets that contain noise or in which several types of structure coexist. This thesis therefore analyzes the problem from multiple perspectives to find more suitable clustering validity indexes, so that the FCM algorithm can process data sets of different structure types without manual intervention. The main research work is as follows:

(1) A new cluster validity index, referred to as the W index, is proposed to address the inability of existing cluster validity indexes to determine the optimal number of clusters on data sets that contain noise and overlap. The index measures three characteristics: compactness, separation and overlap. The compactness term introduces the distance between two data subclasses, the separation term introduces the minimum membership degree, and the overlap term introduces the product of the squared memberships of two classes. The index can therefore reflect the distribution of the data set from several aspects and, to a certain extent, avoid the interference of noise and overlapping data with the clustering results. Experimental results show that the proposed index evaluates clustering results effectively and overcomes the influence of noise and overlap to determine the optimal number of clusters accurately. In robustness experiments with different values of the fuzzy degree m, the W index also shows good robustness.

(2) Building on the W index, further analysis found that most existing fuzzy clustering validity indexes depend too heavily on the cluster centroids, which prevents them from judging accurately data sets that contain adjacent classes and small classes. To alleviate this problem, the WS clustering validity index is proposed. By using the maximum and minimum membership degree rule together with the fuzzy deviation of the data set, the WS index reduces, to some extent, the dependence on the cluster centers and takes the overall information of the data set into account. The WS index avoids misidentifying adjacent classes as a single class and does not ignore small classes, and so shows better accuracy. Experimental results show that, on data sets with adjacent classes and with large differences in cluster size and density, the WS index accurately finds the optimal number of clusters and completes an effective clustering under different values of the fuzzy degree m.

(3) Finally, an improved automatic grayscale image segmentation algorithm is proposed by combining the new index with the FCM image segmentation algorithm. Experimental results show that the algorithm accurately obtains the optimal number of classes for an image and thus completes automatic image segmentation efficiently.
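
The workflow the abstract describes (run FCM for several candidate cluster numbers, score each result with a validity index, keep the best) can be sketched as follows. This is a minimal illustration, not the thesis's method: the abstract does not give the formulas of the W or WS indexes, so the well-known Xie-Beni index (lower is better) is swapped in as a stand-in. The function names `sq_dists`, `fcm`, `xie_beni` and `best_cluster_number`, the candidate range 2 to 6, and the use of a few random restarts are all assumptions made for this example.

```python
import numpy as np

def sq_dists(X, centers):
    """Squared Euclidean distances between centers and points, shape (c, n)."""
    return ((X[None, :, :] - centers[:, None, :]) ** 2).sum(axis=2)

def fcm(X, c, m=2.0, n_iter=200, tol=1e-6, seed=0):
    """Standard Fuzzy C-Means. Returns (centers, U) with U of shape (c, n)."""
    rng = np.random.default_rng(seed)
    U = rng.random((c, X.shape[0]))
    U /= U.sum(axis=0, keepdims=True)        # memberships of each point sum to 1
    for _ in range(n_iter):
        Um = U ** m
        centers = (Um @ X) / Um.sum(axis=1, keepdims=True)
        d2 = np.maximum(sq_dists(X, centers), 1e-12)   # guard against zero distance
        inv = d2 ** (-1.0 / (m - 1.0))       # u_ij proportional to d_ij^(-2/(m-1))
        U_new = inv / inv.sum(axis=0, keepdims=True)
        done = np.abs(U_new - U).max() < tol
        U = U_new
        if done:
            break
    Um = U ** m
    centers = (Um @ X) / Um.sum(axis=1, keepdims=True)  # centers for the final U
    return centers, U

def xie_beni(X, centers, U, m=2.0):
    """Xie-Beni index (stand-in for W/WS): compactness / (n * min center separation)."""
    compact = ((U ** m) * sq_dists(X, centers)).sum()
    sep = sq_dists(centers, centers)
    np.fill_diagonal(sep, np.inf)            # ignore a center's distance to itself
    return compact / (X.shape[0] * sep.min())

def best_cluster_number(X, c_values=range(2, 7), m=2.0, n_init=5):
    """Score each candidate c and return (best c, all scores)."""
    scores = {}
    for c in c_values:
        runs = [fcm(X, c, m=m, seed=s) for s in range(n_init)]
        # Keep the restart with the lowest FCM objective J_m.
        centers, U = min(runs, key=lambda r: ((r[1] ** m) * sq_dists(X, r[0])).sum())
        scores[c] = xie_beni(X, centers, U, m=m)
    return min(scores, key=scores.get), scores
```

Calling `best_cluster_number(X)` on an `(n, d)` NumPy array returns the candidate cluster number with the lowest score and the score for each candidate. Rerunning it with different values of `m` gives a rough version of the robustness check with respect to the fuzzy degree that the abstract mentions.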
Keywords/Search Tags: Fuzzy C-Means, cluster validity index, optimal number of clusters, fuzzy degree, image segmentation