Research On Weighted Kernel FCM Algorithm With Double Variables And Its Validity Evaluation

Posted on: 2019-08-12
Degree: Master
Type: Thesis
Country: China
Candidate: G Y Feng
Full Text: PDF
GTID: 2428330548485956
Subject: Computer technology
Abstract/Summary:
In the age of big data, it is important to convert massive data into valuable information quickly, and clustering has become an important processing strategy. In the field of clustering, the classical FCM algorithm cannot achieve good results on data sets whose clusters are not globular. Nonlinear data can be handled by mapping it to a high-dimensional feature space, but choosing a kernel function for that space is a complicated problem. This dissertation proposes a Weighted Kernel FCM Algorithm with Double Variables, which meets the need of different data sets and images for different kernel functions. Under unsupervised conditions, it can partition both linear and nonlinear data.

In addition, how to evaluate a clustering algorithm is a classic problem in the clustering field, and it matters that results are evaluated correctly. Validity evaluation obtains the optimal number of clusters by establishing a validity index. Some existing indexes consider only the structure of the data set, and others consider only the membership degrees; both types have defects. Better results can be obtained by considering both the structure of the data set and the membership degrees. This dissertation presents an index for complex data sets and illustrates its advantages through both theoretical analysis and experimental results.

The main work and innovations are as follows:

(1) Several classical fuzzy clustering algorithms are analyzed. The fuzzy c-means clustering algorithm, the possibilistic clustering algorithm and the kernel clustering algorithm are discussed. The dissertation reviews the state of the art of improved FCM algorithms and analyzes their deficiencies: sensitivity to outliers and noise, low classification accuracy in high-dimensional spaces, coincident clusters, and the choice of kernel function.

(2) A Weighted Kernel FCM Algorithm with Double Variables is proposed. The algorithm combines the advantages of the possibilistic clustering algorithm and the fuzzy clustering method. It avoids the problem of kernel function selection, the sensitivity of the FCM algorithm to outliers, and the tendency of the PCM algorithm to generate coincident clusters. Furthermore, by mapping low-dimensional data to a high-dimensional space, it can process nonlinear data, which improves the robustness of the algorithm.

(3) A fuzzy clustering validity index for complex data sets (targeting geometric structure and cluster size) is proposed. Compactness is measured by the within-class squared error weighted by the membership degrees. Separation is measured by the minimum distance between cluster centers and the distance from the cluster centers to the average cluster center. These measures are combined into a new fuzzy clustering validity index, and the optimal number of clusters is obtained automatically as the number of clusters at which the index reaches its extreme value.
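The abstract gives neither the update equations of the proposed algorithm nor the formula of the proposed validity index, so neither is reproduced here. As a rough illustration of the general workflow it describes (cluster with a fuzzy method, score each candidate cluster number with a compactness/separation index, and pick the extreme value), the following minimal sketch uses the classical FCM algorithm and the well-known Xie-Beni index as stand-ins. The function names `fcm`, `xie_beni` and `best_cluster_count`, the fuzzifier `m=2.0`, and the candidate range are all assumptions made for this example and do not come from the dissertation.

```python
import numpy as np


def fcm(X, c, m=2.0, n_iter=300, tol=1e-6, seed=0):
    """Classical fuzzy c-means (baseline stand-in, not the thesis's algorithm)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Random initial membership matrix U (n x c); each row sums to 1.
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        Um = U ** m
        # Centers are membership-weighted means of the points.
        V = (Um.T @ X) / Um.sum(axis=0)[:, None]
        # Squared Euclidean distance from every point to every center.
        d2 = ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)  # guard against division by zero
        # Standard update: u_ik is proportional to d_ik^(-2/(m-1)).
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    # Recompute the centers so that they match the returned memberships.
    Um = U ** m
    V = (Um.T @ X) / Um.sum(axis=0)[:, None]
    return U, V


def xie_beni(X, U, V, m=2.0):
    """Xie-Beni index: membership-weighted compactness over separation.

    Lower values indicate a better partition. This is a stand-in for the
    index proposed in the dissertation, whose formula is not in the abstract.
    """
    X = np.asarray(X, dtype=float)
    V = np.asarray(V, dtype=float)
    d2 = ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)
    compactness = ((np.asarray(U, dtype=float) ** m) * d2).sum()
    center_d2 = ((V[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(center_d2, np.inf)  # ignore zero self-distances
    separation = center_d2.min()         # closest pair of centers
    return compactness / (X.shape[0] * separation)


def best_cluster_count(X, candidates=range(2, 7)):
    """Run FCM for each candidate c and return the c with the lowest index."""
    scores = {}
    for c in candidates:
        U, V = fcm(X, c)
        scores[c] = xie_beni(X, U, V)
    return min(scores, key=scores.get), scores
```

Calling `best_cluster_count(X)` on an `(n, d)` array returns the selected cluster number together with the score for every candidate. A kernel variant would replace the squared Euclidean distance with a kernel-induced distance, and a possibilistic variant would add a second set of typicality variables; both are left out because the abstract does not specify them.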
Keywords/Search Tags: Fuzzy clustering, possibilistic clustering, multiple kernel function, clustering analysis, validity index