Research On The Extension Of Fuzzy C-means Clustering Algorithm

Posted on: 2020-09-15
Degree: Master
Type: Thesis
Country: China
Candidate: L W Zhu
Full Text: PDF
GTID: 2428330599961201
Subject: Operational Research and Cybernetics
Abstract/Summary:
Cluster analysis aims to partition a set of data points into meaningful subsets according to specific criteria, so as to reveal the intrinsic structure of the data. It belongs to unsupervised learning and has been widely applied and rapidly developed in fields such as machine learning, data mining, and information processing. Among the many clustering algorithms, Fuzzy C-Means (FCM), which is based on C-Means, is a very competitive one. Through fuzzy membership degrees, FCM generalizes C-Means from "hard" clustering to the "soft" case, turning the clustering problem into one of finding a fuzzy partition of the data set.

Although FCM has received a great deal of attention and research, some issues remain to be addressed. For example: i) the objective function of traditional FCM does not account for class imbalance in the data, which makes FCM unsuitable for clustering unbalanced data sets; ii) FCM is an unsupervised method and does not use partial prior (label) information, which leaves it outside current research on semi-supervised learning. How to extend FCM to more general forms is therefore a worthwhile research question. To serve a wider range of application scenarios, this thesis studies extensions of FCM for the class-imbalance case and the semi-supervised setting. Concretely, the main contributions are:

1) A balanced FCM (BFCM for short) is proposed. To address the drawback that arises when FCM is applied to unbalanced data (the "uniform effect"), an orthogonality penalty on the fuzzy membership matrix is added to the FCM objective function, so as to enforce a balance between "big" and "small" classes. As a result, BFCM is more effective than FCM on unbalanced data sets.

2) A semi-supervised BFCM (Semi-Supervised Balanced FCM, SBFCM for short) is proposed. Because BFCM is unsupervised, it cannot use prior (label) information. A semi-supervised term is merged into the objective function of BFCM so that partial prior (label) information can be exploited to guide clustering. SBFCM is thereby suitable for semi-supervised learning.
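
For orientation, the baseline that the thesis extends can be sketched as follows. This is a minimal pure-Python sketch of standard FCM only, alternating the usual center and membership updates. It is not the BFCM or SBFCM method: the exact form of the orthogonality penalty and of the semi-supervised term is not given in this abstract, so neither is reproduced here. The function name `fcm`, its parameters (`m`, `iters`, `tol`, `seed`), and the random initialisation are illustrative assumptions.

```python
import random


def fcm(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Standard fuzzy C-means (a baseline sketch, not the thesis's BFCM/SBFCM).

    points: list of equal-length tuples of floats
    c:      number of clusters
    m:      fuzzifier, must be > 1
    Returns (centers, U), where U[i][k] is the membership of point i in cluster k.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships; each row is normalised to sum to 1.
    U = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        s = sum(row)
        U.append([u / s for u in row])

    centers = []
    for _ in range(iters):
        # Center update: weighted mean of the points with weights u_ik^m.
        centers = []
        for k in range(c):
            w = [U[i][k] ** m for i in range(n)]
            total = sum(w)
            centers.append(tuple(
                sum(w[i] * points[i][j] for i in range(n)) / total
                for j in range(dim)))

        # Membership update: u_ik is proportional to d_ik^(-2/(m-1)).
        shift = 0.0
        for i in range(n):
            d2 = [sum((points[i][j] - centers[k][j]) ** 2 for j in range(dim))
                  for k in range(c)]
            if min(d2) == 0.0:
                # The point coincides with one or more centers:
                # split its membership evenly among those centers.
                zeros = [k for k in range(c) if d2[k] == 0.0]
                new = [1.0 / len(zeros) if k in zeros else 0.0
                       for k in range(c)]
            else:
                inv = [x ** (-1.0 / (m - 1.0)) for x in d2]
                s = sum(inv)
                new = [v / s for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new, U[i])))
            U[i] = new

        # Stop once no membership changed by more than tol.
        if shift < tol:
            break
    return centers, U
```

Calling `fcm(points, c=2)` on a list of 2-D tuples returns the two centers and one membership row per point, with each row summing to 1. A hard label for a point can be read off as the index of its largest membership.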
Keywords/Search Tags: Fuzzy C-means clustering, Class imbalance, Semi-supervised learning, Orthogonality constraint, Cluster purity