
An Improved LFCM Algorithm Based On Iterated Entropy Weight And Its Application In Imbalanced Data Sets

Posted on: 2019-03-01
Degree: Master
Type: Thesis
Country: China
Candidate: R Y Zhang
GTID: 2348330569479971
Subject: Electronics and Communications Engineering
Abstract/Summary:
Cluster analysis is a classical family of unsupervised machine learning methods. The Literal Fuzzy C-Means (LFCM) algorithm is one of the better partition-based clustering algorithms. It uses a fuzzy membership matrix to express the degree to which each input sample belongs to each cluster center, and it stops iterating once the difference between the new and old cluster centers falls below a set threshold. The algorithm is simple, easy to understand, and fast on small data sets, but its clustering accuracy is low. In addition, the data of many industry application systems have in recent years become increasingly imbalanced, and the existing LFCM algorithm does not cluster imbalanced data sets well. This poses a hidden danger to subsequent data analysis. How to improve the effectiveness of clustering algorithms on imbalanced data sets has therefore been a hot topic in recent years.

As a cluster analysis method, the entropy weight method measures the degree of disorder of each attribute by computing its information entropy, and then derives a weight for each attribute from that entropy for use in clustering. In this paper, the entropy weight method is applied to 8 selected UCI data sets in a clustering experiment. The experimental results show that, compared with the LFCM algorithm, the entropy weight method increases the average clustering accuracy by 8.9%, so it clusters more accurately than LFCM. However, the entropy weight method does not take into account the disorder of the data after each clustering pass, so its clustering effect still leaves room for improvement.

For this reason, this paper proposes an improved LFCM algorithm based on iterated entropy weight, which builds further on the entropy weight method. The improvements are as follows: (1) a data matrix is obtained by combining the input samples with the membership functions, and new information entropy and weights are computed from this data matrix; (2) the iteration stops when the difference between the new and old weights is less than a set threshold. The improved algorithm is then applied to the 8 data sets in clustering experiments to verify its clustering performance. The experimental results show that the LFCM algorithm based on iterated entropy weight increases the average clustering accuracy by 10.2% compared with the LFCM algorithm and by 1.3% compared with the entropy weight method. The improved algorithm therefore improves clustering accuracy effectively relative to both.

Finally, the improved LFCM algorithm based on iterated entropy weight is applied to imbalanced data sets to verify its clustering performance on them. The experimental results show that, compared with the LFCM algorithm, the average purity, compactness and separation of the improved algorithm improve by 7.9%, 0.2282 and 0.962 respectively. Compared with the entropy weight method, they increase by 0.6%, 0.2899 and 0.524 respectively. Compared with both LFCM and the entropy weight method, the proposed algorithm effectively improves clustering purity, compactness and separation on imbalanced data sets, and thus improves the overall clustering effect.
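The building blocks named in the abstract can be illustrated with a small sketch. It is not the thesis's implementation: it assumes the standard textbook entropy weight formula (min-max scaling, column-wise proportions, normalized Shannon entropy) and a plain fuzzy C-means loop with an attribute-weighted squared Euclidean distance. The function names `entropy_weights` and `weighted_fcm`, the fuzzifier `m=2.0`, the threshold `eps`, and the choice of the first `c` samples as initial centers are all assumptions made for illustration.

```python
import math

def entropy_weights(X):
    """Attribute weights from the standard entropy weight method (assumed form)."""
    n, m = len(X), len(X[0])          # n samples (n > 1), m attributes
    diversities = []
    for j in range(m):
        col = [row[j] for row in X]
        lo, hi = min(col), max(col)
        span = (hi - lo) or 1.0
        scaled = [(v - lo) / span for v in col]      # min-max scale to [0, 1]
        total = sum(scaled)
        if total == 0:                               # constant attribute: no information
            diversities.append(0.0)
            continue
        p = [v / total for v in scaled]              # proportion of each sample
        e = -sum(q * math.log(q) for q in p if q > 0) / math.log(n)
        diversities.append(1.0 - e)                  # lower entropy -> higher weight
    s = sum(diversities)
    if s == 0:
        return [1.0 / m] * m                         # fall back to equal weights
    return [d / s for d in diversities]

def weighted_fcm(X, c, weights, m=2.0, eps=1e-5, max_iter=100):
    """Fuzzy C-means with an attribute-weighted squared Euclidean distance."""
    centers = [list(x) for x in X[:c]]   # simplification: first c samples as centers
    dims = range(len(X[0]))
    U = []
    for _ in range(max_iter):
        # Membership update: u_ik is proportional to d_ik^(-2/(m-1)).
        U = []
        for x in X:
            d2 = [max(sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, v)), 1e-12)
                  for v in centers]
            inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
            s = sum(inv)
            U.append([t / s for t in inv])
        # Center update: mean of the samples weighted by u_ik^m.
        new_centers = []
        for i in range(c):
            um = [u[i] ** m for u in U]
            tot = sum(um)
            new_centers.append([sum(g * x[j] for g, x in zip(um, X)) / tot
                                for j in dims])
        # Stop when no center coordinate moved by eps or more.
        shift = max(abs(a - b)
                    for v, nv in zip(centers, new_centers)
                    for a, b in zip(v, nv))
        centers = new_centers
        if shift < eps:
            break
    return centers, U

if __name__ == "__main__":
    data = [[0.0, 5.0], [5.0, 5.1], [0.1, 5.2], [5.1, 5.0], [0.2, 4.9], [4.9, 4.8]]
    w = entropy_weights(data)
    centers, U = weighted_fcm(data, 2, w)
    print("weights:", w)
    print("centers:", centers)
```

The sketch keeps the weights fixed during clustering. The iterated variant described in the abstract would instead recompute the weights after each pass from a data matrix that combines the samples with the memberships, and would stop once the weights change by less than a threshold. The abstract does not give the form of that matrix, so it is left out here.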
Keywords/Search Tags: cluster analysis, LFCM, entropy weight method, imbalanced data set