
The Improved Fuzzy Clustering Algorithm Based On Witten Framework

Posted on: 2016-07-09
Degree: Master
Type: Thesis
Country: China
Candidate: L Liu
GTID: 2308330464459169
Subject: Computer application technology
Abstract/Summary:
Clustering is one of the important methods of data mining. It divides a data set with no pre-assigned class labels into several clusters based on similarity, so that objects in the same cluster are highly similar while objects in different clusters are not. The classic algorithms usually work well on low-dimensional data sets, but on high-dimensional data sets they are much less effective because of their inherent deficiencies, and sometimes they fail to produce results at all. To address this problem, in 2012 Witten proposed a sparse clustering framework that selects features by weighting the attributes and applying a lasso-type penalty. This approach is effective for processing large-scale data sets.
An improved FCM algorithm is proposed to overcome the original algorithm's tendency to fall into local optima. It uses the selection, crossover and mutation operators of the genetic algorithm (GA) to optimize the objective function.
The newly proposed algorithm combines the Witten sparse clustering framework with the improved FCM to select features and obtain sparse clusters, and it replaces the Euclidean distance with the space distance to relax the restriction on the data space. Traditional clustering algorithms generally use the Euclidean distance, which works well only on spherically distributed data, whereas the spatial structure of real data varies. The space distance is therefore used in this paper because it places no restriction on the data space and is universally adaptable.
Thus, we propose a new algorithm: the improved fuzzy clustering algorithm based on the Witten framework. Experiments on several data sets under a variety of evaluation criteria show that the new algorithm is effective on both small and large-scale data sets, especially high-dimensional ones.
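To illustrate the lasso-type feature weighting mentioned in the abstract, the following is a minimal sketch of the weight update commonly used in sparse clustering of this kind: soft-threshold a per-feature score, rescale to unit L2 norm, and pick the threshold by bisection so that the L1 norm of the weights does not exceed a bound. The function names, the `scores` input and the bound `s` are assumptions made for this example; the thesis's own objective, distance measure and GA-based FCM step are not reproduced here.

```python
import numpy as np


def soft_threshold(a, delta):
    """Shrink each entry of a toward zero by delta (lasso-style operator)."""
    return np.sign(a) * np.maximum(np.abs(a) - delta, 0.0)


def sparse_feature_weights(scores, s, tol=1e-8, max_iter=100):
    """Return non-negative weights w with ||w||_2 = 1 and ||w||_1 <= s.

    scores : hypothetical per-feature separation scores (larger = more useful).
    s      : L1 bound, assumed to satisfy 1 <= s <= sqrt(number of features).
    """
    a = np.maximum(np.asarray(scores, dtype=float), 0.0)

    def normalized(delta):
        w = soft_threshold(a, delta)
        norm = np.linalg.norm(w)
        return w / norm if norm > 0 else w

    w = normalized(0.0)
    if w.sum() <= s:
        return w  # the L1 constraint is already satisfied without shrinkage

    # Bisection on the threshold: a larger delta gives a sparser weight vector.
    lo, hi = 0.0, a.max()
    for _ in range(max_iter):
        mid = (lo + hi) / 2.0
        if normalized(mid).sum() > s:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return normalized(hi)


# Usage: four features, two of which carry most of the signal.
weights = sparse_feature_weights([10.0, 5.0, 1.0, 0.1], s=1.2)
```

Features whose score falls below the chosen threshold receive a weight of exactly zero, which is how a lasso-type penalty performs feature selection on high-dimensional data. In a full algorithm, these weights would be recomputed in alternation with the cluster assignment step.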
Keywords/Search Tags:Clustering Algorithm, Sparse Clustering, FCM Algorithm, Space Distance