
Fuzzy Clustering Based On The Average Of The W-distance Discretization

Posted on: 2013-02-14
Degree: Master
Type: Thesis
Country: China
Candidate: R L Zhang
GTID: 2218330374463627
Subject: Computer applications
Abstract/Summary:
In data mining and machine learning, discretizing continuous attributes can reduce the time and space complexity of an algorithm, improve its learning accuracy and clustering quality, and strengthen the system's resistance to noise. Many data mining and machine learning algorithms can only handle discrete data, so discretization of continuous attributes is often necessary. Most discretization algorithms achieve the desired effect to some extent, but the discretization process often ignores the true nature of the data, so the resulting discretization is not reasonable. This thesis studies a fuzzy clustering algorithm based on the w-distance mean and a discretization method built on it. The main work is as follows:

1) A fuzzy clustering algorithm based on the w-distance mean (w-MDFCM) is presented. First, the initial cluster centers are determined from the distribution of the data set using the idea of the mean distance, and a regulating factor w is introduced to adjust the mean distance. Second, each sample in the data set is assigned a weight, and the cluster center formula and the objective function are modified to include this weight, which greatly improves the algorithm's robustness to noise. Finally, experimental results show that the algorithm selects initial cluster centers well, avoids convergence to local optima, and is both effective and robust to noise.

2) Building on this, an algorithm for discretizing continuous attributes based on w-MDFCM is presented. The algorithm uses the concept of compatibility degree from rough set theory to dynamically adjust the clustering parameters, so that the consistency level of the information system is maintained before and after discretization. Experimental results on UCI data sets validate the effectiveness of the algorithm.
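The abstract does not give the w-MDFCM formulas, so the following is only a minimal sketch of the general idea: a fuzzy c-means loop on a single continuous attribute in which each sample carries a weight in the center update, followed by a discretization step. The function names, the choice of passing initial centers in as an argument (rather than deriving them from the w-adjusted mean distance), and the use of midpoints between adjacent centers as cut points are all assumptions made for illustration, not the thesis's method.

```python
import bisect

def weighted_fcm_1d(xs, weights, init_centers, m=2.0, iters=100, tol=1e-9):
    """Sample-weighted fuzzy c-means on 1-D data (illustrative sketch only).

    xs           -- list of attribute values
    weights      -- one positive weight per sample (hypothetical weighting)
    init_centers -- starting centers, supplied by the caller (assumption)
    m            -- fuzzifier, m > 1
    """
    centers = list(init_centers)
    p = 2.0 / (m - 1.0)
    for _ in range(iters):
        # Membership update: u_ik is proportional to d_ik^(-2/(m-1)).
        memberships = []
        for x in xs:
            d = [abs(x - c) for c in centers]
            if 0.0 in d:
                # Sample sits exactly on a center: give it crisp membership.
                row = [1.0 if di == 0.0 else 0.0 for di in d]
            else:
                row = [di ** -p for di in d]
            total = sum(row)
            memberships.append([v / total for v in row])

        # Center update: each sample contributes weight * membership^m.
        new_centers = []
        for i, old in enumerate(centers):
            coef = [w * u[i] ** m for w, u in zip(weights, memberships)]
            den = sum(coef)
            num = sum(c * x for c, x in zip(coef, xs))
            new_centers.append(num / den if den > 0 else old)

        shift = max(abs(a - b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return sorted(centers)

def cut_points(centers):
    """Midpoints between adjacent sorted centers (an assumed cut-point rule)."""
    cs = sorted(centers)
    return [(a + b) / 2.0 for a, b in zip(cs, cs[1:])]

def discretize(x, cuts):
    """Map a continuous value to the index of its interval."""
    return bisect.bisect_right(cuts, x)
```

A caller would pick the weights (for example, smaller values for suspected outliers), run `weighted_fcm_1d` on one attribute, and then apply `discretize` with the resulting cut points. The rough-set step described in 2), which adjusts the clustering parameters until the compatibility degree is preserved, is not modeled here.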
Keywords/Search Tags:Discrete, w-distance mean, Fuzzy clustering, Compatibility degree, Regulating parameters