The Research On Clustering Algorithm For Categorical Data Using Quantum Mechanics

Posted on: 2010-02-16
Degree: Master
Type: Thesis
Country: China
Candidate: Z T Zhao
Full Text: PDF
GTID: 2178360275480536
Subject: Control theory and control engineering
Abstract/Summary:
In recent years, data mining has become one of the more active research topics in the field of information-based decision-making. As an effective data mining tool, cluster analysis is attracting extensive attention. Among the many types of data to be clustered, categorical data is common; its attribute values are finite, unordered, and not directly comparable. Because the distribution of categorical data objects is inherently unordered, only a few clustering algorithms can cluster categorical data, and these algorithms suffer, to varying degrees, from instability and sensitivity to randomness. Exploring new and more effective clustering algorithms for categorical data therefore remains an important part of clustering research. Accordingly, starting from the CQC (Categorical Quantum Clustering) algorithm and its problems, the main research work is as follows:

(1) The clustering ability of the CQC algorithm is limited because it uses the traditional Hamming dissimilarity measure, which ignores the significance of categorical values and the feature associations among the values. An MCQC (Modified Categorical Quantum Clustering) algorithm is therefore proposed by introducing the Ahmad dissimilarity measure for categorical data. The experiments use a pure categorical data set, a binary data set, and a mixed data set. Experimental results demonstrate that the proposed algorithm is effective and feasible, and that its clustering accuracy is significantly improved compared with the CQC algorithm.

(2) The clustering result of the CQC algorithm is sensitive to the cluster measure scale β, which is often chosen by experience without a general principle, and the CQC algorithm is ineffective on linearly inseparable data. An ICQC (Iterative Categorical Quantum Clustering) algorithm is therefore proposed by introducing the clustering measure scale step β_step and defining the compactness index AIAD. The experiments use a linearly separable data set and a linearly inseparable data set. Experimental results demonstrate that the proposed algorithm is better than the CQC algorithm in clustering accuracy and robustness.

(3) The CQC and ICQC algorithms cannot detect the best number of clusters automatically and accurately, and they are deficient with respect to cluster validity. A CQHC (Categorical Quantum Hierarchical Clustering) algorithm is therefore proposed by defining a cluster validity function CVF based on the compactness index AIAD and the discreteness index AIED. Experimental results demonstrate that the cluster validity function CVF is reasonable and that the proposed algorithm is effective and feasible: it not only achieves higher clustering accuracy but also accurately detects the best number of clusters.
Keywords/Search Tags: data mining, clustering analysis, categorical data, dissimilarity measure, clustering measure scale step, cluster validity
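The abstract contrasts the Hamming dissimilarity measure with a measure that accounts for associations among categorical values. The sketch below illustrates that contrast only. The function names, the toy records, and the use of total variation distance between conditional distributions are assumptions made for illustration; the thesis's exact Ahmad dissimilarity formula is not given in the abstract and may differ.

```python
from collections import Counter


def hamming_dissimilarity(a, b):
    """Count the attributes on which two categorical records differ."""
    if len(a) != len(b):
        raise ValueError("records must have the same number of attributes")
    return sum(1 for x, y in zip(a, b) if x != y)


def cooccurrence_value_distance(records, attr, x, y):
    """Illustrative distance between values x and y of attribute `attr`.

    Assumption: for every other attribute, compare the distribution of its
    values among records with attr == x against records with attr == y
    (total variation distance), then average over those attributes.
    """
    others = [j for j in range(len(records[0])) if j != attr]
    if not others:
        return 0.0 if x == y else 1.0
    total = 0.0
    for j in others:
        counts_x = Counter(r[j] for r in records if r[attr] == x)
        counts_y = Counter(r[j] for r in records if r[attr] == y)
        n_x, n_y = sum(counts_x.values()), sum(counts_y.values())
        if n_x == 0 or n_y == 0:
            raise ValueError("both values must occur in the data")
        values = set(counts_x) | set(counts_y)
        total += 0.5 * sum(
            abs(counts_x[u] / n_x - counts_y[u] / n_y) for u in values
        )
    return total / len(others)


# Hypothetical toy data: (color, shape, taste)
records = [
    ("red", "round", "sweet"),
    ("red", "round", "sweet"),
    ("pink", "round", "sweet"),
    ("green", "long", "sour"),
    ("green", "long", "sour"),
]

# Hamming treats every mismatch alike: red vs pink costs as much as red vs green.
h_red_pink = hamming_dissimilarity(records[0], records[2])
h_red_green = hamming_dissimilarity(records[0], records[3])

# The co-occurrence view separates the two cases.
d_red_pink = cooccurrence_value_distance(records, 0, "red", "pink")
d_red_green = cooccurrence_value_distance(records, 0, "red", "green")
```

On this toy data, "red" and "pink" co-occur with the same shape and taste values, so their co-occurrence distance is zero, while "red" and "green" never share a context. Hamming dissimilarity counts each color mismatch as exactly one difference and cannot distinguish the two cases on that attribute.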