Research On Noise And High Dimensional Problems In Clustering

Posted on: 2007-07-02, Degree: Master, Type: Thesis
Country: China, Candidate: T Zhou, Full Text: PDF
GTID: 2178360185995925, Subject: Computer application technology
Abstract/Summary:
In recent years, clustering has attracted increasing attention as one of the important tools of data mining, and many clustering algorithms have been applied successfully in a variety of fields. However, most clustering algorithms are effective only on low-dimensional datasets and do not work well on the high-dimensional datasets that are increasingly common in many areas. How to cluster high-dimensional data efficiently remains a difficult and actively studied problem. One difficulty of clustering high-dimensional data is high time complexity, which makes some algorithms impractical to run. Another is high sensitivity to noise, which makes most traditional clustering algorithms, such as k-means and HC, ineffective on high-dimensional datasets. It is therefore necessary, and of considerable interest, to develop a fast and robust clustering algorithm for high-dimensional datasets. Projected clustering algorithms form a large category of clustering algorithms designed for high-dimensional data. Many experimental and theoretical results show that they are much more effective on such data than traditional clustering algorithms. To overcome the difficulties described above, this thesis presents a fast and robust projected clustering algorithm. The algorithm first uses an association rule method to obtain the relevant dimensions of each cluster, and then uses these relevant dimensions to find the clusters themselves. The main advantages of the proposed algorithm are as follows:
1. It is fast and effective.
2. It is robust.
3. It determines the number of clusters automatically.
Our simulation experiments demonstrate these advantages.
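The abstract gives only the two-stage outline (association rules to find relevant dimensions, then clustering on those dimensions), not the details. The following Python sketch is one possible reading of that outline, not the thesis's actual algorithm. It assumes values are scaled to [0, 1], turns each point into a set of (dimension, bin) items, mines frequent itemsets with an Apriori-style search, and treats each maximal frequent itemset as one cluster whose relevant dimensions are the dimensions in the itemset. The function names and the parameters `n_bins` and `min_support` are hypothetical.

```python
from collections import Counter


def discretize(points, n_bins, lo=0.0, hi=1.0):
    """Turn each point into a set of (dimension, bin) items.

    Assumption: every coordinate lies in [lo, hi].
    """
    width = (hi - lo) / n_bins
    return [
        frozenset((d, min(int((x - lo) / width), n_bins - 1))
                  for d, x in enumerate(p))
        for p in points
    ]


def frequent_itemsets(rows, min_support):
    """Level-wise (Apriori-style) search for itemsets in >= min_support rows."""
    counts = Counter(item for row in rows for item in row)
    current = {frozenset([i]) for i, c in counts.items() if c >= min_support}
    found = set(current)
    while current:
        # Join itemsets of size k that differ in one item to get size k + 1.
        candidates = {a | b for a in current for b in current
                      if len(a | b) == len(a) + 1}
        current = {c for c in candidates
                   if sum(1 for r in rows if c <= r) >= min_support}
        found |= current
    return found


def projected_clusters(points, n_bins=4, min_support=3):
    """Return a sorted list of (relevant_dimensions, member_indices) pairs."""
    rows = discretize(points, n_bins)
    frequent = frequent_itemsets(rows, min_support)
    # Keep only itemsets that are not contained in a larger frequent itemset.
    maximal = [s for s in frequent if not any(s < t for t in frequent)]
    clusters = []
    for s in maximal:
        dims = sorted(d for d, _ in s)
        members = [i for i, r in enumerate(rows) if s <= r]
        clusters.append((dims, members))
    return sorted(clusters)


# Made-up toy data: points 0-2 agree on dimensions 0 and 1, points 3-5 agree
# on dimensions 1 and 2, and point 6 is noise.
points = [
    (0.10, 0.12, 0.05), (0.12, 0.10, 0.40), (0.11, 0.13, 0.60),
    (0.30, 0.90, 0.88), (0.55, 0.88, 0.90), (0.80, 0.91, 0.89),
    (0.60, 0.45, 0.30),
]
for dims, members in projected_clusters(points):
    print("relevant dimensions:", dims, "members:", members)
```

In this sketch the number of clusters is not supplied by the caller; it falls out of the number of maximal frequent itemsets. Points that belong to no frequent itemset, such as the last toy point, are left unassigned, which is one simple way to treat noise.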
Keywords/Search Tags:data mining, clustering analysis, high-dimensional data, projected cluster, association rule, relevant dimension