An Incremental Grid Clustering Algorithm Based On Density-dimension-tree

Posted on: 2015-03-02
Degree: Master
Type: Thesis
Country: China
Candidate: J L Huang
GTID: 2298330431995855
Subject: Computer software and theory
Abstract/Summary:
Because the data in a stream arrive quickly, continuously, and in large volumes, most traditional clustering algorithms become inefficient in this setting. One of the difficult problems in data mining is how to analyze a real-time data stream with limited storage space and extract valuable knowledge and information more accurately and more effectively. Research on clustering algorithms adapted to the characteristics of data streams therefore has great practical significance.

By analyzing the advantages and disadvantages of traditional clustering algorithms and data stream clustering algorithms, we propose a new approach that improves the existing algorithm PDStream. The new approach is an incremental grid clustering algorithm based on a density-dimension-tree (IGDDT). The algorithm uses a snapshot mode policy to determine the next time at which to cluster and save snapshots, and it reuses the previous clustering results to update the new results efficiently. Furthermore, with a further grid partition strategy, the algorithm describes clusters more accurately and obtains higher clustering quality. Experimental results on both artificial and real datasets demonstrate that IGDDT not only discovers clusters of arbitrary shape, but also outperforms traditional grid-based clustering algorithms in both clustering accuracy and clustering efficiency.
Keywords/Search Tags: Incremental clustering, Grid, Density-dimension Tree, Data Stream, Quality analysis
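
The abstract does not give the internals of IGDDT, so the following is only a minimal sketch of the general idea of incremental grid-based density clustering, not the thesis's algorithm. It assumes a fixed cell size and a fixed density threshold, and it omits the density-dimension-tree, the snapshot policy, and the further grid partition step. The class name `GridClusterer` and its parameters `cell_size` and `density_threshold` are hypothetical names chosen for illustration. Each arriving point only increments a cell counter, so earlier work is kept between batches; clusters are then read off as groups of adjacent dense cells.

```python
from collections import defaultdict
from itertools import product


class GridClusterer:
    """Toy incremental grid clustering (illustrative only, not IGDDT)."""

    def __init__(self, cell_size, density_threshold):
        self.cell_size = cell_size                  # assumed fixed cell width
        self.density_threshold = density_threshold  # min points for a dense cell
        self.counts = defaultdict(int)              # cell coordinates -> point count

    def _cell(self, point):
        # Map a point to the integer coordinates of the grid cell containing it.
        return tuple(int(x // self.cell_size) for x in point)

    def insert(self, point):
        # Incremental update: a new point touches exactly one counter.
        self.counts[self._cell(point)] += 1

    def clusters(self):
        # A cluster is a connected group of dense cells (neighbors may touch
        # at faces, edges, or corners).
        dense = {c for c, n in self.counts.items() if n >= self.density_threshold}
        seen, result = set(), []
        for start in dense:
            if start in seen:
                continue
            seen.add(start)
            stack, component = [start], set()
            while stack:
                cell = stack.pop()
                component.add(cell)
                for offset in product((-1, 0, 1), repeat=len(cell)):
                    neighbor = tuple(c + o for c, o in zip(cell, offset))
                    if neighbor in dense and neighbor not in seen:
                        seen.add(neighbor)
                        stack.append(neighbor)
            result.append(component)
        return result


stream = [(0.1, 0.1), (0.2, 0.3), (1.1, 0.2), (1.5, 0.5),
          (5.0, 5.0), (5.2, 5.1), (9.0, 9.0)]
clusterer = GridClusterer(cell_size=1.0, density_threshold=2)
for p in stream:
    clusterer.insert(p)
print(len(clusterer.clusters()))  # 2
```

In this sketch the two adjacent dense cells near the origin merge into one cluster, the cell around (5, 5) forms a second, and the single point at (9, 9) stays below the threshold until more points arrive in its cell. Because clusters are unions of cells rather than spheres around centroids, they can take arbitrary shapes, which is the property the abstract highlights for grid-based methods.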