Design And Implementation Of Incremental Clustering Algorithm

Posted on: 2010-03-29
Degree: Master
Type: Thesis
Country: China
Candidate: S L Wang
Full Text: PDF
GTID: 2178360302467886
Subject: Control theory and control engineering
Abstract/Summary:
With the rapid development of network and database technology, the amount of information in many fields is growing quickly and data sets are becoming ever larger. In addition, the increasing demand for real-time processing has led to the emergence of massive dynamic data. Most traditional clustering algorithms consume large amounts of time and space and are weak in effectiveness and scalability. Incremental clustering algorithms can address these problems effectively.

To achieve dynamic, incremental clustering, this thesis first analyzes the Clustering Algorithm Based on Density and Density-reachable (CADD) and then improves it in three respects:
1. Flags are set on density-reachable members to improve efficiency.
2. The method for calculating radius and density is improved to avoid double counting.
3. Visualization is implemented so that clustering results can be assessed effectively.
Experimental results show that the improved CADD algorithm reduces complexity.

Based on the improved CADD algorithm, the thesis focuses on the following two aspects:
(1) It proposes the Incremental Clustering Algorithm Based on Density and Density-reachable (ICADD), designed according to the characteristics of the CADD algorithm. ICADD works in non-batch mode and is therefore less efficient.
(2) It proposes the Incremental Clustering Algorithm Based on Subcluster Feature (ICSCF), which builds on the notion of the clustering feature in BIRCH. ICSCF introduces a subcluster similarity criterion that comprises spatial location similarity and spatial distribution similarity and provides the basis for judging whether subclusters are similar. In addition, the algorithm uses a sampling technique when calculating density.

Theoretical analysis and experimental results demonstrate that ICSCF achieves higher clustering efficiency because it works in batch mode. It can also handle large databases through partitioning and has good scalability. It plays an important role in spatial clustering applications such as image processing.
Keywords/Search Tags:Clustering algorithm, Dynamic and incremental clustering, Subcluster feature, Subcluster similarity criterion, Spatial clustering
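
Illustrative sketch (not taken from the thesis): the abstract says ICSCF builds on the BIRCH clustering feature and judges subclusters by spatial location similarity and spatial distribution similarity, but it does not give the formulas. The Python sketch below shows only the general idea under stated assumptions. The additive summary (point count, linear sum, squared sum) is the standard BIRCH clustering feature. The names SubclusterFeature, similar and absorb_batch, the centroid-distance test for location, the radius-ratio test for distribution, and both thresholds are hypothetical choices made for this example.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class SubclusterFeature:
    """BIRCH-style summary of a subcluster: (N, linear sum, squared sum)."""
    n: int
    ls: List[float]  # per-dimension sum of the points
    ss: float        # sum of squared norms of the points

    @classmethod
    def from_points(cls, points: Sequence[Sequence[float]]) -> "SubclusterFeature":
        dim = len(points[0])
        ls = [sum(p[d] for p in points) for d in range(dim)]
        ss = sum(x * x for p in points for x in p)
        return cls(len(points), ls, ss)

    def centroid(self) -> List[float]:
        return [s / self.n for s in self.ls]

    def radius(self) -> float:
        # Root-mean-square distance of the members from the centroid,
        # computed from the summary alone; clamped at 0 against rounding.
        c2 = sum(c * c for c in self.centroid())
        return math.sqrt(max(self.ss / self.n - c2, 0.0))

    def merge(self, other: "SubclusterFeature") -> "SubclusterFeature":
        # Clustering features are additive, so merging never revisits points.
        return SubclusterFeature(
            self.n + other.n,
            [a + b for a, b in zip(self.ls, other.ls)],
            self.ss + other.ss,
        )


def similar(a: SubclusterFeature, b: SubclusterFeature,
            max_centroid_dist: float, max_radius_ratio: float) -> bool:
    """Hypothetical two-part criterion (an assumption, not the thesis's formula)."""
    # Location similarity: centroids lie close together.
    close = math.dist(a.centroid(), b.centroid()) <= max_centroid_dist
    # Distribution similarity: the two radii are of comparable size.
    lo, hi = sorted((a.radius(), b.radius()))
    comparable = hi <= max_radius_ratio * lo
    return close and comparable


def absorb_batch(features: List[SubclusterFeature],
                 batch: Sequence[Sequence[float]],
                 max_centroid_dist: float,
                 max_radius_ratio: float) -> None:
    """Summarize a whole batch once, then merge it or keep it as a new subcluster."""
    new = SubclusterFeature.from_points(batch)
    for i, old in enumerate(features):
        if similar(old, new, max_centroid_dist, max_radius_ratio):
            features[i] = old.merge(new)
            return
    features.append(new)
```

For example, calling absorb_batch on a list that already holds the summary of (0, 0) and (2, 0) with the batch (1, 1), (1, -1) merges the two summaries, because both have centroid (1, 0) and radius 1. A batch far from every existing centroid is appended as a new subcluster. Handling each batch as one summary rather than point by point is the kind of saving the abstract attributes to batch mode.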