
Incremental Data Mining Algorithm Research Based On Fuzzy Clustering

Posted on: 2005-08-30
Degree: Master
Type: Thesis
Country: China
Candidate: G Y Ni
GTID: 2168360152469233
Subject: Computer application technology
Abstract/Summary:
Cluster analysis is an active research topic in data mining. The fuzzy clustering algorithms studied most extensively at present are suitable only for static data sets. For a dynamic data set, earlier clustering results become unreliable once new data arrive, yet re-clustering all of the data with these algorithms reduces efficiency and wastes computing resources. To address these problems, this thesis proposes an incremental clustering algorithm based on fuzzy similarity degree. Two further incremental clustering algorithms, based on C-means and on grids, are also studied.

The incremental clustering algorithm based on fuzzy similarity degree examines the incremental data one item at a time, calculates the similarity between each new item and the previously clustered data, and compares it with a given threshold. In this way the algorithm obtains the same clustering results as clustering the whole data set, while avoiding the transitive closure calculation of the fuzzy similarity matrix. The algorithm is proved equivalent to commonly used algorithms, such as the maximum spanning tree and transitive closure methods, by way of the construction process of the maximum spanning tree in the maximum spanning tree algorithm. Related experiments also confirm this equivalence. Compared with the traditional algorithms, the incremental clustering algorithm is more efficient. The two other incremental algorithms are also studied and implemented: one is based on the C-means clustering algorithm and incorporates prior knowledge, and the other is based on the grid algorithm and quantizes the data space.

As data grow constantly in knowledge discovery in large databases, incremental clustering techniques not only make the best use of earlier clustering results and improve the efficiency of cluster analysis, but also greatly reduce the cost of maintaining the knowledge base.
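The threshold-based incremental step described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not give the similarity measure or the data structures, so the `similarity` function (one over one plus Euclidean distance), the class name `IncrementalFuzzyClusterer`, and the union-find bookkeeping are all assumptions made for illustration. Each new point is compared with the previously clustered points and merged with every cluster that contains a point whose similarity reaches the threshold. The resulting groups are the connected components of the thresholded similarity graph, which is what a transitive-closure cut at the same threshold would produce.

```python
import math


def similarity(x, y):
    """Illustrative fuzzy similarity in (0, 1]; an assumption, not the thesis's measure."""
    return 1.0 / (1.0 + math.dist(x, y))


class IncrementalFuzzyClusterer:
    """Hypothetical helper: clusters points one at a time against a similarity threshold."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.points = []   # all points seen so far
        self.parent = []   # union-find forest over point indices

    def _find(self, i):
        # Follow parent links to the root, halving the path on the way.
        while self.parent[i] != i:
            self.parent[i] = self.parent[self.parent[i]]
            i = self.parent[i]
        return i

    def add(self, point):
        """Insert one new point and merge it with every sufficiently similar cluster."""
        idx = len(self.points)
        self.points.append(tuple(point))
        self.parent.append(idx)
        for j in range(idx):
            if similarity(self.points[j], self.points[idx]) >= self.threshold:
                self.parent[self._find(j)] = self._find(idx)
        return idx

    def clusters(self):
        """Return the clusters as sorted lists of point indices."""
        groups = {}
        for i in range(len(self.points)):
            groups.setdefault(self._find(i), []).append(i)
        return sorted(groups.values())


clusterer = IncrementalFuzzyClusterer(threshold=0.6)
for p in [(0, 0), (0.5, 0), (5, 5), (5.4, 5), (1.0, 0)]:
    clusterer.add(p)
print(clusterer.clusters())  # [[0, 1, 4], [2, 3]]
```

In this sketch each insertion costs one similarity computation per existing point, and nothing already clustered is recomputed. The last point, `(1.0, 0)`, joins the first cluster through `(0.5, 0)` even though its similarity to `(0, 0)` is below the threshold, which shows the chaining behaviour of a transitive-closure clustering.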
Keywords/Search Tags: Clustering, Incremental algorithm, Fuzzy set, C-means, Grid