Research on Dynamic Clustering and Incremental Clustering in Data Mining

Posted on: 2016-03-08
Degree: Master
Type: Thesis
Country: China
Candidate: X J Liu
GTID: 2298330467987311
Subject: Software engineering
Abstract/Summary:
Cluster analysis (CA), an important branch of data mining, has developed rapidly in recent years. Clustering divides a data set into several groups or classes, called clusters, according to a similarity criterion. Objects within a cluster are more similar to each other than to objects from different clusters. With the development of science and technology, the analysis of incremental data has attracted growing attention. Cluster analysis can quickly and efficiently find where samples gather in space, and can help people identify the dense and sparse regions in the spatial structure of groups. It is therefore important for revealing the distribution of the sample space and for forecasting the development trend of the objects in it.

The research on cluster analysis in this thesis is divided into the following parts:

First, by studying the traditional K-means dynamic clustering method, we identify its main shortcoming: the algorithm is sensitive to initialization. This thesis therefore presents an improved algorithm based on high-density data centers, which are not unique. The method first selects the high-density data, and then selects appropriate centers from the high-density data by computing the global center distance. The initial centers selected by the new algorithm are locally representative, which effectively improves the quality of the clustering.

Second, because the K-means algorithm requires the number of clusters k to be fixed in advance and cannot determine it itself, a new method based on the improved K-means algorithm and the BWP validity index is proposed. The experimental results show that this algorithm can find the right value of k.

Third, because the fuzzy C-means (FCM) clustering algorithm easily falls into local optima, this thesis proposes an improved fuzzy C-means algorithm combined with a globally searching ant colony algorithm. On this basis, incremental measures are proposed to handle dynamic incremental data. This method has a strong positive effect on both the clustering results and the efficiency for incremental data.

Finally, based on the DBSCAN algorithm, a new clustering method based on relative density is presented. For incremental data, this thesis uses the new clustering method to perform incremental clustering. The new method can handle mixed-resolution data sets and is effective for incremental data clustering.
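
The first contribution above (choosing K-means initial centers from high-density data) can be illustrated with a minimal Python sketch. This is not the thesis's algorithm: the abstract does not define its density measure or its "global center distance", so the sketch assumes a neighbor-count density within a hypothetical `radius` and substitutes a simple farthest-first rule for the distance step. The function name and all parameters are illustrative assumptions.

```python
import numpy as np

def density_based_initial_centers(X, k, radius, density_quantile=0.5):
    """Pick k initial centers from high-density points (illustrative sketch).

    Assumptions (not taken from the thesis): density is the number of
    neighbors within `radius`, and centers are spread out with a
    farthest-first rule instead of the thesis's global center distance.
    """
    X = np.asarray(X, dtype=float)
    # Pairwise Euclidean distances (O(n^2) memory; fine for small data sets).
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Density of a point = number of other points within `radius`.
    density = (dists <= radius).sum(axis=1) - 1
    # Keep only points whose density reaches the chosen quantile.
    threshold = np.quantile(density, density_quantile)
    candidates = np.flatnonzero(density >= threshold)
    if len(candidates) < k:
        raise ValueError("fewer high-density candidates than requested centers")
    # First center: the densest candidate.
    chosen = [candidates[np.argmax(density[candidates])]]
    # Each further center: the candidate farthest from all centers chosen so far.
    while len(chosen) < k:
        min_dist = dists[np.ix_(candidates, chosen)].min(axis=1)
        chosen.append(candidates[np.argmax(min_dist)])
    return X[chosen]
```

The returned array could be passed as the starting centers of any standard K-means implementation. Restricting the candidates to dense points keeps isolated outliers from being chosen as initial centers, which is one plausible reading of "locally representative" in the abstract.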
Keywords/Search Tags: Clustering analysis, K-means, ant colony clustering algorithm, incremental clustering, relative density of the clustering