Research Of Data Mining Algorithm Based On Clustering And Kernel Method

Posted on: 2013-04-13
Degree: Master
Type: Thesis
Country: China
Candidate: X Hang
GTID: 2248330377455227
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
With the widespread use of information systems, large amounts of data are generated, and making full use of this data to discover useful knowledge has become a major concern. Clustering is an important data mining method. Since kernel methods were introduced, they have been applied to clustering algorithms, and kernel-based clustering has become an important new data mining approach.

This thesis studies two classical clustering algorithms: the k-means algorithm and the fuzzy c-means (FCM) algorithm. First, we analyze the k-means algorithm and propose an improved version. Second, building on the FCM algorithm and kernel method theory, we propose two new kernel-based clustering algorithms. The main work is as follows:

1. An improved k-means clustering algorithm based on optimizing the initial centroids and the iterations is proposed. The initial centroids are not chosen at random but are selected using a minimum-distance method. When the cluster centers are adjusted, each center is set not to the average of the cluster but to the data point nearest to that average, which reduces the influence of isolated points. Finally, optimizing the iterations improves the algorithm's computational efficiency.

2. A new fuzzy c-means algorithm based on a hybrid kernel function is proposed. We construct a hybrid kernel function with strong extrapolation and interpolation capabilities and use it in the FCM algorithm. Experimental results show that it clusters well when the sample boundaries are linearly inseparable or the data distribution is non-spherical.

3. Because the possibilistic c-means (PCM) algorithm has good noise immunity, we combine it with the hybrid kernel function and propose a new PCM algorithm based on the hybrid kernel function. The algorithm overcomes the effect of noisy data on the clustering of non-spherical data. Experimental results show that the proposed algorithm clusters non-spherical data with noise well.
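
The center-adjustment rule of the improved k-means algorithm (take the data point nearest to the cluster average instead of the average itself) can be illustrated with a minimal Python sketch. This is not the thesis implementation: the function names `nearest_point_to_mean` and `update_centers`, the use of Euclidean distance, and the handling of empty clusters are all assumptions made for illustration, and the minimum-distance initialization and the iteration optimization are not shown.

```python
import math


def nearest_point_to_mean(points):
    """Return the member of `points` that lies closest to their mean.

    Assumes Euclidean distance (the abstract does not name the metric).
    """
    dim = len(points[0])
    mean = [sum(p[d] for p in points) / len(points) for d in range(dim)]
    return min(points, key=lambda p: math.dist(p, mean))


def update_centers(data, labels, old_centers):
    """One center-update step: each center becomes an actual data point.

    `labels[i]` is the cluster index of `data[i]`. An empty cluster keeps
    its previous center (an assumption, not stated in the abstract).
    """
    new_centers = []
    for j, old in enumerate(old_centers):
        members = [p for p, lab in zip(data, labels) if lab == j]
        new_centers.append(nearest_point_to_mean(members) if members else old)
    return new_centers


if __name__ == "__main__":
    # (10, 10) is an isolated point assigned to cluster 0. It drags the
    # plain average away, but the updated center stays on a real sample.
    data = [(0, 0), (1, 1), (2, 0), (10, 10), (20, 20), (21, 21), (22, 20)]
    labels = [0, 0, 0, 0, 1, 1, 1]
    print(update_centers(data, labels, [(0, 0), (20, 20)]))
```

Because every center is a member of the data set, a single isolated point can shift the chosen center only as far as the nearest real sample, which is one way to read the abstract's claim about reducing the impact of isolated points.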
Keywords/Search Tags: data mining, clustering algorithm, kernel method, k-means, fuzzy c-means