
Application and Research of Clustering Methods in Data Mining Technology Based on Genetic Algorithm

Posted on: 2010-12-09  Degree: Master  Type: Thesis
Country: China  Candidate: D B Wu  Full Text: PDF
GTID: 2178360275974453  Subject: Computer application technology
Abstract/Summary:
Data mining has attracted a great deal of attention in the information industry in recent years, mainly because huge amounts of data are widely available and there is a pressing need to turn such data into useful information and knowledge. The results of knowledge discovery research can be applied to data processing in support of scientific decision-making. Cluster analysis is a basic task of data mining and a form of unsupervised learning. The goal of clustering is to partition a data set, without any prior knowledge, into clusters such that objects within a cluster are highly similar to one another but very dissimilar to objects in other clusters. By clustering, one can identify dense and sparse regions and thereby discover overall distribution patterns and interesting correlations among data attributes. Clustering is another technology used frequently in data mining, and the K-means algorithm is the most common of the clustering algorithms. In applying this algorithm, however, the user must supply the number of clusters k. Users sometimes have no idea what value this parameter should take, and different values of k produce different clusterings even though the algorithm is the same. Another disadvantage is that the algorithm needs randomly generated initial cluster centers, and different initial centers lead to different results. If the initial cluster centers are chosen improperly, the result may be only a local optimum.
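The sensitivity to initial centers described above can be illustrated with a small sketch. This is not code from the thesis: the function names `sq_dist`, `centroid`, and `kmeans`, the four sample points, and the two sets of starting centers are all assumptions chosen for illustration. The initial centers are passed in explicitly instead of being drawn at random, so the effect of a poor choice is reproducible.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def centroid(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(axis) / len(cluster) for axis in zip(*cluster))


def kmeans(points, centers, max_iter=100):
    """Plain K-means from the given initial centers.

    Returns the final centers and the sum of squared errors (SSE).
    """
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda i: sq_dist(p, centers[i]))
            clusters[nearest].append(p)
        # Update step: an empty cluster keeps its old center.
        new_centers = [centroid(c) if c else centers[i]
                       for i, c in enumerate(clusters)]
        if new_centers == centers:
            break
        centers = new_centers
    sse = sum(min(sq_dist(p, c) for c in centers) for p in points)
    return centers, sse


# Hypothetical data: two tight pairs of points, far apart horizontally.
points = [(0, 0), (0, 1), (4, 0), (4, 1)]
_, good_sse = kmeans(points, [(0, 0), (4, 0)])  # one start per pair
_, bad_sse = kmeans(points, [(2, 0), (2, 1)])   # starts between the pairs
print(good_sse, bad_sse)
```

With the first set of starting centers the run ends with an SSE of 1.0. With the second it stops immediately at a stable but poor partition with an SSE of 16.0, which is the kind of local optimum the abstract refers to.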
The main work includes: ① Introducing and analyzing clustering algorithms and the genetic algorithm. This paper introduces the basic concepts, tasks, and related mature methods of data mining, and then introduces and analyzes the genetic algorithm as well as the basic concepts and common algorithms of cluster analysis. ② Combining the genetic algorithm with the K-means algorithm. Drawing on the advantages of genetic-algorithm-based K-means clustering, the paper presents an improved genetic clustering algorithm. In line with practical use, the cluster centers are encoded as variable-length real-number chromosomes, new crossover and mutation operators are designed, and the widely used cluster validity index DB-Index is adopted as the objective function. This not only addresses the defects of K-means clustering, namely the difficulty of determining the number of clusters, the sensitivity to initial values, and the tendency to fall into local optima, but also greatly improves efficiency and accuracy compared with the previous algorithm.
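As a rough illustration of using DB-Index as an objective function, the sketch below scores a candidate set of cluster centers with the Davies-Bouldin index, where lower values indicate compact, well-separated clusters. It is not the thesis's implementation: the function name `db_index`, the sample data, and the choice to measure scatter around the candidate centers themselves (and to ignore centers that attract no points) are assumptions made here. Because the function accepts a list of centers of any length, it could serve as the fitness of a variable-length chromosome.

```python
import math


def db_index(points, centers):
    """Davies-Bouldin index of the partition induced by `centers`.

    Each point is assigned to its nearest center. Centers that attract
    no points are ignored (an assumption of this sketch). Returns
    infinity when fewer than two clusters are non-empty.
    """
    clusters = [[] for _ in centers]
    for p in points:
        nearest = min(range(len(centers)),
                      key=lambda i: math.dist(p, centers[i]))
        clusters[nearest].append(p)

    used = [(c, members) for c, members in zip(centers, clusters) if members]
    if len(used) < 2:
        return float("inf")

    # Scatter: mean distance from a cluster's members to its center.
    scatter = [sum(math.dist(p, c) for p in members) / len(members)
               for c, members in used]

    total = 0.0
    for i, (ci, _) in enumerate(used):
        # Worst-case similarity ratio against every other cluster.
        total += max((scatter[i] + scatter[j]) / math.dist(ci, cj)
                     for j, (cj, _) in enumerate(used) if j != i)
    return total / len(used)


# Hypothetical data and two candidate chromosomes (lists of centers).
points = [(0, 0), (0, 1), (4, 0), (4, 1)]
good = [(0, 0.5), (4, 0.5)]  # one center per natural group
bad = [(2, 0), (2, 1)]       # centers that split both groups
print(db_index(points, good), db_index(points, bad))
```

In a genetic algorithm of the kind described, a search that minimizes this value would prefer the first candidate over the second; the selection, crossover, and mutation steps themselves are not shown here.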
Keywords/Search Tags:Data Mining, Clustering, Genetic Algorithm, k-means