Research In Data Mining Based On Genetic Algorithms

Posted on: 2013-06-14
Degree: Master
Type: Thesis
Country: China
Candidate: Y Wang
GTID: 2248330371986081
Subject: Computer application technology
Abstract/Summary:
With the spread of computers and networks, it has become easy to obtain the information we care about, and data in many forms is growing rapidly in many fields. Simply querying or retrieving data from a database, however, does not usually yield summarized, inductive results. Making critical decisions from vast amounts of data, identifying the structural relationships within the data, and managing and operating on the data properly all involve data mining.

The purpose of data mining is to extract potentially useful information and knowledge from large databases. If the entire database is viewed as a large search space, then the mining algorithm is a search strategy. Because the search space is generally very large, the search strategy must be as efficient as possible.

Cluster analysis is one of the important tasks of data mining and one of its main research areas, and it is of great importance for identifying the internal relations in data. Cluster analysis mainly studies how to divide objects into several categories without any training data. When conventional cluster analysis methods are applied to large databases, the workload is huge and the optimality of the clustering results cannot be guaranteed. The genetic algorithm is a method that searches for an optimal solution by simulating the natural evolutionary process, and its remarkable feature is its effective use of global information. A small number of candidate results can reflect a large region of the search space, which facilitates real-time processing, and the algorithm is robust against falling into local optima. This paper therefore attempts to use genetic algorithms to solve two problems in cluster analysis: dynamically determining the number of clusters, and global optimization.

The k-means algorithm is a classical algorithm in cluster analysis, but it is a local search technique that, depending on the initial cluster centers, may converge prematurely to a local optimum. The genetic algorithm has good global optimization ability, so combining the genetic algorithm with the k-means algorithm can solve this problem well. However, when the traditional genetic algorithm is used for global optimization of k-means, premature convergence occurs easily, the result may converge to a non-global optimum, and search efficiency in the late stage of evolution is low. We therefore introduce the principles of self-adaptation and immunity to optimize the traditional genetic algorithm, which solves these problems and makes the algorithm more efficient.
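
The combination of a genetic algorithm with k-means described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not specify the self-adaptive or immune operators, so none are reproduced here, and the number of clusters k is fixed rather than determined dynamically. The function names (`ga_kmeans`, `kmeans_step`, `sse`), the truncation selection, the one-point crossover, and all parameter values are assumptions chosen for illustration. Each chromosome is a list of k cluster centers, its fitness is the sum of squared errors (lower is better), and one k-means update per generation acts as local refinement.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def sse(points, centers):
    """Sum of squared distances from each point to its nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)

def kmeans_step(points, centers):
    """One k-means update: assign points, then move centers to cluster means."""
    labels = [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
              for p in points]
    new_centers = []
    for j, c in enumerate(centers):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if members:
            new_centers.append(tuple(sum(m[d] for m in members) / len(members)
                                     for d in range(len(c))))
        else:
            new_centers.append(c)  # an empty cluster keeps its old center
    return new_centers

def ga_kmeans(points, k, pop_size=20, generations=30, mutation_rate=0.1, seed=0):
    """Hypothetical GA + k-means hybrid with a fixed number of clusters k."""
    rng = random.Random(seed)
    # Each chromosome is a list of k centers, seeded from random data points.
    population = [rng.sample(points, k) for _ in range(pop_size)]
    best = min(population, key=lambda c: sse(points, c))
    for _ in range(generations):
        # Local refinement: one k-means step per chromosome.
        population = [kmeans_step(points, c) for c in population]
        population.sort(key=lambda c: sse(points, c))
        if sse(points, population[0]) < sse(points, best):
            best = population[0]
        parents = population[:pop_size // 2]  # truncation selection (assumed)
        children = [best]                     # elitism: keep the best so far
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, k - 1) if k > 1 else 0
            child = list(a[:cut]) + list(b[cut:])  # one-point crossover
            if rng.random() < mutation_rate:
                # Mutation: replace one center with a random data point.
                child[rng.randrange(k)] = rng.choice(points)
            children.append(child)
        population = children
    return best, sse(points, best)

if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (1, 1),
            (10, 10), (10, 11), (11, 10), (11, 11)]
    centers, error = ga_kmeans(data, k=2)
    print(centers, error)
```

Because the population holds many different sets of initial centers, a single poor initialization is less likely to determine the final result than in plain k-means. A fixed `mutation_rate` is used here; an adaptive variant would adjust it during the run.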
Keywords/Search Tags: Data Mining, Cluster, Genetic Algorithm, K-means