Research On Clustering Algorithm Based On Genetic Algorithm And Rough Set Theory

Posted on:2012-06-13

Degree:Master

Type:Thesis

Country:China

Candidate:L L Hong

GTID:2218330368487438

Subject:Communication and Information System

Abstract/Summary:

With the rapid development of computer and database technology, large amounts of data have been produced in many fields. Much important information is hidden in these data, and people want to analyze them to extract useful knowledge; data mining was proposed to meet this need. Data mining is one of the frontier areas of database and information decision research. Cluster analysis is an important branch of data mining, and its basic purpose is to discover the natural grouping of data by analyzing the similarity between data objects. This paper discusses a clustering algorithm and its incremental variant, both based on the genetic algorithm and rough set theory, and then discusses a clustering algorithm for categorical data. The main research of this paper is as follows:

1. It analyzes the advantages and disadvantages of the existing rough K-means clustering algorithm and, drawing on the evolutionary search of the genetic algorithm and the maximum-minimum distance algorithm, proposes an optimized rough K-means method. The algorithm determines the initial centers dynamically rather than randomly, and it handles boundary objects well. Experimental results show the effectiveness and correctness of the algorithm.

2. It analyzes the advantages and disadvantages of the existing non-incremental rough clustering algorithms and, based on incremental processing and the idea of neighbors, proposes an incremental clustering method. Experiments show that the algorithm makes full use of previous mining results, which improves both the utilization of existing information and clustering efficiency, and that it can handle large data sets in dynamic environments.

3. It proposes an efficient clustering method for categorical data. The method extends the K-means algorithm to the categorical data domain, overcoming the limitation that the traditional K-means algorithm can only handle numerical data. It defines a new distance metric that uses the distribution information associated with each value of each categorical attribute, combined with the vertical and horizontal distribution of the data, to measure the difference between a data object and a class. Experiments show that this method can find the intrinsic relationships among different values of the same attribute and can effectively measure the difference between objects.
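To make the first contribution concrete, the following is a minimal Python sketch of two ingredients the abstract names: non-random initial centers chosen by the maximum-minimum distance heuristic, and a rough-set style assignment that separates certain members from boundary objects. It is not the thesis's algorithm. The genetic-evolution step is omitted, and the function names `max_min_centers` and `rough_assign`, the choice of the first center (the point farthest from the data mean), and the `ratio` threshold for boundary objects are all assumptions made for illustration.

```python
import numpy as np

def max_min_centers(X, k):
    """Pick k initial centers with the maximum-minimum distance heuristic."""
    X = np.asarray(X, dtype=float)
    # Assumption: the first center is the point farthest from the data mean,
    # which keeps the initialization deterministic.
    first = int(np.argmax(np.linalg.norm(X - X.mean(axis=0), axis=1)))
    chosen = [first]
    # min_dist[i] is the distance from point i to its nearest chosen center.
    min_dist = np.linalg.norm(X - X[first], axis=1)
    while len(chosen) < k:
        nxt = int(np.argmax(min_dist))  # point farthest from all centers so far
        chosen.append(nxt)
        min_dist = np.minimum(min_dist, np.linalg.norm(X - X[nxt], axis=1))
    return X[chosen]

def rough_assign(X, centers, ratio=1.3):
    """Split point indices into lower approximations and boundary regions.

    Assumption: a point is a boundary object when some other center is
    within `ratio` times the distance to its nearest center.
    """
    X = np.asarray(X, dtype=float)
    centers = np.asarray(centers, dtype=float)
    dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    lower = [[] for _ in range(len(centers))]
    boundary = [[] for _ in range(len(centers))]
    for i, row in enumerate(dist):
        nearest = int(np.argmin(row))
        close = [j for j in range(len(centers))
                 if j != nearest and row[j] <= ratio * row[nearest]]
        if close:
            # Ambiguous point: it joins the boundary of every nearby cluster.
            for j in [nearest] + close:
                boundary[j].append(i)
        else:
            # Unambiguous point: it belongs to exactly one lower approximation.
            lower[nearest].append(i)
    return lower, boundary

points = [[0, 0], [1, 0], [10, 0], [12, 0], [6, 0]]
centers = max_min_centers(points, k=2)
lower, boundary = rough_assign(points, centers)
```

In this toy data the point `[6, 0]` is equidistant from both chosen centers, so it lands in the boundary region of both clusters instead of being forced into one, which is the behavior a rough K-means variant relies on when it later updates the centers.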

Keywords/Search Tags:

data mining, K-means clustering algorithm, genetic algorithm, rough set, incremental, categorical data

Related items

1	The Research On Clustering Algorithm For Categorical Data Based-on Rough Set
2	Based On Rough Set Data Mining Method
3	Research On Application Of Rough Set Theory In Data Mining
4	Research Of Clustering Method In Data Mining Based On Genetic Algorithm
5	Research On Dynamic Clustering And Incremental In Data Mining
6	Research And Application Of K-means Algorithm In Data Mining Technology Based On Genetic Algorithm
7	Studies On Clustering Algorithms For Categorical Data
8	Optimized K-Means Clustering Analysis Based On Genetic Algorithm
9	Research Of K-Means Clustering In Data Mining Based On Genetic Algorithm
10	The Research On Clustering Algorithm For Categorical Data Using Quantum Mechanics