The Research Of Clustering Based On Rough Set Theory

Posted on:2008-03-15

Degree:Master

Type:Thesis

Country:China

Candidate:J B Chen

Full Text:PDF

GTID:2178360215996698

Subject:Computer application technology

Abstract/Summary:

Data mining combines machine learning, database technology, and statistical theory. It seeks interesting or valuable information in large, incomplete, noisy, imprecise, and random data. Cluster analysis is an important research problem in data mining. The goal of clustering is to partition a data set, without any prior knowledge, into clusters such that data within a cluster are similar and data in different clusters are dissimilar. This makes it quite different from data classification, and clustering is therefore also known as "unsupervised classification". As a module of a data mining system, cluster analysis can be used not only as a standalone technique for discovering how data are distributed, but also as a preprocessing step for other data mining operations. Research on improving the performance of clustering algorithms is therefore worthwhile.

Rough set theory (RST) is a mathematical tool for dealing with vagueness and uncertainty, introduced by Pawlak in the early 1980s. It is good at analyzing the facts hidden in data without requiring any additional knowledge about the data. Owing to these advantages, rough set theory has received increasing attention from researchers and has been applied in a variety of areas in recent years. In data mining, rough sets were at first used only for classification, but research on them has since expanded to many other aspects of the field.

This thesis first introduces the concepts and main methods of data mining, with a detailed analysis and comparison of the various kinds of clustering algorithms, and then gives an improved clustering algorithm based on the hierarchical method. It then studies rough set theory and develops an attribute reduction method based on algebraic operations. Because rough sets are good at dealing with incomplete and uncertain information, they are introduced into clustering to improve traditional clustering methods, and the positive effect of the improved algorithm is demonstrated by experiment. Finally, the thesis analyzes the relationship between granularity and clustering and studies the application of rough sets to clustering within the framework of granularity. A granularity-based clustering algorithm is presented and tested on two data sets from the UCI repository. The results show that, compared with the clustering algorithm that does not use the concept of granularity, the granularity-based algorithm produces more accurate clusters. This indicates that applying rough set theory to clustering within the framework of granularity can markedly improve cluster quality.
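
For readers unfamiliar with rough sets, the following minimal Python sketch illustrates the lower and upper approximations on which the theory rests. It is not the algorithm developed in the thesis; the toy table, the attribute names, and the function names are assumptions made purely for illustration.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Group object ids into indiscernibility classes over the given attributes.

    Two objects fall in the same class when they agree on every attribute
    in attrs, so the attributes alone cannot tell them apart.
    """
    blocks = defaultdict(set)
    for obj_id, row in rows.items():
        blocks[tuple(row[a] for a in attrs)].add(obj_id)
    return list(blocks.values())


def approximations(rows, attrs, target):
    """Return the (lower, upper) approximations of a target set of object ids."""
    lower, upper = set(), set()
    for block in partition(rows, attrs):
        if block <= target:   # class lies entirely inside the target
            lower |= block
        if block & target:    # class overlaps the target
            upper |= block
    return lower, upper


# Hypothetical toy table: four objects described by two attributes.
rows = {
    1: {"color": "red", "size": "small"},
    2: {"color": "red", "size": "small"},
    3: {"color": "blue", "size": "large"},
    4: {"color": "blue", "size": "small"},
}
target = {1, 3}

lower, upper = approximations(rows, ["color", "size"], target)
print(lower)  # objects that certainly belong to the target
print(upper)  # objects that possibly belong to the target
```

Objects 1 and 2 are indiscernible here, so object 1 cannot be placed in the target with certainty: the lower approximation contains only object 3, while the upper approximation contains objects 1, 2, and 3. The gap between the two is the boundary region that rough set methods reason about.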

Keywords/Search Tags:

Data Mining, Clustering, Rough Set, Attribute Reduction, Granularity

Related items

1	The Research Of Granularity Computing Based On Rough Set In Data Mining
2	Research On Subspace Clustering Based On Attribute Reduction
3	Data Mining Research Of Vehicle Sales Based On Hash Quick Attribute Reduction Algorithm
4	Research On Attribute Reduction Of Rough Set
5	Research On Approaches Of Dynamic Attribute Reduction Based On Knowledge Granularity
6	Rough Set Data Mining Approach And Its Application Relative To Decision Problem
7	Algorithm For Attribute Reduction Of Concept Lattice And Rough Set-based Cluster Analysis To Explore
8	Based On Rough Set Attribute Reduction Algorithm Of Data Mining To Improve Research
9	Research On The Attribute Reduction Algorithm Based On Rough Set In Data Mining
10	Study On Methods Of Data Mining And Text Mining Based On Rough Set