Research On Cluster Validity Indices For Categorical Data Clustering

Posted on:2014-12-26

Degree:Master

Type:Thesis

Country:China

Candidate:M Zhang

Full Text:PDF

GTID:2268330401462540

Subject:Computer application technology

Abstract/Summary:

Clustering is an unsupervised machine-learning technique. It partitions unordered data into a series of clusters whose members are similar to one another, which makes subsequent data processing more convenient. Clustering has been widely used in bioinformatics, psychology research, business analysis, and text processing. Although clustering is a mature technique, it still has many problems that need to be solved.

Cluster validity is a critical step in cluster analysis and an important subject in machine learning. With it we can determine the clustering tendency of a dataset and its number of clusters. Clustering algorithms and data types are so varied that no single index can handle all of them. We therefore need to know most of the existing indices, so that we can either select suitable ones or propose new ones for the problem at hand.

Cluster validity indices are mostly designed for data with numerical attributes, but categorical data are so common in practice that some of the traditional indices cannot be applied to them. We have therefore modified some of the indices to suit different kinds of problems.

We tested three different indices on four categorical-attribute datasets and analyzed the results, which support our conclusions and, on the whole, meet our needs.
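The abstract does not name the three indices or say how they were modified, so the following is only an illustrative sketch of the general idea, not the thesis's method. It assumes one common adaptation: taking the silhouette index, which is usually computed with Euclidean distance, and replacing the distance with the simple matching (Hamming) dissimilarity so that it works on categorical records. The function names, the toy records, and the two labelings are all hypothetical.

```python
from collections import defaultdict


def hamming(a, b):
    """Simple matching dissimilarity: fraction of attributes on which a and b differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def silhouette(records, labels):
    """Mean silhouette width over all records, using the Hamming dissimilarity.

    Assumption: singleton clusters contribute a silhouette of 0.
    """
    clusters = defaultdict(list)
    for i, lab in enumerate(labels):
        clusters[lab].append(i)
    if len(clusters) < 2:
        raise ValueError("the silhouette index needs at least two clusters")

    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # singleton cluster: contributes 0
        # a: mean dissimilarity to the other members of the record's own cluster
        a = sum(hamming(records[i], records[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean dissimilarity to the members of any other cluster
        b = min(
            sum(hamming(records[i], records[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(labels)


# Hypothetical toy data: six records with three categorical attributes each.
records = [
    ("red", "small", "round"),
    ("red", "small", "oval"),
    ("red", "small", "round"),
    ("blue", "large", "square"),
    ("blue", "large", "square"),
    ("blue", "medium", "square"),
]
good = [0, 0, 0, 1, 1, 1]  # groups similar records together
bad = [0, 1, 0, 1, 0, 1]   # mixes the two groups

print(silhouette(records, good), silhouette(records, bad))
```

Under these assumptions, a labeling that groups similar records together scores higher than one that mixes them. Evaluating the same index over clusterings with different numbers of clusters is one way to choose the number of clusters.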

Keywords/Search Tags:

cluster validity, categorical attribute data, data mining

Related items

1	Research On Categorical Data Clustering Algorithms
2	The Research On Clustering Algorithm For Categorical Data Using Quantum Mechanics
3	Research On The Validity Of Clustering Analysis Methods Based On High-dimensional Data
4	Research On Clustering Based On Attribute Characteristics For Categorical And Binary Data
5	The Study Of Clustering Data With Categorical Attributes In Data Mining
6	Research On Cluster Boundary Detecting Technology For Categorical Data
7	Research Of Clustering Algorithms For Categorical Data
8	Novel Fuzzy Clustering Algorithm Based On Nature Inspired Computation
9	The Research Of Ant-Based Clustering Algorithm For Data Sets With Mixed Attribute
10	The Research On Clustering Algorithm For Mixed Numeric And Categorical Values Based Partitioning Methods