Research And Application Of Several Clustering Algorithms For Mixed Attribute Data

Posted on:2020-10-08

Degree:Master

Type:Thesis

Country:China

Candidate:L L Xie

Full Text:PDF

GTID:2428330620950956

Subject:Mathematics

Abstract/Summary:

In data mining, extracting useful information is a central research focus, and cluster analysis is one of the most important analysis methods; it is also significant for data visualization. Because of the complexity and diversity of data, clustering of mixed attribute data has become one of the hot issues in cluster analysis. Many existing clustering algorithms for mixed attribute data achieve good results, but they have three weaknesses. First, they rely heavily on the initial values and the number of clusters, so parameters must be selected manually, which may lead to poor clustering results. Second, when computing the distance between mixed attribute data objects, the data is generally split into a numerical part and a categorical part, the distance is computed separately within each part, and the two results are added together, which may lead to the loss of some information. Third, for data with complex shapes, some algorithms produce poor clustering results. To address these problems, the following research is done.

(1) To address the K-means algorithm's dependence on the initial values and the number of clusters, the ACC algorithm is used to determine both, and the K-means algorithm is adjusted accordingly. Experimental verification on UCI datasets shows that the resulting ACC-K-means algorithm has higher accuracy and better stability.

(2) To treat mixed data as a whole, this paper uses the Gower coefficient to process mixed attribute data. Because the K-prototype algorithm relies on the initial values and the number of clusters, this paper first applies the ACC algorithm and then, based on the idea of limited coverage, globally optimizes the data to achieve better clustering results. Experiments show that the improved algorithm, CBDO, gives better results than the K-means and K-prototype algorithms.

(3) To handle data with complex shapes, this paper uses a spectral clustering algorithm. Because the similarity matrix in spectral clustering is based on Euclidean distance, information about the relationships between data points is lost, so a manifold distance based on information entropy weighting is adopted instead. Experimental results show that the proposed algorithm has better clustering performance.
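As an illustration of the Gower coefficient mentioned in (2), the following is a minimal sketch of a pairwise Gower distance for mixed data. It is not the thesis's implementation: the function name `gower_distance`, the equal weighting of all attributes, and the assumption of no missing values are choices made here for the example only. Numerical attributes contribute a range-normalized absolute difference, and categorical attributes contribute 0 for a match and 1 for a mismatch.

```python
import numpy as np


def gower_distance(num, cat):
    """Pairwise Gower distance for mixed attribute data (illustrative sketch).

    num: (n, p) array-like of numerical attributes
    cat: (n, q) array-like of categorical attributes
    Assumes equal attribute weights and no missing values.
    """
    num = np.asarray(num, dtype=float)
    cat = np.asarray(cat)

    # Range of each numerical attribute; a constant column gets range 1
    # so that it contributes a distance of 0 instead of dividing by zero.
    ranges = num.max(axis=0) - num.min(axis=0)
    ranges[ranges == 0] = 1.0

    # Numerical part: |x_ik - x_jk| / range_k, shape (n, n, p).
    num_part = np.abs(num[:, None, :] - num[None, :, :]) / ranges

    # Categorical part: 0 if equal, 1 if different, shape (n, n, q).
    cat_part = (cat[:, None, :] != cat[None, :, :]).astype(float)

    # Average the per-attribute dissimilarities over all p + q attributes.
    total = num_part.sum(axis=2) + cat_part.sum(axis=2)
    return total / (num.shape[1] + cat.shape[1])
```

Because every attribute contributes a value in [0, 1] to a single average, the numerical and categorical parts are handled in one measure rather than computed separately and added. The resulting matrix could then be passed to any clustering method that accepts precomputed distances.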

Keywords/Search Tags:

Cluster Analysis, Mixed attribute data processing, ACC algorithm, Spectral theory, Manifold distance, Spectral clustering

Related items

1	Research On Spectral Clustering Algorithm And Its Application
2	Research And Application Of Spectral Clustering Algorithm Based On Manifold Distance Kernel
3	Study On The Spectral Clustering Algorithm Based On Mixed Data
4	Improved Spectral Clustering Algorithm And Its Application Research
5	Research On Clustering Algorithm Based On Manifold Distance And Its Application In Aurora Classification
6	Research On Multiway P-spectral Clustering Algorithm Based On Self-adaptive Neighborhood
7	Spectral Learning And Clustering And Its Application
8	Research Of Hybrid Manifold Clustering Algorithms Based On Spectral Clustering
9	Design And Implementation Of Spectral Clustering Algorithm For Large Scale Data
10	Research On Improved Spectral Clustering Algorithm And Its Application In Logistics Distribution