An Improved Fast Density-based Clustering Algorithm For Mixed Data

Posted on:2019-05-27

Degree:Master

Type:Thesis

Country:China

Candidate:H Liang

GTID:2348330569989335

Subject:Applied statistics

Abstract/Summary:

The fast density-based clustering algorithm proposed by Rodriguez and Laio in 2014 (Algorithm RL) is widely used because it recognizes clusters regardless of their shape and lets the number of clusters be determined intuitively. For datasets that mix continuous and discrete variables, the distance measure between two data points is more complicated, and few researchers have devoted fast clustering algorithms to mixed data. Meanwhile, most real-life datasets are mixed, so we propose an improved fast density-based clustering algorithm for mixed data (Algorithm 2). This method is an improved implementation of "Clustering by fast search and find of density peaks" (Algorithm RL) for mixed data. Algorithm 2 defines a distance metric for mixed datasets and selects the possible cluster centers by self-selection (Algorithm 1); each remaining point is then assigned to the same cluster as its nearest neighbor of higher density. Because the cost and time of computing the distance measure grow quadratically as the amount of data becomes large, Algorithm 2 is also proposed and explored with a sliding window model for clustering large mixed datasets, in order to reduce the computational complexity and time. The effectiveness of the algorithm is verified on several UCI datasets.
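
The steps named in the abstract (a mixed-data distance, density peaks, assignment to the nearest neighbor of higher density) can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The abstract does not give the distance metric or the self-selection rule, so the sketch assumes a Gower-style distance (range-scaled absolute difference on numeric columns, simple mismatch on categorical columns) and picks centers as the points with the largest product of density and separation, with the number of clusters passed in by hand. The names `mixed_distance`, `density_peaks`, `d_c`, and `n_clusters` are hypothetical.

```python
import numpy as np

def mixed_distance(X_num, X_cat):
    """Assumed Gower-style distance, NOT the thesis's metric.

    X_num: (n, p) float array of continuous variables.
    X_cat: (n, q) array of discrete labels.
    Returns an (n, n) matrix with entries in [0, 1].
    """
    span = X_num.max(axis=0) - X_num.min(axis=0)
    span[span == 0] = 1.0  # avoid division by zero on constant columns
    num = np.abs(X_num[:, None, :] - X_num[None, :, :]) / span
    cat = (X_cat[:, None, :] != X_cat[None, :, :]).astype(float)
    total = num.sum(axis=2) + cat.sum(axis=2)
    return total / (X_num.shape[1] + X_cat.shape[1])

def density_peaks(D, d_c, n_clusters):
    """Density-peaks clustering on a precomputed distance matrix D."""
    n = D.shape[0]
    # Cutoff-kernel density: number of other points closer than d_c.
    rho = (D < d_c).sum(axis=1) - 1
    # Process points from highest to lowest density (ties broken by index).
    order = np.argsort(-rho, kind="stable")
    delta = np.zeros(n)
    parent = np.full(n, -1)
    delta[order[0]] = D[order[0]].max()
    for rank in range(1, n):
        i = order[rank]
        higher = order[:rank]  # points ranked as denser than i
        j = higher[np.argmin(D[i, higher])]
        delta[i] = D[i, j]
        parent[i] = j
    # Stand-in for self-selection: take the largest rho * delta values.
    gamma = rho * delta
    gamma[order[0]] = np.inf  # the densest point is always a center
    centers = np.argsort(-gamma, kind="stable")[:n_clusters]
    labels = np.full(n, -1)
    labels[centers] = np.arange(n_clusters)
    # Each remaining point inherits the label of its denser nearest neighbor.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, centers

# Toy mixed dataset: one continuous and one discrete variable.
X_num = np.array([[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]])
X_cat = np.array([["a"], ["a"], ["a"], ["b"], ["b"], ["b"]])
D = mixed_distance(X_num, X_cat)
labels, centers = density_peaks(D, d_c=0.1, n_clusters=2)
```

The full distance matrix `D` has n × n entries, which is the quadratic growth the abstract cites as the motivation for the sliding window model; the sketch does not attempt the windowed variant.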

Keywords/Search Tags:

mixed data, fast density-based clustering, big data, window model

Related items

1	Research On Mixed Data Clustering Methods Based On Density Peaks And Dimensional Probability Model
2	Research On Density-based Subspace Clustering Algorithm For Data Streams
3	Research On Density-Based Subspace Clustering Algorithm For Data Streams
4	Research On Clustering Ensemble Of Mixed Data And Clustering Algorithm Of Mixed Data Streams
5	Algorithm For Clustering Data Streams Based On Density Units Covered
6	Study On Clustering For Large Data Sets And Its Applications
7	Research On Density Data Stream Clustering Algorithm Based On Sliding Window
8	Research On Data And Data Stream Clustering Algorithms For Mixed Attributes
9	Based On Sliding Window And The Grid Density Data Stream Clustering Algorithm Research
10	Research On Data Stream Clustering Algorithm Based On Density Grid Over Sliding Window