Font Size: a A A

The Research On Non-Spherical Clustering Algorithm Based On Grid Partition

Posted on:2019-04-19Degree:MasterType:Thesis
Country:ChinaCandidate:L L ZhangFull Text:PDF
GTID:2348330542489043Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
According to different clustering modes,clustering algorithms can be divided into spherical clustering algorithms and non-spherical clustering algorithms.Non-spherical clustering algorithms refer to the automatic detection of arbitrary shaped clusters in data sets.The most representative non-spherical clustering algorithm is DBSCAN.DBSCAN can identify the cluster of any shape,but is very sensitive to the parameters.As spatial data can be pre-divided by grid,some scholars propose clustering analysis on the basis of grid.However,such methods are very strict on the requirements of grid partition,and different grid granularities will result in obviously different clustering results.When the data dimension is too high,the grid-based method is severely limited by the dimension-ality disaster problem.In order to solve the above problems,this paper proposes a non-spherical cluster-ing algorithm based on grid partition.Differing from the DBSCAN and the grid-based algorithms,the proposed algorithm is divided into three steps:adaptive grid generation,agglomerative clustering for high-density region and class identification for sparse area.The paper first gives an adaptive determination method for grid width,which can get an approximately optimal grid structure.After getting the grid structure,non-spherical clustering includes aggregation of high-density grids and mean shifting of low-density grids.This method can effectively determine the category of border area and can detect sparse clusters.In addition,the paper proposes a non-spherical clustering algorithm based on multi-subspace structure,which extends the clustering algorithm proposed in the paper from low-dimensional field to higher-dimensional field.The paper conducts numerical experiments on four synthetic datasets and four real data sets,and analyzes the parameters and performance of the algorithm.In order to il-lustrate the effectiveness of the algorithm,the paper conducts a lot of contrast experi-ments.The result proves that the proposed algorithm has some advantages in terms of running time,clustering accuracy and algorithm scalability.
Keywords/Search Tags:Clustering Analysis, Grid Partition, Non-Spherical Clustering, Multi-Subspace
PDF Full Text Request
Related items