
Research On Density Clustering Algorithm Based On DBSCAN For Personalized Clustering

Posted on: 2019-04-28
Degree: Master
Type: Thesis
Country: China
Candidate: Z P Guo
GTID: 2428330590950604
Subject: Software engineering
Abstract/Summary:
In the era of big data, data holds great value. Machine learning has shown excellent results in data mining and has gradually become its main technology. Clustering is one of its significant techniques, with a wide range of applications in scenarios such as product recommendation and numerical prediction. In these scenarios, however, data values span a very broad range, and services are customized and personalized. Commercial data in particular contains a large amount of data that becomes gradually sparser. Personalized clustering on such datasets requires an algorithm that not only suits non-uniform big data but also produces diversified results with high homogeneity. The traditional DBSCAN algorithm struggles to meet these needs. To address them, this paper presents a Constrained Extension and Adaptive Varied algorithm based on DBSCAN (CEAV-DBSCAN). For non-uniform-density data with large, positive values, the paper adds a scale factor to the theory of the DBSCAN algorithm and changes how the neighborhood is calculated, so that the algorithm sets its neighborhood adaptively. To personalize the clustering results, that is, to narrow the groups so that categories are more homogeneous and more diverse, the paper analyzes the class-merging principle of DBSCAN and adds a homogeneity factor, which is checked before categories are merged. This makes the clustering controllable, highly homogeneous, and diverse. Finally, the performance of CEAV-DBSCAN is evaluated: clustering experiments are conducted on the D31 and R15 datasets, and an application experiment is run on a real dataset of credit card users. The experimental results show that, for personalized clustering of non-uniform-density datasets with large, positive, and sparse values, CEAV-DBSCAN achieves higher homogeneity and diversity than DBSCAN. It is suited to clustering problems in which data values are broadly positive and gradually sparse, and to clustering problems in personalized-service scenarios with diversity requirements.
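The abstract does not give the formulas behind the scale factor or the homogeneity check, so the following is only a minimal sketch of the two ideas, not the thesis's CEAV-DBSCAN. It assumes, purely for illustration, that the neighborhood radius grows linearly with a point's magnitude (`base_eps + scale * ||x||`) and that homogeneity is enforced by refusing to absorb points farther than `max_spread` from the cluster's seed point. The function names and the parameters `base_eps`, `scale`, and `max_spread` are hypothetical.

```python
import math

def adaptive_eps(point, base_eps, scale):
    """Radius that grows with the point's magnitude (assumed linear form)."""
    return base_eps + scale * math.hypot(*point)

def region_query(points, i, base_eps, scale):
    """Indices of all points inside the adaptive neighborhood of points[i]."""
    eps = adaptive_eps(points[i], base_eps, scale)
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def scaled_dbscan(points, base_eps, scale, min_pts, max_spread):
    """DBSCAN-style clustering with an adaptive radius and a spread cap.

    Returns one label per point; -1 marks noise. This is an illustrative
    variant under the assumptions stated above, not the published algorithm.
    """
    labels = [None] * len(points)   # None = not yet visited
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, base_eps, scale)
        if len(neighbors) < min_pts:
            labels[i] = -1          # provisional noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        seed = points[i]
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] is not None and labels[j] != -1:
                continue            # already assigned to some cluster
            # Assumed homogeneity check: do not absorb points that lie
            # too far from the seed, which keeps each cluster narrow.
            if math.dist(seed, points[j]) > max_spread:
                continue
            labels[j] = cluster
            reachable = region_query(points, j, base_eps, scale)
            if len(reachable) >= min_pts:
                queue.extend(reachable)   # j is a core point: keep expanding
    return labels

# A tight group near the origin and a sparser group at large values.
data = [(1.0, 1.0), (1.2, 1.0), (1.0, 1.2), (1.2, 1.2),
        (100.0, 100.0), (104.0, 100.0), (100.0, 104.0), (104.0, 104.0)]
fixed = scaled_dbscan(data, base_eps=0.5, scale=0.0, min_pts=3, max_spread=10.0)
adaptive = scaled_dbscan(data, base_eps=0.5, scale=0.05, min_pts=3, max_spread=10.0)
```

With `scale=0.0` the radius is fixed and the sparse, large-valued group is labeled as noise; with a positive `scale` the same group forms its own cluster. Lowering `max_spread` splits a long chain of density-connected points into several narrower clusters, which is one simple way to trade cluster size for homogeneity and diversity.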
Keywords/Search Tags: Density clustering, Controllable expansion, DBSCAN algorithm, Datasets with Varied, Personalized clustering