
A Density-based Clustering Algorithm Of Uncertain Data

Posted on: 2016-11-19
Degree: Master
Type: Thesis
Country: China
Candidate: H P Wang
Full Text: PDF
GTID: 2348330542476243
Subject: Computer Science and Technology
Abstract/Summary:
Uncertain data processing and data mining technologies have been widely applied in many fields. Where the two meet, clustering algorithms for uncertain data have become a major research focus. Because uncertain data clustering has been studied for a relatively short time, most existing algorithms are deterministic clustering algorithms adapted to the characteristics of uncertain data, and mature uncertain data clustering algorithms are rare. As uncertain data continues to be generated and its applications develop, research on uncertain data clustering algorithms has become increasingly urgent.

Based on an analysis of the characteristics of uncertain data, related uncertain data processing techniques, density-based clustering algorithms for certain data and for uncertain data, and other related theory and technology, this thesis puts forward an improved general procedure for density-based clustering of uncertain data. Combining this procedure with the newly introduced concepts of probability radius and information entropy, the PRE-DBSCAN algorithm is proposed. First, existing algorithms do not take the uncertainty of the data itself into account when delimiting the ε-neighborhood, so the extent of the ε-neighborhood of an uncertain object is inaccurate. To address this, the thesis defines uncertain data objects according to their degree of importance and introduces the concept of a probability radius based on the characteristics of uncertain data objects. The probability radius limits and constrains the range of an uncertain object's neighborhood, which improves the accuracy of the neighborhood. Second, because the core-object conditions of existing algorithms are not precise enough, the concept of information entropy is introduced in light of the characteristics of uncertain data: the minimum information entropy MinEn and the minimum number of points in the neighborhood MinPts are used jointly to decide whether an object is a core object, which reduces the uncertainty of core objects. Finally, in contrast to the index techniques of the existing PDBSCAN and FDBSCAN algorithms, PRE-DBSCAN uses an R*-tree index to handle uncertain data and improve efficiency. A description and pseudocode of the proposed PRE-DBSCAN algorithm are given.

Simulation experiments tested the algorithm on uncertain data clustering and compared it with the existing FDBSCAN and PDBSCAN algorithms. The results show that the proposed PRE-DBSCAN algorithm is well suited to clustering uncertain data, achieves higher clustering accuracy and efficiency, and handles multidimensional data better.
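The joint core-object test described in the abstract (MinPts together with an entropy threshold) can be illustrated with a minimal Python sketch. This is not the thesis's PRE-DBSCAN pseudocode, which the abstract does not reproduce. Everything specific here is an assumption made for illustration: each uncertain object is modeled as a list of weighted sample points, neighborhood membership is a probability computed over sample pairs, the entropy is the mean binary Shannon entropy of those membership probabilities, and `entropy_threshold` is a hypothetical stand-in for MinEn. The abstract also does not say in which direction MinEn is compared; the sketch assumes that lower entropy means a more certain neighborhood. The probability radius and the R*-tree index are left out.

```python
import math


def binary_entropy(p):
    """Shannon entropy in bits of a yes/no event that occurs with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def neighbor_probability(samples_a, samples_b, eps):
    """Probability that two uncertain objects lie within eps of each other.

    Assumption: each object is a list of (point, weight) pairs whose
    weights sum to 1, i.e. a discrete approximation of its distribution.
    """
    total = 0.0
    for point_a, weight_a in samples_a:
        for point_b, weight_b in samples_b:
            if math.dist(point_a, point_b) <= eps:
                total += weight_a * weight_b
    return total


def is_core_object(index, objects, eps, min_pts, entropy_threshold):
    """Joint core-object test (illustrative, not the thesis's definition).

    The object counts as a core object when its expected number of
    eps-neighbors reaches min_pts AND the mean entropy of its possible
    neighbors' membership probabilities stays at or below
    entropy_threshold (hypothetical stand-in for MinEn; the comparison
    direction is an assumption).
    """
    probs = [
        neighbor_probability(objects[index], other, eps)
        for j, other in enumerate(objects)
        if j != index
    ]
    candidates = [p for p in probs if p > 0.0]
    if not candidates:
        return False
    expected_neighbors = sum(probs)
    mean_entropy = sum(binary_entropy(p) for p in candidates) / len(candidates)
    return expected_neighbors >= min_pts and mean_entropy <= entropy_threshold


if __name__ == "__main__":
    # Two certain objects and one object that is near the origin only half the time.
    objects = [
        [((0.0, 0.0), 1.0)],
        [((0.5, 0.0), 1.0)],
        [((0.0, 0.5), 0.5), ((5.0, 5.0), 0.5)],
    ]
    print(is_core_object(0, objects, eps=1.0, min_pts=1, entropy_threshold=0.6))
```

In this toy setup, tightening `entropy_threshold` rejects objects whose neighborhoods are dominated by "maybe" neighbors even when the expected neighbor count is high enough, which is one plausible reading of how an entropy condition could reduce the uncertainty of core objects.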
Keywords/Search Tags: Uncertain data, density-based, clustering, probability radius, information entropy