A Preliminary Study Of Clustering Method For Uncertain Data

Posted on:2021-05-21

Degree:Master

Type:Thesis

Country:China

Candidate:G Y Wang

GTID:2428330620972191

Subject:Computer technology

Abstract/Summary:

Data today is massive and diverse, which makes data mining and cluster analysis difficult. Real-world data is also uncertain, which makes valuable information harder to obtain. How to extract valuable information from uncertain data sets has therefore become a research focus in recent years. Uncertainty in data is mainly divided into existence-level and attribute-level uncertainty. To better understand these two kinds of uncertain data, this thesis makes the following contributions.

First, Chapter 3 proposes ULDC, a clustering algorithm for uncertain data based on local density. A study of existing density-based clustering algorithms for uncertain data shows that some of them have shortcomings on such data, and ULDC is designed to address these. The chapter first improves the similarity measure between uncertain data objects, then introduces the concepts behind ULDC, such as local density and data chains, and finally describes the overall procedure of the algorithm. Compared with DBSCAN and similar algorithms, ULDC needs fewer parameters and its clustering results are less sensitive to them. In experiments, ULDC achieves F1 values of 0.8876 and 0.9086 on the Iris and Connect-4 data sets, respectively, indicating good clustering quality.

Second, Chapter 4 proposes UBFCM, a clustering algorithm for uncertain data obtained by improving the fuzzy c-means algorithm. The motivation is that real-world data objects are generally uncertain and the boundaries between them are fuzzy. The chapter first explains the principle of fuzzy c-means in detail as a foundation, then defines the uncertain data clustering model. Replacing each uncertain data object with its centroid simplifies the clustering algorithm. Finally, a new similarity calculation between uncertain data objects is used to improve clustering quality. On the Iris, Wine, and Glass data sets, UBFCM achieves F1 values of 0.8965, 0.7642, and 0.6248, respectively, all higher than those of the UK-means algorithm, which supports the effectiveness of the approach.
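
The centroid-replacement idea can be illustrated with a minimal sketch. It is not the thesis's UBFCM algorithm: the abstract does not specify its similarity measure, so the sketch assumes plain squared Euclidean distance and the standard fuzzy c-means updates. It also assumes each uncertain object is given as a small list of sample points. The names `centroid`, `sq_dist`, and `fuzzy_c_means` and the toy data are hypothetical.

```python
import random

def centroid(samples):
    """Mean of the sample points that describe one uncertain object (assumed representation)."""
    dim = len(samples[0])
    return [sum(s[d] for s in samples) / len(samples) for d in range(dim)]

def sq_dist(a, b):
    """Squared Euclidean distance; a stand-in for the thesis's unspecified similarity measure."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means on crisp points; returns (centers, memberships)."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Start from random memberships, with each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(iters):
        # Center update: mean of the points weighted by membership^m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(wi * p[d] for wi, p in zip(w, points)) / total for d in range(dim)]
            )
        # Membership update: proportional to squared distance^(-1/(m-1)).
        shift = 0.0
        for i in range(n):
            d2 = [max(sq_dist(points[i], ctr), 1e-12) for ctr in centers]
            inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
            total = sum(inv)
            new_row = [v / total for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row
        if shift < tol:
            break
    return centers, u

# Toy uncertain objects: each is a list of 2-D sample points (made-up data).
objects = [
    [[0.9, 1.1], [1.1, 0.9]],
    [[1.4, 1.0], [1.0, 1.2]],
    [[5.0, 5.2], [5.0, 4.8]],
    [[5.4, 5.0], [5.0, 5.4]],
]
points = [centroid(o) for o in objects]      # replace each object by its centroid
centers, u = fuzzy_c_means(points, c=2)
labels = [max(range(2), key=lambda j: row[j]) for row in u]
print(labels)
```

Collapsing each object to its centroid means the clustering loop handles one point per object instead of a full sample set or distribution. That is the simplification the abstract describes. The cost is that the spread of each object is ignored, which is presumably what the thesis's improved similarity measure is meant to compensate for.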

Keywords/Search Tags:

uncertain data, clustering, relative density, fuzzy c-means

Related items

1	Research On Clustering Algorithm Of Uncertain Data
2	Research On Risk Degree-Based Safe Semi-Supervised Fuzzy Clustering Algorithm
3	Research On Dynamic Clustering And Incremental In Data Mining
4	Non-uniform Data Clustering Method Based On Relative Density
5	Research On Blocking Fuzzy Clustering Algorithm Based On Density Of Samples
6	Research On Support Vector Data Description Based On Relative Density Degree
7	The Application Of Improved Fuzzy C Means Clustering Algorithm In Image Segmentation
8	Study And Analysis On Clustering Algorithm In Data Mining
9	Research And Application Of Remote Sensing Image Clustering Based On The Improved Fuzzy C-means Algorithm
10	Research Of Clustering Algorithm Based On Relative Density