Research On Fuzzy Clustering Analysis Algorithm Based On Density

Posted on: 2013-10-06
Degree: Master
Type: Thesis
Country: China
Candidate: N N Ma
GTID: 2248330392954338
Subject: Computer application technology
Abstract/Summary:
Clustering analysis is an important branch of unsupervised pattern recognition and has been widely applied in many fields as an important data mining tool. It partitions unlabeled data into several categories according to a chosen criterion, so that similar data objects fall into the same class as far as possible and dissimilar data objects fall into different classes. Many real-world problems involve uncertainty and fuzziness, which motivated fuzzy clustering analysis. Fuzzy clustering analysis has developed into a very active research area within clustering analysis and has been applied successfully in many fields, such as taxonomy, geology, the financial industry, marketing, pattern recognition, and image segmentation. Fuzzy clustering analysis therefore has broad scope for research and application.

This thesis covers the following two aspects:

(1) Based on a study of the objective-function-based fuzzy c-means (FCM) clustering algorithm, the selection of the initial cluster centers in FCM is improved. FCM depends strongly on the initial cluster centers, so finding good initial centers quickly and accurately helps to obtain ideal clustering results. The traditional FCM algorithm selects the initial cluster centers randomly, so its clustering results are themselves random. This thesis proposes using a Gaussian density function to compute the initial cluster centers. The method is as follows: with DMax/C as the constraint condition, where DMax is the maximum distance between data points in the data space, select the first c points in order of decreasing density as the initial cluster centers, and then start FCM clustering.
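The selection step described in (1) can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the thesis implementation. The abstract does not give the density formula or say how the DMax/C constraint is applied. The sketch assumes the density of a point is the sum of Gaussian kernels over all points, that C is the number of clusters, and that DMax/C acts as a minimum separation between chosen centers. The class name `DensityInit`, its method names, and the bandwidth parameter `sigma` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class DensityInit {

    /** Euclidean distance between two points of equal dimension. */
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int k = 0; k < a.length; k++) {
            double diff = a[k] - b[k];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    /** Assumed density of point i: sum over all j of exp(-d(i,j)^2 / (2 * sigma^2)). */
    static double[] gaussianDensity(double[][] data, double sigma) {
        double[] density = new double[data.length];
        for (int i = 0; i < data.length; i++) {
            for (int j = 0; j < data.length; j++) {
                double d = distance(data[i], data[j]);
                density[i] += Math.exp(-(d * d) / (2.0 * sigma * sigma));
            }
        }
        return density;
    }

    /**
     * Returns the indices of up to c points, visited in order of decreasing
     * density, such that every chosen point lies at least DMax / c away from
     * the points chosen before it (assumed reading of the constraint).
     */
    static int[] selectInitialCenters(double[][] data, int c, double sigma) {
        int n = data.length;
        double[] density = gaussianDensity(data, sigma);

        // DMax: the largest pairwise distance in the data set.
        double dMax = 0.0;
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                dMax = Math.max(dMax, distance(data[i], data[j]));
            }
        }
        double minGap = dMax / c;

        // Sort point indices by density, highest first.
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) {
            order[i] = i;
        }
        Arrays.sort(order, Comparator.comparingDouble((Integer i) -> -density[i]));

        List<Integer> chosen = new ArrayList<>();
        for (int candidate : order) {
            if (chosen.size() == c) {
                break;
            }
            boolean farEnough = true;
            for (int previous : chosen) {
                if (distance(data[candidate], data[previous]) < minGap) {
                    farEnough = false;
                    break;
                }
            }
            if (farEnough) {
                chosen.add(candidate);
            }
        }

        int[] result = new int[chosen.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = chosen.get(i);
        }
        return result;
    }

    public static void main(String[] args) {
        double[][] data = {
            {0.0, 0.0}, {0.1, 0.0}, {0.0, 0.1},
            {5.0, 5.0}, {5.1, 5.0}, {5.0, 5.1}, {5.1, 5.1}
        };
        System.out.println(Arrays.toString(selectInitialCenters(data, 2, 1.0)));
    }
}
```

The minimum-separation check keeps several centers from being drawn out of the same dense region. If fewer than c points satisfy it, the sketch returns fewer than c indices, and a caller would need a fallback.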
Experiments show that, compared with randomly selected initial cluster centers, the initial cluster centers computed by the improved algorithm are closer to the true cluster centers.

(2) The design and implementation of a fuzzy c-means clustering algorithm weighted by a density function (DFCM). The algorithm takes into account the natural distribution of the data: a point surrounded by many other points has a large density value, whereas a point surrounded by few other points has a small density value. The Gaussian density function value of each data object is calculated and normalized to serve as a weight, and these weights are added to the traditional fuzzy c-means algorithm, which yields the density-function-weighted fuzzy c-means algorithm. This finds the natural structure of the data set more reasonably and overcomes the shortcoming of determining the membership of each data point from distance alone. DFCM is implemented in Java and tested on simulated two-dimensional data sets, the IRIS data set from the UCI data sets, and the high-dimensional Wine data set. Experiments show that the memberships assigned by the improved algorithm reflect the distribution of the data points more effectively. Data points in the crowded part of a high-density cluster receive high membership degrees; data points in the sparse part of a high-density cluster receive relatively small membership values; and data points in the sparse part of a low-density cluster receive smaller values still. Therefore, a properly chosen membership threshold can effectively distinguish genuine clusters from noise data points.
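One way to add normalized density weights to FCM is sketched below in Java. The abstract does not state the DFCM update equations, so this sketch makes an assumption: the weights w_j enter only the center update, v_i = sum_j w_j u_ij^m x_j / sum_j w_j u_ij^m, while the membership update is the standard FCM rule. The thesis may integrate the weights differently, for example in the membership update itself. The class name `DensityWeightedFcm` and all method names are hypothetical.

```java
final class DensityWeightedFcm {

    /** Squared Euclidean distance between two points of equal dimension. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int k = 0; k < a.length; k++) {
            double diff = a[k] - b[k];
            sum += diff * diff;
        }
        return sum;
    }

    /** Scales raw density values so that they sum to 1. */
    static double[] normalize(double[] density) {
        double total = 0.0;
        for (double d : density) {
            total += d;
        }
        double[] weights = new double[density.length];
        for (int j = 0; j < density.length; j++) {
            weights[j] = density[j] / total;
        }
        return weights;
    }

    /** Standard FCM membership update; u[i][j] is the membership of point j in cluster i. */
    static double[][] updateMemberships(double[][] data, double[][] centers, double m) {
        int c = centers.length;
        int n = data.length;
        double[][] u = new double[c][n];
        double exponent = 1.0 / (m - 1.0); // applied to ratios of squared distances
        for (int j = 0; j < n; j++) {
            double[] d2 = new double[c];
            int hit = -1;
            for (int i = 0; i < c; i++) {
                d2[i] = squaredDistance(data[j], centers[i]);
                if (d2[i] == 0.0) {
                    hit = i;
                }
            }
            if (hit >= 0) { // point coincides with a center: full membership there
                u[hit][j] = 1.0;
                continue;
            }
            for (int i = 0; i < c; i++) {
                double sum = 0.0;
                for (int k = 0; k < c; k++) {
                    sum += Math.pow(d2[i] / d2[k], exponent);
                }
                u[i][j] = 1.0 / sum;
            }
        }
        return u;
    }

    /** Assumed weighted center update: each point contributes w[j] * u[i][j]^m. */
    static double[][] updateCenters(double[][] data, double[][] u, double[] w, double m) {
        int c = u.length;
        int dim = data[0].length;
        double[][] centers = new double[c][dim];
        for (int i = 0; i < c; i++) {
            double denom = 0.0;
            for (int j = 0; j < data.length; j++) {
                double coef = w[j] * Math.pow(u[i][j], m);
                denom += coef;
                for (int k = 0; k < dim; k++) {
                    centers[i][k] += coef * data[j][k];
                }
            }
            for (int k = 0; k < dim; k++) {
                centers[i][k] /= denom; // assumes cluster i has nonzero total weight
            }
        }
        return centers;
    }

    /** Alternates the two updates for a fixed number of iterations. */
    static double[][] run(double[][] data, double[][] initialCenters,
                          double[] w, double m, int iterations) {
        double[][] centers = initialCenters;
        for (int t = 0; t < iterations; t++) {
            double[][] u = updateMemberships(data, centers, m);
            centers = updateCenters(data, u, w, m);
        }
        return centers;
    }
}
```

Under this assumption, points in dense regions pull the centers more strongly than isolated points. The per-point memberships still sum to 1 across clusters, so the membership thresholding described in the abstract would need the weights to enter the memberships as well. A convergence test on the change in centers could replace the fixed iteration count.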
Keywords/Search Tags: Fuzzy clustering analysis, Fuzzy C-means clustering algorithm, Gaussian density function, Membership density weighted