Research On Global Fuzzy Clustering Algorithm

Posted on: 2019-12-22
Degree: Master
Type: Thesis
Country: China
Candidate: J Q Tan
Full Text: PDF
GTID: 2428330542972972
Subject: Computer Science and Technology
Abstract/Summary:
Cluster analysis is one of the most important branches of machine learning. As research on spatial clustering has developed and deepened, fuzzy clustering has emerged from it as a way to reflect the real world more objectively. The fuzzy c-means (FCM) algorithm is a partition-based fuzzy clustering algorithm. It is unsupervised, simple, and easy to implement, and it has gradually become one of the research hotspots in cluster analysis. However, FCM relies heavily on the choice of initial cluster centers and is sensitive to outliers and noise points, so its results easily fall into a local optimum. Moreover, the optimal number of clusters is not known in advance. For these reasons, global fuzzy clustering has become an important research topic. Its main idea is to decompose the task of clustering the data into c clusters into a series of sub-clustering tasks. Because the result takes the global distribution of the data into account, the algorithm can escape local optima, which makes it useful for exploring spatial data globally.

Through a systematic analysis of the principles behind global fuzzy clustering, we find that existing algorithms still suffer from several problems: complicated formulas that lead to a heavy computational burden, sensitivity to noise points and outliers, difficulty in determining the initial cluster centers, and an unpredictable optimal number of clusters. To address these shortcomings, this thesis studies the algorithm from the following two aspects in order to make it more practical.

First, to address the computational complexity, robustness, and initial-center problems of existing global fuzzy clustering algorithms, this thesis proposes a fast global center fuzzy clustering algorithm based on a new metric (AM). The first initial center is chosen according to the idea of concentration, that is, a cluster center is usually located in a region of higher density. To this end, the distance-k circle ratio (DKC) is proposed to find regions where sample points are densely distributed. Data points with larger DKC values are removed from the set of candidate cluster centers to reduce the computational cost, and because the DKC formula is relatively simple, the computational burden is reduced further. Next, the AM measure is introduced in place of the Euclidean distance. Because the AM measure is bounded and increases monotonically but slowly, it makes the algorithm more robust and reduces the influence of isolated points on the clustering result. Finally, combining the advantages of DKC and AM, a new function is defined to determine the best initial center for the next cluster. This function quickly and accurately selects, as the next initial center, a sample point that lies in a relatively dense region and is far from the existing cluster centers, thereby avoiding the influence of noise points and improving clustering accuracy.

Second, to address the difficulty of determining the number of clusters, existing validity (effectiveness) indices for fuzzy clustering are studied and improved. Evaluating the result of a fuzzy clustering algorithm requires considering the fuzzy membership degree of each data point, the distances between cluster centers, and so on, so more attention must be paid to the overall distribution of the data set. This thesis therefore proposes a new fuzzy clustering validity index that combines an improved compactness measure, a separation measure, and the partition coefficient. The compactness measure reflects how tightly the data points within a class are grouped by computing the within-class error; the separation measure reflects the degree of dispersion among clusters by computing the difference between every pair of fuzzy classes; and the partition coefficient reflects the clarity of the clustering result through the membership degrees. When the compactness value is smaller, the separation is larger, and the partition is clearer, the clustering is better, and the number of clusters in the data can be determined more accurately. Combined with the fuzzy clustering algorithm proposed in this thesis, the index makes the procedure truly unsupervised. Experimental results show that the validity index performs well in both reliability and robustness.
Keywords/Search Tags: fuzzy clustering, global idea, density, measure, efficiency
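
For reference, the baseline that the abstract builds on can be sketched as follows. This is a minimal pure-Python sketch of textbook FCM with Euclidean distance, together with the classical partition coefficient; it is not the global algorithm proposed in the thesis. The AM measure, the DKC value, and the proposed validity index are not defined in this abstract, so they are not reproduced here. The names `fcm` and `partition_coefficient` and the parameters `m`, `max_iter`, and `tol` are illustrative assumptions.

```python
import math


def fcm(points, centers, m=2.0, max_iter=100, tol=1e-6):
    """Textbook fuzzy c-means with Euclidean distance (illustrative sketch).

    points  : list of equal-length tuples of floats
    centers : initial cluster centers; the result depends on this choice
    m       : fuzzifier, must be > 1
    Returns (centers, u), where u[i][j] is the membership of points[i]
    in cluster j, computed against the centers of the last iteration.
    """
    centers = [tuple(v) for v in centers]
    dim = len(points[0])
    exponent = 2.0 / (m - 1.0)
    u = []
    for _ in range(max_iter):
        # Membership update: u_ij = 1 / sum_k (d_ij / d_ik)^(2 / (m - 1))
        u = []
        for p in points:
            dists = [math.dist(p, v) for v in centers]
            zeros = sum(1 for d in dists if d == 0.0)
            if zeros:
                # The point coincides with a center: split full membership
                # among the coinciding centers to avoid division by zero.
                row = [1.0 / zeros if d == 0.0 else 0.0 for d in dists]
            else:
                row = [1.0 / sum((dj / dk) ** exponent for dk in dists)
                       for dj in dists]
            u.append(row)

        # Center update: mean of the points weighted by u_ij^m
        new_centers = []
        for j, old in enumerate(centers):
            weights = [row[j] ** m for row in u]
            total = sum(weights)
            if total == 0.0:
                new_centers.append(old)  # empty cluster: keep the old center
                continue
            new_centers.append(tuple(
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dim)))

        # Stop once no center moved by more than tol
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, u


def partition_coefficient(u):
    """Classical partition coefficient: mean over points of sum_j u_ij^2.

    Ranges from 1/c (maximally fuzzy) to 1 (crisp partition).
    """
    return sum(x * x for row in u for x in row) / len(u)


# Usage on a toy data set with two well-separated groups (assumed data).
data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
        (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
final_centers, memberships = fcm(data, [(0.0, 0.0), (10.0, 10.0)])
print(final_centers, partition_coefficient(memberships))
```

Because the initial centers are passed in explicitly, running the sketch with different starting centers is a simple way to observe the sensitivity to initialization that motivates the global approach described above.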