Fuzzy Clustering Analysis And Applications With Optimized Distance Metric

Posted on:2024-08-17

Degree:Master

Type:Thesis

Country:China

Candidate:X C Zhu

Full Text:PDF

GTID:2568307127999579

Subject:Electronic information

Abstract/Summary:

As a form of unsupervised learning, fuzzy clustering divides unlabeled data samples into several classes so that samples within a class are as similar as possible and samples in different classes are as different as possible. The similarity between samples is measured by a distance metric, and different distance metrics can lead to different clustering results. At present, most fuzzy clustering algorithms use the Euclidean distance, which often fails to produce good clustering results when dealing with data whose features differ. Choosing a suitable distance metric can therefore improve the performance of a fuzzy clustering algorithm and yield more accurate classification of the data. The main contents of this thesis are as follows:

(1) The fuzzy C-means clustering algorithm (FCM) is easily affected by noisy data and outliers, so an improved FCM clustering algorithm (IFCM) is proposed that uses a function of the Euclidean distance as a new distance metric. When clustering the X₁₂ dataset, IFCM assigns the maximum fuzzy membership values to the class center points and more even fuzzy membership values to the other points in each class, which indicates improved performance in noisy environments. The IFCM algorithm is compared and analyzed on the IRIS, IRIS-2D, Wine, Apple and Red jujube datasets from three aspects: clustering accuracy, clustering centers and number of iterations. The results show that IFCM achieves the highest clustering accuracies, 92.67%, 90.67%, 81.46%, 88% and 87.78% on the five datasets respectively, and that the final clustering centers generated by IFCM are closer to the real clustering centers.

(2) Inspired by the IFCM algorithm, the same distance metric, a function of the Euclidean distance, is applied to the fuzzy entropy clustering algorithm (FE), yielding an improved fuzzy entropy clustering algorithm (IFE). When clustering the X₁₂ dataset, IFE produces more accurate clustering centers. The clustering accuracy, clustering centers and number of iterations of the IFE algorithm are compared and analyzed on the IRIS, IRIS-2D, Red jujube and Meat datasets. The results show that although IFE requires more iterations, it achieves higher clustering accuracies, 92.67%, 90.67%, 88.33% and 93.33% on the four datasets respectively, and that the final clustering centers generated on the IRIS dataset are closer to the real clustering centers.

(3) To address the low clustering accuracy of the possibilistic fuzzy C-means clustering algorithm (PFCM) on non-hyperspherical datasets with heterogeneous density, a new distance metric is formed by normalizing the distance between the data points and the class centers and incorporating the distance variation of each data point into the original Euclidean distance. An improved possibilistic fuzzy C-means clustering algorithm (PFCM-σ) is proposed based on the new distance metric and the PFCM algorithm. The typicality values generated by the PFCM-σ algorithm for the two noisy data points x₁₉ and x₂₀ in the X₂₀ dataset are much smaller than those of the normal data points, indicating that PFCM-σ can accurately cluster datasets containing noisy data. The clustering accuracy, clustering centers and number of iterations are computed on the IRIS, IRIS-3D, Olive and Meat datasets. The results show that although PFCM-σ requires more iterations, it achieves the highest clustering accuracies, 93.33%, 92.67%, 93.33% and 95.83% on the four datasets respectively, and that the final clustering centers are closer to the real clustering centers.
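
For orientation, the baseline that all three contributions modify can be sketched as follows. This is a minimal pure-Python sketch of the standard FCM algorithm with the squared Euclidean distance, not the IFCM, IFE or PFCM-σ algorithms of the thesis, whose distance metrics are not specified in the abstract. The function names `fcm` and `sq_euclidean`, the parameter names, and the default fuzzifier `m=2.0` are assumptions made for illustration.

```python
import random


def sq_euclidean(x, c):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((a - b) ** 2 for a, b in zip(x, c))


def fcm(data, n_clusters, m=2.0, dist=sq_euclidean, max_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy C-means (illustrative sketch).

    Returns (centers, memberships, iterations). `dist` is assumed to
    return a squared distance; the center update below is the one
    derived for the squared Euclidean distance.
    """
    rng = random.Random(seed)
    n, dim = len(data), len(data[0])

    # Random initial memberships; each row is normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-12 for _ in range(n_clusters)]
        s = sum(row)
        u.append([v / s for v in row])

    centers = []
    it = 0
    for it in range(1, max_iter + 1):
        # Center update: mean of the data weighted by u_ik^m.
        centers = []
        for k in range(n_clusters):
            w = [u[i][k] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * data[i][j] for i in range(n)) / total for j in range(dim)]
            )

        # Membership update: u_ik is proportional to D_ik^(-1/(m-1)),
        # where D_ik is the squared distance from point i to center k.
        new_u = []
        for x in data:
            d = [max(dist(x, c), 1e-12) for c in centers]  # guard against zero distance
            inv = [dk ** (-1.0 / (m - 1.0)) for dk in d]
            s = sum(inv)
            new_u.append([v / s for v in inv])

        # Stop once the largest membership change falls below the tolerance.
        change = max(
            abs(new_u[i][k] - u[i][k]) for i in range(n) for k in range(n_clusters)
        )
        u = new_u
        if change < tol:
            break

    return centers, u, it
```

Calling `fcm(points, 3)` on a list of numeric tuples returns the cluster centers, one membership row per point, and the number of iterations used, which correspond to the three aspects the thesis compares. Swapping `dist` alone changes only the membership step; a different metric generally needs its own center update rule derived from the objective function.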

Keywords/Search Tags:

Machine learning, Fuzzy clustering, Euclidean distance, Distance metric, The similarity

Related items

1	Research On Topology Relation-based Distance Metric And Clustering Algorithms
2	Research On Distance Margin-based Deep Discriminative Clustering Methods
3	Research On Fuzzy Clustering Algorithm Based On Distance Metric
4	Research On Matrix-based 2D Distance Metric Learning And Spatial Euler Kernel With Applications
5	Research On Clustering Algorithm Of High Dimensional Data And Its Distance Metric
6	Metric Learning And Clustering Algorithm Based On Transitive Distance
7	Research On Low-Complexity Distance Metric Learning Algorithms
8	Semantic Distance Metric Learning And Its Application In Multimedia Content Analysis
9	Fuzzy System Modeling Based On Distance Combined Data
10	Research On Distance Metric Learning For Person Re-identification