The Research On Fuzzy C-means Algorithm

Posted on:2011-08-20

Degree:Master

Type:Thesis

Country:China

Candidate:J Y Cai

Full Text:PDF

GTID:2178330332956554

Subject:Computer software and theory

Abstract/Summary:

With the development of database technology and the widespread use of database management systems, organizations have accumulated huge volumes of data. Data mining was proposed to extract useful information from these resources and make better use of them. It combines traditional data analysis methods with sophisticated algorithms for processing massive data, and is an advanced area of information and database technology.

As one of the main methods of data mining, cluster analysis partitions a data set into meaningful groups, or clusters. Among the many clustering algorithms, fuzzy clustering is a current research hotspot. This paper studies the classical fuzzy c-means (FCM) algorithm and proposes improvements that address its shortcomings. Experimental results illustrate their effectiveness and feasibility.

This paper systematically analyzes the FCM algorithm and the basic principles of the Mahalanobis distance, uses the advantages of the Mahalanobis distance to remedy defects in FCM, and uses optimized kernel PCA (KPCA) to extract features. FCM is improved in three respects.

First, FCM is based on the Euclidean distance function, which can only detect clusters with a spherical structure. In addition, when FCM processes highly correlated data sets, the probability of error increases. To address these two problems, this paper proposes an improved algorithm, fuzzy c-means based on the Mahalanobis distance function (FCM-M), which adds a regulating factor of the covariance matrix for each class to the objective function. By using the Mahalanobis distance, FCM-M effectively overcomes these shortcomings of FCM. Efficient methods exist for the singular value problems that arise when finding the eigenvalues and eigenvectors of a symmetric matrix or computing the pseudo-inverse needed for the Mahalanobis distance.

Second, FCM assumes that all sample features contribute equally to the clustering result and does not consider that different features may affect the result differently. Again, when FCM processes highly correlated data sets, the probability of error increases. To address these two problems, this paper proposes an improved fuzzy clustering algorithm based on a feature-weighted Mahalanobis distance function. By using an adaptive Mahalanobis distance to weight the features, the new algorithm can effectively cluster highly correlated data sets.

Finally, kernel PCA is used to extract features from large-sample, high-dimensional data sets, combined with cultural algorithms (CA) to select an optimal or near-optimal kernel function. FCM based on this method not only effectively extracts nonlinear information from the samples but also reduces their dimensionality. The algorithms above are implemented in MATLAB. Experimental results on clustering UCI data sets and on image segmentation show the expected effect.
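
For orientation, the baseline that the abstract sets out to improve can be sketched in a few lines. The following is a minimal illustration of standard FCM (alternating membership and center updates with squared Euclidean distance), together with a squared Mahalanobis distance for the 2-D case. It is not the thesis's FCM-M, feature-weighted, or KPCA-based algorithm, whose objective functions are not given here. The thesis reports a MATLAB implementation; Python, the function names, the fuzzifier default `m=2.0`, and the stopping rule are all assumptions made for this sketch.

```python
import math

def euclidean_sq(x, v):
    """Squared Euclidean distance, the distance used by standard FCM."""
    return sum((a - b) ** 2 for a, b in zip(x, v))

def mahalanobis_sq_2d(x, v, cov):
    """Squared Mahalanobis distance in 2-D for a covariance [[a, b], [b, d]].

    Illustrative helper only: a singular covariance would need a
    pseudo-inverse, which this sketch does not implement.
    """
    (a, b), (_, d) = cov
    det = a * d - b * b
    if abs(det) < 1e-12:
        raise ValueError("singular covariance matrix")
    dx, dy = x[0] - v[0], x[1] - v[1]
    # (dx, dy) * inv(cov) * (dx, dy)^T, with inv(cov) = [[d, -b], [-b, a]] / det
    return (d * dx * dx - 2 * b * dx * dy + a * dy * dy) / det

def update_memberships(points, centers, m=2.0, dist_sq=euclidean_sq):
    """Membership of point k in cluster i: d_ik^(-2/(m-1)), normalized over clusters."""
    power = 1.0 / (m - 1.0)
    u = []
    for x in points:
        d2 = [dist_sq(x, v) for v in centers]
        if any(d == 0.0 for d in d2):
            # The point coincides with a center: share full membership
            # among the coinciding centers to avoid division by zero.
            hits = [1.0 if d == 0.0 else 0.0 for d in d2]
            u.append([h / sum(hits) for h in hits])
        else:
            w = [d ** -power for d in d2]
            u.append([wi / sum(w) for wi in w])
    return u

def update_centers(points, u, m=2.0):
    """Each center is the mean of the points weighted by membership**m."""
    dim = len(points[0])
    centers = []
    for i in range(len(u[0])):
        w = [row[i] ** m for row in u]
        total = sum(w)
        centers.append([sum(wk * x[j] for wk, x in zip(w, points)) / total
                        for j in range(dim)])
    return centers

def fcm(points, centers, m=2.0, tol=1e-6, max_iter=100):
    """Alternate the two updates until no center moves by more than tol."""
    for _ in range(max_iter):
        u = update_memberships(points, centers, m)
        new_centers = update_centers(points, u, m)
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, update_memberships(points, centers, m)
```

Calling `fcm(points, initial_centers)` returns the final centers and a membership matrix whose rows sum to 1. Passing a Mahalanobis-style function as `dist_sq` changes only the membership step; a full Mahalanobis variant would also have to re-estimate a covariance matrix per cluster, which this sketch leaves out.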

Keywords/Search Tags:

Fuzzy theory, Fuzzy c-means, Mahalanobis distances, Cultural Algorithms

Related items

1	Research On Partition-Based Online Clustering Algorithms
2	The Study Of Fuzzy C-means Algorithm Incorporating Spatial Information For Brain MR Image Segmentation
3	The Improved Fuzzy C-Means Clustering For Noisy Image Segmentation
4	Research Of Image Segmentation Based On Fuzzy Set Theory
5	Research Of New Fuzzy Clustering Algorithms Based On Objective Function And Its Applications
6	Pixel Based Image Segmentation Method Research Using Fuzzy Theory
7	Research Of Key Techniques In Fuzzy Clustering Based On Objective Function
8	Research On Robust Fuzzy Clustering Algorithm Based On Feature Selection
9	Research On Gall-stone CT Image Segmentation Based On Fuzzy Clustering Algorithm
10	Research On Some Fuzzy C Means Clustering Algorithm And Its Application In IDS