The Research On Knowledge-Driven Fuzzy Clustering Algorithm

Posted on:2011-02-26

Degree:Master

Type:Thesis

Country:China

Candidate:J Zhao

GTID:2178360302999293

Subject:Operational Research and Cybernetics

Abstract/Summary:

Clustering is a broadly accepted synonym for fundamental endeavors aimed at finding patterns in data. This study addresses how auxiliary hints that are available as part of domain knowledge can be exploited and effectively incorporated into the pattern recognition problem at hand.

First, a new knowledge-driven clustering algorithm named Proximity Affinity Propagation (P-AP) is introduced. It uses a predefined criterion and the proximity hints given by users to modify the similarity matrix. Because this strategy draws on the analyst's knowledge, it makes the clustering process more flexible for specific problems.

Second, a Large Sample Clustering Algorithm (LSCA) is proposed to address two problems: the difficulty of obtaining a prescribed number of clusters with the above algorithm, and the clustering of large sample data sets. LSCA can be regarded as a combination of Fuzzy C-Means (FCM) and Affinity Propagation (AP), and it has two stages. In the first stage, a distributed computing strategy is constructed by dividing the original data set into several subsets, and the exemplars (centroids) of each subset are found with Affinity Propagation. In the second stage, all the exemplars found in the previous stage are treated as the elements of a single set, and Fuzzy C-Means is applied to them to produce clusters whose number is predefined by the analyst. Each sample that belonged to an exemplar in the first stage is then placed in the same cluster as its exemplar. In this stage, fuzzy entropy is introduced as an auxiliary tool for measuring the reliability of the fuzzy partition.

Experiments were carried out to investigate the effectiveness of the proposed algorithms. For Proximity Affinity Propagation, a small artificial data set, the Iris data set, and the Yale face data set were each clustered with P-AP. For the Large Sample Clustering Algorithm, experiments were run on the Iris data set and the Shuttle data set. The experimental results indicate that both algorithms are easy to apply and adaptable, and that they achieve good clustering results.
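
The second stage of LSCA described above can be illustrated with a minimal sketch. This is not the thesis's implementation; it rests on several assumptions. The first stage is assumed to have already run, so the exemplars and the sample-to-exemplar assignment (the hypothetical `exemplar_of` list) are taken as given. Fuzzy C-Means is written in its textbook form with fuzzifier m = 2 and a deterministic farthest-point initialization chosen here for reproducibility. The abstract does not define its fuzzy entropy, so Bezdek's partition entropy is used as a stand-in reliability measure. All function names are illustrative.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def init_centers(points, c):
    """Deterministic farthest-point start (an assumption, not from the thesis)."""
    centers = [points[0]]
    while len(centers) < c:
        centers.append(max(points, key=lambda p: min(sq_dist(p, v) for v in centers)))
    return centers

def memberships(points, centers, m):
    """FCM membership update: u_ik is proportional to d_ik^(-2/(m-1))."""
    U = []
    for p in points:
        # Clamp the squared distance so a point sitting on a center is safe.
        inv = [max(sq_dist(p, v), 1e-12) ** (-1.0 / (m - 1.0)) for v in centers]
        total = sum(inv)
        U.append([x / total for x in inv])
    return U

def fuzzy_c_means(points, c, m=2.0, iters=100):
    """Textbook FCM; returns (centers, U) with U[i][k] the membership of point i in cluster k."""
    centers = init_centers(points, c)
    dim = len(points[0])
    for _ in range(iters):
        U = memberships(points, centers, m)
        centers = []
        for k in range(c):
            w = [row[k] ** m for row in U]  # weights u_ik^m
            total = sum(w)
            centers.append(tuple(
                sum(wi * p[j] for wi, p in zip(w, points)) / total
                for j in range(dim)))
    return centers, memberships(points, centers, m)

def partition_entropy(U):
    """Bezdek's partition entropy (stand-in for the thesis's fuzzy entropy).
    0 means a crisp partition; ln(c) means a maximally fuzzy one."""
    return -sum(u * math.log(u) for row in U for u in row if u > 0) / len(U)

def propagate_labels(U, exemplar_of):
    """Give every sample the cluster in which its exemplar has the highest membership."""
    exemplar_label = [max(range(len(row)), key=row.__getitem__) for row in U]
    return [exemplar_label[e] for e in exemplar_of]

# Hypothetical stage-one output: six exemplars and eight samples mapped to them.
exemplars = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
             (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
exemplar_of = [0, 0, 1, 2, 3, 4, 5, 5]

centers, U = fuzzy_c_means(exemplars, c=2)
labels = propagate_labels(U, exemplar_of)
print(labels, partition_entropy(U))
```

Running FCM on the exemplars rather than on every sample is what keeps the second stage cheap for large data sets: its cost depends on the number of exemplars, and the final labelling of the samples is a simple lookup.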

Keywords/Search Tags:

Fuzzy clustering, Proximity hints, Fuzzy C-Means (FCM), Affinity Propagation (AP), Large Sample Clustering

Related items

1	Hierarchical Clustering Algorithm For Mobile Wireless Sensor Networks Based On Affinity Propagation And Fuzzy C-means
2	The Application Of Fuzzy C-means Clustering In The Stock Investment
3	Research Of Key Techniques In Fuzzy Clustering Based On Objective Function
4	Research And Application Of New Methods In Symbolic Clustering
5	Fuzzy C-means And K-means Clustering Algorithm And Its Parallel
6	Research Of New Fuzzy Clustering Algorithms Based On Objective Function And Its Applications
7	Applications And Research On Possibilistic Fuzzy Kernel Clustering Algorithm Based On Sample-feature Weighted
8	Improved Fuzzy C Means Clustering Algorithm And Its Application
9	The Approach To Mining Time-lagged Coregulated Gene And Research On Fuzzy Clustering Algorithm
10	Study And Design On Texture Image Segmentation Algorithm Based On Clustering Analysis