Semi Supervised Clustering Algorithm And Its Application And Research

Posted on:2009-09-24

Degree:Master

Type:Thesis

Country:China

Candidate:X Q Jiang

Full Text:PDF

GTID:2178360272957415

Subject:Computer application technology

Abstract/Summary:

Clustering is an important data mining technique. Given a metric (a similarity measure, a dissimilarity measure, or a distance), clustering divides a set of individuals into subsets so that, under a chosen criterion, individuals in the same subset are more similar to one another than to individuals in different subsets; its purpose is to extract information from the data set. Semi-supervised clustering algorithms, which are widely used, exploit a small amount of supervisory information, such as labeled data, to improve clustering performance.

First, the thesis reviews the general development of clustering and several clustering techniques, in particular metric learning, commonly used clustering methods, and evaluation criteria, which lays the theoretical and experimental groundwork for the research in the following chapters. It then describes the existing semi-supervised fuzzy C-means clustering algorithm in detail and verifies it experimentally.

Second, to test whether this kind of semi-supervised learning can be applied to other clustering algorithms, the thesis incorporates it into the maximum entropy clustering algorithm, yielding a semi-supervised maximum entropy clustering algorithm. Experiments show that semi-supervised learning improves maximum entropy clustering and gives better results in practice.

Finally, for heap-like data sets, or data sets in which the number of samples differs greatly from class to class, the optimal solution of the fuzzy C-means (FCM) algorithm and of the semi-supervised fuzzy C-means algorithm may not be the correct partition of the data, because both algorithms tend to divide a data set into clusters of equal size. To address this problem, the thesis uses the distribution density of each data point as a weight and combines it with the semi-supervised learning introduced earlier, proposing a semi-supervised, dot density weighted fuzzy C-means algorithm. Experiments show that the algorithm improves clustering accuracy.
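The abstract does not give the thesis's formulas, so the following is only a minimal sketch of the general idea of combining point-density weights with semi-supervision in fuzzy C-means. Everything specific here is an assumption for illustration: the function names `point_density_weights` and `weighted_ss_fcm`, the k-nearest-neighbour density estimate, the fuzzifier `m = 2`, and the choice to handle labeled points by fixing their memberships to the known class. The thesis may define the weights and the supervised term differently.

```python
import numpy as np

def point_density_weights(X, k=5):
    """Assumed density estimate: inverse mean distance to the k nearest neighbours."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    knn = np.sort(d, axis=1)[:, 1:k + 1]        # column 0 is the self-distance
    w = 1.0 / (knn.mean(axis=1) + 1e-12)        # denser neighbourhood -> larger weight
    return w / w.sum()                          # normalise so the weights sum to 1

def weighted_ss_fcm(X, c, labels=None, m=2.0, k=5, n_iter=200, tol=1e-6, seed=0):
    """Density-weighted fuzzy C-means; labels >= 0 mark labeled points, -1 unlabeled."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    w = point_density_weights(X, k)

    # Random initial memberships, each row summing to 1.
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)

    labeled = None
    if labels is not None:
        labels = np.asarray(labels)
        labeled = labels >= 0
        U[labeled] = np.eye(c)[labels[labeled]]  # assumption: clamp labeled points

    for _ in range(n_iter):
        # Centre update: memberships raised to m, scaled by the density weights.
        Um = (U ** m) * w[:, None]
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]

        # Standard FCM membership update from distances to the centres.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        if labeled is not None:
            U_new[labeled] = np.eye(c)[labels[labeled]]

        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U
```

Calling `weighted_ss_fcm(X, 2, labels=labels)` with `labels` set to `-1` for unlabeled rows returns the cluster centres and the membership matrix; `U.argmax(axis=1)` then gives a hard assignment. Clamping labeled memberships is the simplest way to inject supervision and was chosen here for brevity, not because the source confirms it.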

Keywords/Search Tags:

Data Mining, Clustering Analysis, Fuzzy C-means Clustering, Maximum Entropy Clustering, Dot Density Weighted, Semi-supervised learning, Labeled Data, Metric Learning

Related items

1	Research On Risk Degree-Based Safe Semi-Supervised Fuzzy Clustering Algorithm
2	Semi-Supervised Clustering Analysis And Its Extended Research
3	Research And Improvement For Semi-supervised K-means Clustering Algorithm In Data Mining
4	Semi-supervised K-means Clustering Algorithm In Data Mining
5	Research On Clustering Algorithms Based On Metric Learning For Complex Data
6	Research On Semi-supervised Classification Algorithm Based On Clustering Ensemble
7	Study And Analysis On Clustering Algorithm In Data Mining
8	An Improved Semi Supervised Clustering Of Given Density And Its Application In Lithology Identification
9	Distributed Clustering And Evolutionary Clustering Algorithm Based On Semi-supervised Learning
10	A Study On Supervised (Transfer Learning) Clustering For Large Scale Data