The Research Of Semi-Supervised Adaptive Clustering Based On Active Data Selection

Posted on: 2013-07-21
Degree: Master
Type: Thesis
Country: China
Candidate: P Wen
Full Text: PDF
GTID: 2248330371487130
Subject: Computer software and theory
Abstract/Summary:
Semi-supervised clustering, which aims to improve clustering results significantly by using a limited amount of supervision, has become a research focus in data mining and machine learning in recent years. However, the accuracy of existing semi-supervised clustering algorithms is low on multi-density and imbalanced datasets, and on datasets with very little labeled data. Drawing on active learning, this thesis proposes two algorithms: a Label Data Selection Algorithm based on Active Data Selection, and an Adaptive Clustering Algorithm based on a Small Amount of Label Data.

The first algorithm combines minimum spanning tree (MST) clustering with active learning to select information-rich points, which are then labeled by experts. Its two most notable characteristics are: (a) it combines MST clustering and active learning to choose the points to label; (b) it selects the point of maximum local density as the label data.

The second algorithm propagates labels with a KNN-like technique, guided by the average density of the cluster that contains the label data. It has three features: (a) it propagates labels using a KNN-like idea; (b) it expands from the points of maximum density first; (c) it adapts to multi-density and imbalanced datasets because the expansion threshold is updated after each expansion.

Evaluated on several standard UCI datasets and some synthetic ones, the proposed method achieves higher accuracy and stable performance on multi-density and imbalanced datasets.
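The abstract gives no pseudocode, so the following is only a minimal sketch of how the two ideas might fit together, not the thesis's actual algorithms. Everything concrete in it is an assumption made for illustration: the function names, the definition of local density as the inverse mean distance to the `k` nearest neighbours, cutting the longest MST edges to form clusters, and the `slack` factor that scales the expansion threshold.

```python
import heapq
import math

def pairwise(points):
    """Full Euclidean distance matrix (fine for small illustrative data)."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]

def mst_edges(dist):
    """Prim's algorithm; returns the n-1 MST edges as (weight, u, v)."""
    n = len(dist)
    in_tree = [False] * n
    best = [(math.inf, -1)] * n          # (cheapest link weight, tree endpoint)
    best[0] = (0.0, -1)
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i][0])
        in_tree[u] = True
        if best[u][1] >= 0:
            edges.append((best[u][0], best[u][1], u))
        for v in range(n):
            if not in_tree[v] and dist[u][v] < best[v][0]:
                best[v] = (dist[u][v], u)
    return edges

def mst_clusters(dist, n_clusters):
    """Assumed MST clustering: drop the n_clusters-1 longest MST edges."""
    edges = sorted(mst_edges(dist))
    keep = edges[:len(edges) - (n_clusters - 1)]
    parent = list(range(len(dist)))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for _, u, v in keep:
        parent[find(u)] = find(v)
    return [find(i) for i in range(len(dist))]

def local_density(dist, i, k):
    """Assumed density: inverse mean distance to the k nearest neighbours."""
    nearest = sorted(d for j, d in enumerate(dist[i]) if j != i)[:k]
    return 1.0 / (sum(nearest) / len(nearest) + 1e-12)

def select_queries(points, n_clusters, k=3):
    """Pick the densest point of each MST cluster as the point to ask an expert about."""
    dist = pairwise(points)
    best = {}
    for i, c in enumerate(mst_clusters(dist, n_clusters)):
        rho = local_density(dist, i, k)
        if c not in best or rho > best[c][0]:
            best[c] = (rho, i)
    return sorted(i for _, i in best.values())

def propagate(points, seeds, k=3, slack=1.5):
    """KNN-like propagation from seeds {index: label}, densest points first.

    Each label keeps its own threshold, recomputed after every expansion
    from the mean length of the links that label has accepted so far.
    """
    dist = pairwise(points)
    labels = [None] * len(points)
    links, limit, heap = {}, {}, []
    for i, lab in seeds.items():
        labels[i] = lab
        links[lab] = []
        limit[lab] = slack / local_density(dist, i, k)
        heapq.heappush(heap, (-local_density(dist, i, k), i))
    while heap:
        _, p = heapq.heappop(heap)       # densest labeled point not yet expanded
        lab = labels[p]
        near = sorted((dist[p][q], q) for q in range(len(points)) if labels[q] is None)[:k]
        for d, q in near:
            if d <= limit[lab]:
                labels[q] = lab
                links[lab].append(d)
                heapq.heappush(heap, (-local_density(dist, q, k), q))
        if links[lab]:                   # adapt the threshold to this cluster's scale
            limit[lab] = slack * sum(links[lab]) / len(links[lab])
    return labels

# Hypothetical data: one tight cluster and one loose cluster (different densities).
points = [(0, 0), (0.1, 0), (0, 0.1), (0.1, 0.1), (0.05, 0.05),
          (10, 10), (11, 10), (10, 11), (11, 11), (10.5, 10.5)]
queries = select_queries(points, n_clusters=2)
labels = propagate(points, {queries[0]: "tight", queries[1]: "loose"})
```

In this sketch the per-label threshold is what lets a tight cluster and a loose cluster both grow without one swallowing the other; points that never fall within any threshold simply stay unlabeled (`None`). A real implementation would likely replace the quadratic distance matrix with a spatial index.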
Keywords/Search Tags: data mining, semi-supervised clustering, active learning, label data, data selection, MST