Study Of Semi-supervised Fuzzy Clustering Algorithm Based On Pairwise Constraints

Posted on: 2015-06-14
Degree: Master
Type: Thesis
Country: China
Candidate: Y F Zhou
Full Text: PDF
GTID: 2298330422970016
Subject: Master of Engineering
Abstract/Summary:
Semi-supervised clustering is an important method that improves clustering performance by introducing partial supervision information; it has been widely used in areas such as biology, medicine, and pattern recognition. Many metric functions are available to semi-supervised clustering algorithms, for example the Euclidean metric and kernel metrics. The Euclidean metric is the most commonly used, but it has three disadvantages: (1) it works well only on spherical data; (2) its clustering results are poor on strongly correlated data; (3) on high-dimensional data the computational cost can be very high, and the curse of dimensionality may arise. To address these problems, we propose F-SCAPC, a semi-supervised fuzzy clustering algorithm based on metric learning and pairwise constraints. The main contents are as follows:

First, the Euclidean metric works well on spherical data but is not suitable for elliptical data, and its clustering results are poor on strongly correlated data. Since the Mahalanobis metric is known to handle strongly correlated data well, we introduce the Mahalanobis distance into the objective function.

Second, on high-dimensional data the Euclidean metric may suffer from the curse of dimensionality. A kernel function addresses this problem by mapping the data into a high-dimensional feature space, so we introduce a kernel metric into the algorithm and run experiments to verify this conclusion.

This thesis mainly studies semi-supervised fuzzy clustering that introduces the Mahalanobis distance and the Gaussian kernel on the basis of metric learning. We obtain a new semi-supervised fuzzy clustering objective function and, by solving the resulting optimization problem, derive the F-SCAPC algorithm.
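The contrast between the three metrics named above can be illustrated with a minimal sketch. It is not the F-SCAPC objective function, which the abstract does not state; it only shows the textbook squared Euclidean distance, the squared Mahalanobis distance, and the distance induced by a Gaussian kernel. The function names, the covariance matrix `cov`, and the bandwidth `sigma` are assumptions chosen for illustration.

```python
import numpy as np

def euclidean_sq(x, y):
    """Squared Euclidean distance."""
    d = x - y
    return float(d @ d)

def mahalanobis_sq(x, y, cov):
    """Squared Mahalanobis distance (x - y)^T cov^{-1} (x - y)."""
    d = x - y
    return float(d @ np.linalg.solve(cov, d))

def gaussian_kernel(x, y, sigma):
    """Gaussian kernel K(x, y) = exp(-||x - y||^2 / (2 sigma^2))."""
    return float(np.exp(-euclidean_sq(x, y) / (2.0 * sigma ** 2)))

def kernel_distance_sq(x, y, sigma):
    """Squared feature-space distance K(x,x) - 2K(x,y) + K(y,y).

    For the Gaussian kernel K(x,x) = K(y,y) = 1, so this is 2 - 2K(x,y).
    """
    return 2.0 * (1.0 - gaussian_kernel(x, y, sigma))

# Illustrative (assumed) covariance with strong positive correlation.
cov = np.array([[1.0, 0.9],
                [0.9, 1.0]])
origin = np.array([0.0, 0.0])
along = np.array([1.0, 1.0])     # lies along the correlation direction
against = np.array([1.0, -1.0])  # lies against the correlation direction

print(euclidean_sq(origin, along), euclidean_sq(origin, against))
print(mahalanobis_sq(origin, along, cov), mahalanobis_sq(origin, against, cov))
print(kernel_distance_sq(origin, along, sigma=1.0))
```

Both test points are equally far from the origin in the Euclidean sense, but the Mahalanobis distance treats the point that lies against the correlation direction as much farther away, which is the behavior that matters for elliptical, correlated clusters.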
We evaluate the proposed F-SCAPC algorithm experimentally on the selected standard data set and the artificial data set, and compare its performance with the FCM, CA, AFFC, KCA, KFCM-F, and SCAPC algorithms. The results show that F-SCAPC is effective in terms of convergence speed and clustering accuracy.
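For orientation, the FCM baseline named in the comparison can be sketched as follows. This is standard fuzzy c-means with the Euclidean metric, not F-SCAPC, whose update rules the abstract does not give. The helper `count_violations`, the parameter defaults, and the toy data are hypothetical; the helper shows one simple way to check a hard labelling against must-link and cannot-link pairs.

```python
import numpy as np

def fcm(X, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means; returns memberships U (n x c) and centers."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)  # each row sums to 1
    for _ in range(n_iter):
        W = U ** m
        # Centers are membership-weighted means of the data.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Squared Euclidean distances from every point to every center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)  # avoid division by zero
        # u_ik is proportional to d_ik^(-2/(m-1)), normalized over clusters.
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, centers

def count_violations(labels, must_link, cannot_link):
    """Number of pairwise constraints broken by a hard labelling."""
    broken = sum(labels[i] != labels[j] for i, j in must_link)
    broken += sum(labels[i] == labels[j] for i, j in cannot_link)
    return int(broken)

# Toy data (assumed): two well-separated groups of three points each.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
U, centers = fcm(X, c=2)
labels = U.argmax(axis=1)
print(labels, count_violations(labels, must_link=[(0, 1)], cannot_link=[(0, 3)]))
```

Plain FCM ignores the constraints during optimization and can only be checked against them afterwards; a semi-supervised method such as the one described here instead uses the pairwise constraints while clustering.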
Keywords/Search Tags: Semi-supervised Clustering, Pairwise Constraints, Mahalanobis Metric, Gaussian Kernel Function