Research on a Semi-supervised Spectral Clustering Algorithm Based on Active Learning

Posted on: 2016-04-11
Degree: Master
Type: Thesis
Country: China
Candidate: B Dong
GTID: 2308330479486049
Subject: Computer application technology
Abstract/Summary:
In recent years, data mining has attracted wide public attention and growing research interest. Clustering, one of the most commonly used data-mining techniques, is appearing more and more often in practical applications. However, traditional clustering algorithms tend to be inefficient and to become trapped in local optima when the sample space is not convex, which is what motivated spectral clustering. Spectral clustering is based on the spectral theory of graph partitioning. It can cluster samples whose distribution is non-convex, and it avoids local optima and converges to the global optimum, but it does not partition boundary points clearly.
To obtain better clustering results, this thesis improves traditional spectral clustering. Semi-supervised learning uses labeled data as supervision while also exploiting unlabeled data, and its performance depends on the quality of the supervision information. This thesis combines semi-supervised learning with spectral clustering by adding pairwise constraint information (Must-link and Cannot-link) to spectral clustering, so that the clustering process is guided according to the user's needs. On this basis, an active learning strategy is used: a selection function is computed to pick specific boundary points in the data samples, and pairwise constraints are added for those points, which improves clustering accuracy. Finally, building on these improvements to traditional spectral clustering, the thesis proposes a semi-supervised spectral clustering algorithm based on active learning (SC-ALS) and describes the algorithm's procedure in detail.
Clustering experiments are carried out on artificial data and on UCI benchmark data using SC-ALS, traditional spectral clustering, and the K-means algorithm. Comparing the three by the Accuracy evaluation criterion shows that, once the number of constraints reaches a certain point, semi-supervised spectral clustering based on active learning is clearly superior to the other two algorithms in clustering accuracy and achieves a good clustering result.
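The idea of guiding spectral clustering with Must-link and Cannot-link constraints, and then asking about a boundary point, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's SC-ALS procedure: the function names (rbf_affinity, apply_constraints, spectral_bipartition, pick_query_point) are hypothetical, the constraints are encoded by overwriting affinity entries with 1 or 0 (one common encoding, not confirmed by the abstract), the example handles only two clusters, and the boundary point is chosen as the sample whose Fiedler-vector entry is closest to zero, which stands in for the unspecified selection function.

```python
import numpy as np

def rbf_affinity(X, sigma=1.0):
    """Gaussian (RBF) affinity matrix with a zero diagonal."""
    diff = X[:, None, :] - X[None, :, :]
    W = np.exp(-np.sum(diff ** 2, axis=-1) / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    return W

def apply_constraints(W, must_link=(), cannot_link=()):
    """Assumed encoding: Must-link -> affinity 1, Cannot-link -> affinity 0."""
    W = W.copy()
    for i, j in must_link:
        W[i, j] = W[j, i] = 1.0
    for i, j in cannot_link:
        W[i, j] = W[j, i] = 0.0
    return W

def spectral_bipartition(W):
    """Two-way normalized spectral cut using the sign of the Fiedler vector."""
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}
    L_sym = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L_sym)  # eigenvalues in ascending order
    # Second-smallest eigenvector, rescaled to the generalized eigenvector
    fiedler = vecs[:, 1] * d_inv_sqrt
    labels = (fiedler > 0).astype(int)
    return labels, fiedler

def pick_query_point(fiedler):
    """Assumed heuristic: the most ambiguous sample lies nearest the cut."""
    return int(np.argmin(np.abs(fiedler)))
```

In an active-learning loop, the index returned by pick_query_point would be shown to the user together with a second sample, the user's Must-link or Cannot-link answer would be passed to apply_constraints, and spectral_bipartition would be run again on the updated affinity matrix.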
Keywords/Search Tags:spectral clustering, semi-supervised, active learning, pair constraint, SC-ALS