
Studies On Semi-Supervised Clustering Algorithms Based On Pairwise Constraints

Posted on: 2019-01-30  Degree: Master  Type: Thesis
Country: China  Candidate: C M Li  Full Text: PDF
GTID: 2428330566483239  Subject: Mathematics
Abstract/Summary:
Semi-supervised learning is an important learning paradigm in machine learning and has attracted widespread attention from researchers. Within it, semi-supervised clustering based on pairwise constraints is an active research topic, and the main line of work extends traditional clustering methods to the semi-supervised setting by means of pairwise constraints.

Unlike hard clustering methods, fuzzy clustering is a soft clustering technique that outputs the degree of membership of each sample in each cluster. This is an advantage when a small amount of pairwise constraint information has to be introduced into the clustering process: because the membership quantitatively describes how strongly each sample belongs to each cluster, the pairwise constraint information can be expressed through the membership as well. Although there are many important research results, many of the improved algorithms have a complex structure, are hard to interpret, and are difficult to apply in practice. Moreover, not all of them effectively solve the problem of constraint violation. Semi-supervised clustering based on pairwise constraints therefore remains an important research topic.

The objective function of the traditional fuzzy clustering algorithm contains no term that expresses pairwise constraint information. The aim of this thesis is therefore to design an algorithm that has a simple structure, uses pairwise constraint information effectively to guide the clustering process, and at the same time solves key problems such as constraint violation. The work builds on the traditional maximum entropy clustering (MEC) algorithm. The main research work of this thesis covers the following two aspects:

1. Because the MEC algorithm cannot use pairwise constraint information to improve clustering performance, this thesis introduces the concept of sample cross entropy, uses it to construct a penalty term for the pairwise constraint information, and combines this term with the objective function to design the clustering algorithm. The result is the sample cross entropy semi-supervised clustering (SCE-sSC) algorithm based on pairwise constraints. Experimental results on multiple datasets show that the SCE-sSC algorithm can use pairwise constraint information to guide clustering, and that clustering performance generally improves as the number of pairwise constraints increases.

2. We analyze why the SCE-sSC algorithm is prone to membership oscillation and is sensitive to initial values. To solve the constraint violation problem effectively, this thesis redesigns the sample cross entropy: we introduce the concept of memory cross entropy, incorporate it into the objective function, and design the algorithm flow accordingly. The result is the memory cross entropy semi-supervised clustering (MCE-sSC) algorithm based on pairwise constraints. Dataset experiments and an initial-value sensitivity experiment show that the MCE-sSC algorithm uses the pairwise constraints effectively to improve clustering performance, is not sensitive to the initial values, and runs stably.
Keywords/Search Tags: semi-supervised clustering, pairwise constraints, maximum entropy clustering algorithms, cross entropy
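
The abstract does not give the formulas of SCE-sSC or MCE-sSC, so they are not reproduced here. As background only, the following is a minimal sketch of the unsupervised MEC baseline in its commonly cited form, assuming squared Euclidean distances and an entropy weight `gamma`, together with a simple counter for violated must-link and cannot-link pairs. The function names `mec` and `count_violations`, the parameter names, and the default values are all illustrative assumptions, not taken from the thesis.

```python
import numpy as np


def mec(X, n_clusters, gamma=1.0, n_iter=100, tol=1e-6, seed=0):
    """Sketch of maximum entropy clustering (assumed textbook form).

    Alternates a softmax membership update over negative squared
    distances with a membership-weighted mean update of the centers.
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialise the centers with distinct, randomly chosen samples.
    centers = X[rng.choice(len(X), size=n_clusters, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance of every sample to every center.
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        # Membership update: u_ik proportional to exp(-d_ik / gamma).
        logits = -d / gamma
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        U = np.exp(logits)
        U /= U.sum(axis=1, keepdims=True)
        # Center update: membership-weighted mean of the samples.
        new_centers = (U.T @ X) / U.sum(axis=0)[:, None]
        shift = np.abs(new_centers - centers).max()
        centers = new_centers
        if shift < tol:
            break
    return U, centers


def count_violations(labels, must_link, cannot_link):
    """Count constraint pairs that a hard labelling fails to satisfy."""
    violated = sum(labels[i] != labels[j] for i, j in must_link)
    violated += sum(labels[i] == labels[j] for i, j in cannot_link)
    return int(violated)


if __name__ == "__main__":
    X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]])
    U, centers = mec(X, n_clusters=2)
    labels = U.argmax(axis=1)  # harden the soft memberships
    print(labels, count_violations(labels, [(0, 1)], [(0, 3)]))
```

Because this baseline never sees the constraint pairs, any violations it produces can only be counted after the fact. That is the gap the abstract describes the constraint-based penalty terms as addressing.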