
Research On Distance Margin-based Deep Discriminative Clustering Methods

Posted on: 2019-05-09
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Shao
Full Text: PDF
GTID: 2428330611993386
Subject: Engineering
Abstract/Summary:
As a widely used unsupervised learning task, clustering has long been a hot research topic. Traditional methods have achieved good results, but they are insufficient when dealing with large-scale high-dimensional data. Inspired by the significant success of deep learning on classification problems, clustering algorithms based on deep neural networks have been continuously proposed and have achieved good results. However, current deep clustering algorithms either cannot achieve a large inter-cluster distance margin or cannot carry out end-to-end training and inference, and thus fail to fully realize the clustering performance that deep clustering should offer. To address these problems, this paper proposes two deep clustering algorithms based on a distance margin constraint: the Margin-based Deep Discriminative Clustering Network and the Cosine Margin-based Deep Discriminative Clustering Network. Experimental results show that both methods achieve good results on large-scale high-dimensional clustering problems, and both are end-to-end.

To address the difficulty that existing deep clustering algorithms have in learning feature representations with large inter-cluster differences, this paper first proposes the Margin-based Deep Discriminative Clustering Network (M-DDCN). The method improves inter-cluster separability by increasing the Euclidean distance between the feature representations of different clusters. Specifically, if two samples belong to different clusters, the Euclidean distance between their features is pushed above a distance-constraint factor. Furthermore, because the intermediate clustering results are unreliable, the intermediate probability is used as a weighting factor so that the loss focuses on data with high confidence.

To address the problem that the above algorithm relies on a data-mining strategy and increases the amount of computation, this paper further proposes the Cosine Margin-based Deep Discriminative Clustering Network (CM-DDCN). The method increases intra-cluster compactness by constraining the cosine of the angle between same-cluster data and their cluster's parameter vector to exceed a certain threshold. Specifically, if the intermediate prediction assigns a sample to a certain cluster, the cosine between the sample's feature and that cluster's parameter vector is pushed above the threshold. The algorithm thus makes the angle between each sample's feature representation and its parameter vector smaller, yielding more compact feature representations within each cluster. As before, considering the unreliability of the intermediate clustering results, the intermediate probability is used as a weighting factor, forcing the algorithm to pay more attention to data with high confidence. Because this method requires no pairwise computation over the data, it reduces both the computational cost and the dependence on a data-mining strategy.

In this paper, the two proposed algorithms are compared with existing clustering algorithms on multiple datasets and evaluated with metrics such as ACC and NMI. Experiments show that both algorithms achieve good results on large-scale high-dimensional data. In addition, reducing the learned feature representations to a two-dimensional space with the t-SNE visualization algorithm shows that the distance between clusters is clearly visible, indicating that the learned features are highly discriminative.
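The Euclidean margin idea behind M-DDCN can be illustrated with a minimal NumPy sketch. This is not the thesis's actual loss; the function name, the squared-hinge form, and the use of the maximum intermediate probability as the confidence weight are assumptions made for illustration.

```python
import numpy as np

def euclidean_margin_loss(features, probs, margin=2.0):
    """Hinge-style penalty pushing features of samples assigned to
    different clusters at least `margin` apart in Euclidean distance,
    weighted by the confidence of the intermediate assignments.

    features: (n, d) array of embedding vectors.
    probs:    (n, k) array of soft cluster assignments (rows sum to 1).
    """
    n = features.shape[0]
    labels = probs.argmax(axis=1)      # intermediate hard assignments
    conf = probs.max(axis=1)           # confidence of each assignment
    loss, pairs = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            if labels[i] != labels[j]:             # cross-cluster pair
                d = np.linalg.norm(features[i] - features[j])
                w = conf[i] * conf[j]              # down-weight unreliable pairs
                loss += w * max(0.0, margin - d) ** 2
                pairs += 1
    return loss / max(pairs, 1)
```

The double loop over pairs is exactly the computational burden the abstract attributes to this approach: the cost grows quadratically with the batch size, which motivates the pairwise-free CM-DDCN formulation.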
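The cosine margin used by CM-DDCN replaces pairwise terms with one per-sample term against the cluster's parameter vector. Again a hedged NumPy sketch, not the actual CM-DDCN loss: the hinge form, the threshold default, and the confidence weighting are illustrative assumptions.

```python
import numpy as np

def cosine_margin_loss(features, weights, probs, threshold=0.8):
    """Per-sample penalty applied when the cosine between a feature
    and its assigned cluster's parameter vector falls below
    `threshold`, weighted by assignment confidence. No pairwise
    computation over the data is needed.

    features: (n, d) embeddings; weights: (k, d) one parameter vector
    per cluster; probs: (n, k) soft assignments.
    """
    labels = probs.argmax(axis=1)
    conf = probs.max(axis=1)
    # L2-normalize so dot products are cosines
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    w = weights / np.linalg.norm(weights, axis=1, keepdims=True)
    cos = np.sum(f * w[labels], axis=1)        # cosine to own cluster
    hinge = np.maximum(0.0, threshold - cos)   # penalize only below threshold
    return float(np.mean(conf * hinge))
```

Because each sample contributes a single term, the cost is linear in the batch size, which is the efficiency advantage over the pairwise Euclidean formulation described above.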
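The ACC metric mentioned in the evaluation deserves a note: unlike supervised accuracy, cluster ids have no fixed correspondence to class labels, so ACC is computed under the best one-to-one matching between the two. A small sketch, assuming equal numbers of clusters and classes and using brute-force permutation search (practical implementations use the Hungarian algorithm instead):

```python
import numpy as np
from itertools import permutations

def clustering_accuracy(y_true, y_pred):
    """Unsupervised clustering accuracy (ACC): the accuracy under the
    best one-to-one mapping from predicted cluster ids to true labels.
    Brute force over permutations; fine for a small number of clusters.
    """
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    clusters = np.unique(y_pred)
    classes = np.unique(y_true)
    best = 0.0
    for perm in permutations(classes):
        mapping = dict(zip(clusters, perm))    # try this cluster->class map
        acc = np.mean([mapping[p] == t for p, t in zip(y_pred, y_true)])
        best = max(best, acc)
    return float(best)
```

For example, predictions `[1, 1, 0, 0]` against ground truth `[0, 0, 1, 1]` score an ACC of 1.0, since swapping the cluster ids matches the labels exactly.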
Keywords/Search Tags:deep clustering, Euclidean distance metric, cosine distance metric, margin distance constraint