A Research On Deep Clustering Model Based On Unsupervised Domain Adaptation

Posted on: 2022-11-17
Degree: Master
Type: Thesis
Country: China
Candidate: Z M Yang
Full Text: PDF
GTID: 2518306764476734
Subject: Computer Software and Application of Computer
Abstract/Summary:
Deep clustering methods apply deep neural networks to the clustering task and combine the clustering objective with the feature extraction process, which greatly improves clustering performance, especially on high-dimensional data. However, traditional clustering algorithms rely only on the dataset being clustered and cannot use external information. Unsupervised domain adaptation offers a way past this bottleneck of existing deep clustering methods: by transferring knowledge from a source domain, the performance of deep clustering on an unlabeled target domain can be improved.

Most current unsupervised domain adaptation algorithms have the following two shortcomings: (1) Most unsupervised domain adaptation methods perform target-domain clustering while generating samples. Sample generation often incurs a high computational cost, while the details of the generated data are often unimportant for clustering. (2) Most traditional methods assume that the source domain contains enough labeled data for efficient transfer. This assumption is not always satisfied in practice. First, there is no guarantee that the source domain has enough labeled data. Second, noisy data in the source domain may cause negative transfer.

To solve problem (1), the thesis proposes an unsupervised domain adaptation algorithm that operates at the feature level. The algorithm learns, in an unsupervised manner, the knowledge that needs to be transferred from the source domain to the target domain at the feature level, maps source-domain image features to target-domain image features, and preserves label information. The generated labeled features can then be used to train a classifier on target-domain features.

To address problem (2), the thesis proposes a novel transfer learning framework that improves deep clustering through fully unsupervised domain adaptation. Specifically, to select reliable samples for transfer from the source domain, the thesis designs a new adaptive threshold algorithm that selects low-entropy samples. Experiments show that the proposed method effectively improves deep clustering performance. Furthermore, compared with state-of-the-art methods that use labeled data in the source domain, the proposed method achieves competitive results without using any labeled source-domain data.

While solving problem (2), the thesis finds that a suitable source domain cannot be found for every dataset, so it proposes a self-transferring model that relies only on the data being clustered. In this model, for a dataset for which it is difficult to find a suitable source domain, the low-entropy pseudo-labeled samples serve as the source domain, and the model learns the mapping from high-entropy samples to low-entropy samples. The thesis applies the model to experimental datasets and to real-world natural data, and it achieves good clustering results on all of them.
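
The low-entropy sample selection mentioned in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the thesis's adaptive threshold rule, so the sketch substitutes a simple stand-in: the threshold is taken as a quantile of the entropies in the current batch. The function names `prediction_entropy` and `select_low_entropy` and the parameter `keep_fraction` are hypothetical and are not taken from the thesis.

```python
import numpy as np


def prediction_entropy(probs, eps=1e-12):
    """Shannon entropy (in nats) of each row of a soft cluster-assignment matrix."""
    probs = np.asarray(probs, dtype=float)
    return -np.sum(probs * np.log(probs + eps), axis=1)


def select_low_entropy(probs, keep_fraction=0.5):
    """Return (indices, threshold) for samples with entropy at or below the threshold.

    ASSUMPTION: the threshold is the `keep_fraction` quantile of the batch
    entropies. This is an illustrative stand-in, not the thesis's algorithm.
    """
    entropy = prediction_entropy(probs)
    threshold = np.quantile(entropy, keep_fraction)
    selected = np.flatnonzero(entropy <= threshold)
    return selected, threshold


# Toy soft assignments for four samples over three clusters.
probs = np.array([
    [0.98, 0.01, 0.01],  # confident assignment, low entropy
    [0.34, 0.33, 0.33],  # near-uniform assignment, high entropy
    [0.90, 0.05, 0.05],  # confident assignment, low entropy
    [0.50, 0.30, 0.20],  # uncertain assignment, high entropy
])
selected, threshold = select_low_entropy(probs, keep_fraction=0.5)
print(selected)  # indices of the two most confident samples
```

Because the threshold is recomputed from each batch, it tightens as the model's predictions become more confident, which is one plausible reading of "adaptive"; other rules (for example, a threshold scheduled over training epochs) would fit the abstract's description equally well.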
Keywords/Search Tags: Deep Clustering, Unsupervised Domain Adaptation, GAN, Instances Selection, Transfer Learning