
Research On Deep Unsupervised Visual Domain Adaptation

Posted on: 2023-09-26    Degree: Doctor    Type: Dissertation
Country: China    Candidate: W X Deng    Full Text: PDF
GTID: 1528307169477214    Subject: Information and Communication Engineering

Abstract/Summary:
The emergence of deep neural networks has greatly promoted a new round of progress in the field of computer vision. Undoubtedly, these great achievements depend on large-scale labeled training data sets. However, labeling samples is labor-intensive and time-consuming. Moreover, there are many scenarios, e.g., medical or military fields, where labeling data is extremely expensive or even impossible due to privacy or other issues. To address the scarcity of labeled data in some target tasks/domains, transfer learning was proposed, which aims to learn a model from relevant labeled data sets (the source domain) and apply it to unlabeled target data sets (the target domain). However, due to the distribution discrepancy (domain shift) between the source domain and the target domain, a prediction model trained on the source domain cannot perform well on the target domain. Domain adaptation, a subfield of transfer learning, has emerged as an appealing paradigm to address such problems. The purpose of domain adaptation is to learn a model via the source domain that generalizes well to a different but related target domain. This thesis focuses on the more challenging Unsupervised Domain Adaptation (UDA) setting, where the algorithm has access to labeled samples from a source domain and only unlabeled data from a target domain. In this thesis, deep unsupervised domain adaptation algorithms are studied from the perspectives of feature disentanglement, marginal distribution alignment, and conditional distribution adaptation. The major contributions of the thesis are summarized as follows:

(1) Existing unsupervised domain adaptation methods suffer from two problems: 1) they ignore the intrinsic information of the target domain; 2) they directly use the overall information of the image for adaptation. To address these problems, we propose to disentangle the target domain. Based on this, the thesis proposes a novel yet elegant module called the Deep Ladder-Suppression Network (DLSN), which is designed to better learn the cross-domain shared content by suppressing domain-specific variations. The proposed DLSN is an autoencoder with lateral connections from the encoder to the decoder. By this design, the domain-specific details, which are only necessary for reconstructing the unlabeled target data, are fed directly to the decoder to complete the reconstruction task, relieving the pressure of learning domain-specific variations at the later layers of the shared encoder. As a result, DLSN allows the shared encoder to focus on learning cross-domain shared content and to ignore the domain-specific variations. Notably, the proposed DLSN can be used as a standard module and integrated with various existing UDA frameworks to further boost performance. Without bells and whistles, extensive experimental results demonstrate that the proposed DLSN can consistently and significantly improve the performance of various popular UDA frameworks.

(2) The strategy of aligning the two domains in a latent feature space via metric discrepancy or adversarial learning has achieved considerable progress. However, these existing approaches mainly focus on adapting the entire image and ignore the bottleneck that occurs when forced adaptation of uninformative domain-specific variations undermines the effectiveness of the learned features. To address this problem, we propose to cooperatively disentangle the target domain and the source domain. Accordingly, the thesis proposes a novel component called Informative Feature Disentanglement (IFD), which is designed to disentangle informative features from uninformative domain-specific variations. The proposed IFD includes two main components: a Variational Information Bottleneck (VIB) and a Variational Ladder Autoencoder (VLAE), which conduct supervised disentanglement for the source domain and unsupervised disentanglement for the target domain, respectively. IFD is then equipped with either an adversarial network or a metric discrepancy model. The resulting network architectures, named IFDAN and IFDMN, enable informative feature refinement before adaptation. Extensive experimental results demonstrate the effectiveness of the proposed IFDAN and IFDMN models for UDA.

(3) Although remarkable breakthroughs have been achieved in learning transferable representations across domains, two bottlenecks remain to be explored. First, many existing approaches focus primarily on the adaptation of the entire image, ignoring the limitation that not all features are transferable and informative for the object classification task. Second, the features of the two domains are typically aligned without considering the class labels; this can lead the resulting representations to be domain-invariant but non-discriminative with respect to the category. To overcome these two issues, we propose to combine class-level feature disentanglement and alignment for unsupervised domain adaptation. Based on this, the thesis presents a novel Informative Class-Conditioned Feature Alignment (IC2FA) approach for UDA, which utilizes a twofold method: informative feature disentanglement and class-conditioned feature alignment, designed to address the above two challenges, respectively. More specifically, to surmount the first drawback, IC2FA cooperatively disentangles the two domains to obtain informative transferable features; here, the Variational Information Bottleneck (VIB) is employed to encourage the learning of task-related semantic representations and suppress task-unrelated information. With regard to the second bottleneck, IC2FA optimizes a new metric, termed the Conditional Sliced Wasserstein Distance (CSWD), which explicitly estimates the intra-class discrepancy and the inter-class margin. The intra-class and inter-class CSWDs are minimized and maximized, respectively, to yield domain-invariant discriminative features. Extensive experimental results on three domain adaptation datasets confirm the superiority of IC2FA.

(4) Many existing approaches learn a domain-invariant representation space by directly matching the marginal distributions of the two domains. However, they neglect to explore the underlying discriminative features of the target data and to align the cross-domain discriminative features, which may lead to suboptimal performance. To tackle these two issues simultaneously, we propose to combine the mining of discriminative features with the alignment of class-level features. Accordingly, the thesis presents a Joint Clustering and Discriminative Feature Alignment (JCDFA) approach for UDA, which naturally unifies the mining of discriminative features and the alignment of class-discriminative features in a single framework. Specifically, in order to mine the intrinsic discriminative information of the unlabeled target data, JCDFA jointly learns a shared encoding representation for two tasks: supervised classification of labeled source data, and discriminative clustering of unlabeled target data, where the classification of the source domain guides the clustering of the target domain to locate the object categories. For the cross-domain discriminative feature alignment, the thesis proposes two new metrics: 1) an extended supervised contrastive learning, i.e., semi-supervised contrastive learning, and 2) an extended Maximum Mean Discrepancy (MMD), i.e., conditional MMD, which explicitly minimize the intra-class dispersion and maximize the inter-class separation. Experimental results demonstrate that JCDFA achieves remarkable margins over state-of-the-art domain adaptation methods.
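As an illustration of the metric-discrepancy alignment underlying contributions (2) and (4), the following sketch computes a plain (unconditional) biased estimate of the squared Maximum Mean Discrepancy with an RBF kernel in NumPy; the thesis's conditional MMD would additionally restrict the kernel sums to same-class pairs. The function names, the `gamma` bandwidth, and the synthetic data are illustrative assumptions, not taken from the thesis.

```python
import numpy as np

def rbf_kernel(a, b, gamma):
    # Pairwise squared Euclidean distances, then Gaussian (RBF) kernel.
    sq = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def mmd2(x, y, gamma=1.0):
    """Biased estimate of squared MMD: E[k(x,x')] + E[k(y,y')] - 2 E[k(x,y)]."""
    kxx = rbf_kernel(x, x, gamma).mean()
    kyy = rbf_kernel(y, y, gamma).mean()
    kxy = rbf_kernel(x, y, gamma).mean()
    return kxx + kyy - 2.0 * kxy

rng = np.random.default_rng(0)
src = rng.normal(0.0, 1.0, size=(200, 2))  # stand-in "source" features
tgt = rng.normal(2.0, 1.0, size=(200, 2))  # mean-shifted "target" features

print(mmd2(src, src[::-1]))  # same sample set: essentially zero
print(mmd2(src, tgt))        # shifted distribution: clearly positive
```

Minimizing such a statistic over the parameters of a shared encoder pulls the two feature distributions together; the conditional variant applies the same estimator per class so that alignment is discriminative rather than only marginal.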
Keywords/Search Tags:Domain Adaptation, Deep Neural Networks, Domain Shift, Transfer Learning, Feature Disentanglement, Variational Information Bottleneck, Cluster Learning, Metric Learning, Adversarial Learning, Contrastive Learning