
Generalization Analysis And Optimization Research For Deep Domain Adaptation Model With Long-Tail Distribution

Posted on: 2024-07-20    Degree: Master    Type: Thesis
Country: China    Candidate: Y C Liu    Full Text: PDF
GTID: 2568306944458524    Subject: Information and Communication Engineering
Abstract/Summary:
In real-world scenarios, deep learning tasks suffer from long-tail distribution (LT) or domain discrepancy (DD) problems, and both may degrade the generalization of deep learning models. Many recent works have proposed effective methods to solve these problems separately. However, few studies have focused on both problems simultaneously, and the combined long-tail distribution and domain discrepancy problem (LT-DD) has not been well addressed. According to the generalization error upper bound theory in machine learning, the model's confidence plays a key role in the upper bound of the generalization error. In this thesis, based on generalization theory, we analyse the factors influencing generalization under the LT-DD problem, revealing how long-tail distribution and domain discrepancy affect the model's confidence and thus its generalization, and then propose a design principle as a reference for deep models in the LT-DD scenario. Based on the proposed principle, this thesis develops decoupling domain adaptation models for the LT-DD problem from two perspectives, reweighting and resampling. The main results of this thesis are as follows.

1. Exploring how long-tail distribution and domain discrepancy influence generalization. This thesis addresses the challenge that arises when long-tailed distribution and domain discrepancy jointly exist by focusing on how LT and DD influence the model's generalization, and proposes design principles based on the generalization error upper bound theory. The principles suggest that, first, a decoupling domain adaptation framework integrating traditional UDA models with a decoupling method can learn compact, undistorted domain-invariant features on LT-DD tasks; and second, the features learned by the model for each class should be compact and far from the classification boundary. This thesis thus provides a framework for dealing with the LT-DD problem and offers a more advanced approach for learning high-confidence features and classifiers simultaneously in future work.

2. Proposing a pseudo-label-based decoupling domain adaptation model. Based on the proposed design principle, this thesis combines a self-learning framework with a two-stage domain adaptation model to propose a new decoupling domain adaptation method (PLD-DA). To improve the classification confidence of the classifier, the method introduces pseudo-label information from the target domain and uses a pseudo-label-based reweighting strategy to calibrate the classifier (a sketch of this strategy is given after the abstract). Several experiments demonstrate that PLD-DA is a simple and effective framework that minimizes the domain adaptation loss and obtains unbiased classifiers with higher confidence, achieving better generalization, especially for tail classes.

3. Proposing a GAN-based decoupling domain adaptation model (GAN-DDA). In this thesis, the shortcomings of class-rebalancing methods are addressed with GAN-based generation. Through a generator module and iterative training, GAN-DDA adapts while generating, further improving the model's confidence and minority-class performance compared with conventional decoupling methods. This indicates that GAN-based generation is a more effective class-rebalancing method than reweighting or oversampling alone, as it avoids distorting the class distribution and brings greater diversity of sample features to the minority classes (a sketch of class-conditional feature generation is given after the abstract). When combined with PLD-DA, the resulting GAN-PLD-DA model achieves the best performance, indicating the effectiveness and superiority of the proposed methods.
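The following is a minimal sketch of the pseudo-label-based reweighting idea described in contribution 2, not the thesis's actual PLD-DA implementation: confident target-domain pseudo-labels are used to estimate the otherwise unknown target class frequencies, and the resulting inverse-frequency weights calibrate a class-weighted cross-entropy loss for the classifier. The function names, the confidence threshold, and the PyTorch setup are illustrative assumptions.

```python
# Minimal sketch (assumption, not the thesis code): calibrate the classifier with
# class weights estimated from target-domain pseudo-labels, so that tail classes
# receive larger loss weights during the decoupled classifier-retraining stage.
import torch
import torch.nn.functional as F

def pseudo_label_class_weights(logits_t, num_classes, conf_threshold=0.9, smooth=1.0):
    """Estimate inverse-frequency class weights from confident target pseudo-labels."""
    probs = F.softmax(logits_t, dim=1)
    conf, pseudo = probs.max(dim=1)
    keep = conf >= conf_threshold                      # keep only confident pseudo-labels
    counts = torch.bincount(pseudo[keep], minlength=num_classes).float() + smooth
    return counts.sum() / (num_classes * counts)       # larger weight for rarer (tail) classes

def reweighted_classifier_loss(logits_s, labels_s, logits_t, num_classes):
    """Source-domain cross-entropy, reweighted by target pseudo-label class frequencies."""
    weights = pseudo_label_class_weights(logits_t.detach(), num_classes)
    return F.cross_entropy(logits_s, labels_s, weight=weights)
```

Inverse-frequency weighting is only one possible calibration rule; the exact reweighting scheme used by PLD-DA may differ.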
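Below is a similarly hedged sketch of the GAN-based rebalancing idea in contribution 3, under the assumption that generation happens in feature space: a class-conditional generator synthesizes additional features for minority classes, which can then be mixed with real features when retraining the classifier. Module names and dimensions are hypothetical, and the discriminator and the alternating "adapting while generating" loop of GAN-DDA are omitted.

```python
# Minimal sketch (assumption): a class-conditional feature generator that could be
# used to oversample tail classes in feature space; the adversarial discriminator
# and the alternating training loop are omitted for brevity.
import torch
import torch.nn as nn

class ConditionalGenerator(nn.Module):
    """Class-conditional generator producing synthetic feature vectors."""
    def __init__(self, noise_dim, num_classes, feat_dim):
        super().__init__()
        self.embed = nn.Embedding(num_classes, noise_dim)
        self.net = nn.Sequential(
            nn.Linear(noise_dim * 2, 256),
            nn.ReLU(),
            nn.Linear(256, feat_dim),
        )

    def forward(self, z, y):
        # Concatenating noise with a class embedding makes generation class-conditional.
        return self.net(torch.cat([z, self.embed(y)], dim=1))

def sample_tail_features(generator, tail_classes, per_class, noise_dim, device="cpu"):
    """Generate `per_class` synthetic features for each listed tail class."""
    y = torch.tensor(tail_classes, device=device).repeat_interleave(per_class)
    z = torch.randn(len(y), noise_dim, device=device)
    return generator(z, y), y
```

In GAN-DDA as described above, generation is interleaved with domain adaptation and driven by an adversarial objective; this sketch only shows the generation side used for class rebalancing.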
Keywords/Search Tags:Domain Adaptation, Long-tail Distribution, GAN