
Generalization Analysis And Optimization Research For Deep Domain Adaptation Model With Long-Tail Distribution

Posted on: 2024-07-20    Degree: Master    Type: Thesis
Country: China    Candidate: Y C Liu    Full Text: PDF
GTID: 2568306944458524    Subject: Information and Communication Engineering
Abstract/Summary:
In real-world scenarios, deep learning tasks suffer from long-tail distribution (LT) or domain discrepancy (DD) problems, and both may degrade the generalization of deep learning models. Many recent works have proposed effective methods to solve these problems separately. However, few studies have focused on both problems simultaneously, and the combined long-tail distribution and domain discrepancy problem (LT-DD) has not been well addressed. According to the generalization error upper bound theory in machine learning, the model's confidence plays a key role in the upper bound of the generalization error. In this thesis, based on generalization theory, we analyse the factors influencing generalization under the LT-DD problem, revealing how long-tail distribution and domain discrepancy affect the model's confidence and thus its generalization, and then propose a design principle as a reference for deep models in the LT-DD scenario. Based on the proposed principle, this thesis develops decoupling domain adaptation models for the LT-DD problem from two perspectives, reweighting and resampling. The main results of this thesis are as follows.

1. Exploring how long-tail distribution and domain discrepancy influence generalization. This thesis addresses the challenge that arises when long-tailed distribution and domain discrepancy jointly exist by focusing on how LT and DD influence the model's generalization, and proposes design principles based on the generalization error upper bound theory. The principles suggest that, first, a decoupling domain adaptation framework integrating traditional UDA models with a decoupling method can learn compact, undistorted domain-invariant features on LT-DD tasks; and second, the features learned by the model for each class should be compact and far from the classification boundary. This thesis thus provides a framework for dealing with the LT-DD problem and offers a more advanced approach for learning high-confidence features and classifiers simultaneously in future work.

2. Proposing a pseudo-label-based decoupling domain adaptation model. Based on the proposed design principle, this thesis combines a self-learning framework with a two-stage domain adaptation model to propose a new decoupling domain adaptation method (PLD-DA). To improve the classification confidence of the classifier, the method introduces pseudo-label information from the target domain and uses a pseudo-label-based reweighting strategy to calibrate the classifier (a sketch of this strategy is given after the abstract). Several experiments demonstrate that PLD-DA is a simple and effective framework that minimizes the domain adaptation loss and obtains unbiased classifiers with higher confidence, achieving better generalization, especially for tail classes.

3. Proposing a GAN-based decoupling domain adaptation model (GAN-DDA). In this thesis, the shortcomings of class-rebalancing methods are addressed with GAN-based generation. Through a generator module and iterative training, GAN-DDA adapts while generating, further improving the model's confidence and minority-class performance compared with conventional decoupling methods. This indicates that GAN-based generation is a more effective class-rebalancing method than reweighting or oversampling alone, as it avoids distorting the class distribution and brings greater diversity of sample features to the minority classes (a sketch of class-conditional feature generation is given after the abstract). When combined with PLD-DA, the resulting GAN-PLD-DA model achieves the best performance, indicating the effectiveness and superiority of the proposed methods.
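The following is a minimal sketch of the pseudo-label-based reweighting idea described in contribution 2, not the thesis's actual PLD-DA implementation: confident target-domain pseudo-labels are used to estimate the otherwise unknown target class frequencies, and the resulting inverse-frequency weights calibrate a class-weighted cross-entropy loss for the classifier. The function names, the confidence threshold, and the PyTorch setup are illustrative assumptions.

```python
# Minimal sketch (assumption, not the thesis code): calibrate the classifier with
# class weights estimated from target-domain pseudo-labels, so that tail classes
# receive larger loss weights during the decoupled classifier-retraining stage.
import torch
import torch.nn.functional as F

def pseudo_label_class_weights(logits_t, num_classes, conf_threshold=0.9, smooth=1.0):
    """Estimate inverse-frequency class weights from confident target pseudo-labels."""
    probs = F.softmax(logits_t, dim=1)
    conf, pseudo = probs.max(dim=1)
    keep = conf >= conf_threshold                      # keep only confident pseudo-labels
    counts = torch.bincount(pseudo[keep], minlength=num_classes).float() + smooth
    return counts.sum() / (num_classes * counts)       # larger weight for rarer (tail) classes

def reweighted_classifier_loss(logits_s, labels_s, logits_t, num_classes):
    """Source-domain cross-entropy, reweighted by target pseudo-label class frequencies."""
    weights = pseudo_label_class_weights(logits_t.detach(), num_classes)
    return F.cross_entropy(logits_s, labels_s, weight=weights)
```

Inverse-frequency weighting is only one possible calibration rule; the exact reweighting scheme used by PLD-DA may differ.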
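Below is a similarly hedged sketch of the GAN-based rebalancing idea in contribution 3, under the assumption that generation happens in feature space: a class-conditional generator synthesizes additional features for minority classes, which can then be mixed with real features when retraining the classifier. Module names and dimensions are hypothetical, and the discriminator and the alternating "adapting while generating" loop of GAN-DDA are omitted.

```python
# Minimal sketch (assumption): a class-conditional feature generator that could be
# used to oversample tail classes in feature space; the adversarial discriminator
# and the alternating training loop are omitted for brevity.
import torch
import torch.nn as nn

class ConditionalGenerator(nn.Module):
    """Class-conditional generator producing synthetic feature vectors."""
    def __init__(self, noise_dim, num_classes, feat_dim):
        super().__init__()
        self.embed = nn.Embedding(num_classes, noise_dim)
        self.net = nn.Sequential(
            nn.Linear(noise_dim * 2, 256),
            nn.ReLU(),
            nn.Linear(256, feat_dim),
        )

    def forward(self, z, y):
        # Concatenating noise with a class embedding makes generation class-conditional.
        return self.net(torch.cat([z, self.embed(y)], dim=1))

def sample_tail_features(generator, tail_classes, per_class, noise_dim, device="cpu"):
    """Generate `per_class` synthetic features for each listed tail class."""
    y = torch.tensor(tail_classes, device=device).repeat_interleave(per_class)
    z = torch.randn(len(y), noise_dim, device=device)
    return generator(z, y), y
```

In GAN-DDA as described above, generation is interleaved with domain adaptation and driven by an adversarial objective; this sketch only shows the generation side used for class rebalancing.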
Keywords/Search Tags:Domain Adaptation, Long-tail Distribution, GAN