
Research On Robust Domain Adaptation Methods Under Open Dynamic Task Environments

Posted on: 2024-06-08 | Degree: Doctor | Type: Dissertation
Country: China | Candidate: Z Y Han | Full Text: PDF
GTID: 1528306908993979 | Subject: Software engineering
Abstract/Summary:
With the rapid development of machine learning technologies, represented by statistical learning and deep learning, machine learning has come to play an essential role in a wide range of data analysis and mining scenarios, greatly advancing the intelligent transformation of application fields and generating substantial economic and social benefits. Although significant progress has been made in both the theory and practice of machine learning, a key condition for its success in practice is that the learner operates in a closed, static task environment. One typical closed-environment assumption is that the training set (source domain) and the test set (target domain) are drawn from the same probability distribution, which is often too strict to hold in open, dynamic task environments. How to analyze and resolve probability distribution shifts in open, dynamic task environments is therefore one of the most challenging frontiers of modern machine learning.

Domain adaptation relaxes the traditional requirement that training and test data be independent and identically distributed, and has made significant progress in handling distribution shifts and transferring knowledge, thereby improving the robustness and generalization of machine learning in open, dynamic task environments. By assuming that the labels of the target data are unavailable, domain adaptation also offers a fundamental approach to the scarcity of labeled data for target tasks. Its research nevertheless still faces many challenging problems, especially in learning domain-invariant feature representations.

In open, dynamic task environments, not only is the i.i.d. assumption easily broken by streaming data, but the assumptions of data purity, accessibility, and stationarity are also likely to be violated, successively giving rise to open settings such as distribution shifts, data noise, privacy protection, and category shifts. Worse still, the latter three settings tend to interleave and overlap with distribution shifts, producing many complex, cross-cutting problems that challenge the robustness of unsupervised domain adaptation algorithms in compound open scenarios. First, when the source-domain data contain label, feature, or open-set noise, the problem of robust domain adaptation under multiple noise environments arises. Second, when multiple source domains are inaccessible due to privacy protection, the multi-source-free domain adaptation problem arises. Finally, when new-class samples appear in the source or target domain and class prior information is unavailable, the universal domain adaptation problem arises. How to effectively improve the robustness and generalization of unsupervised domain adaptation algorithms in such compound open, dynamic task environments is therefore an urgent and important scientific problem.

This thesis focuses on the challenges of insufficient domain-invariant feature representation learning, source-domain noise, multi-source-free adaptation, and universal domain adaptation, analyzes their causes, and designs targeted learning methods. The main research contents and innovations are as follows.

1. To address the insufficient learning of domain-invariant feature representations, a transferable parameter learning method is proposed. It actively updates the transferable parameters (domain-invariant parameters) of deep domain adaptation networks to strengthen domain-invariant representation learning, while deactivating some of the non-transferable parameters to alleviate overfitting to domain-specific information, thereby effectively balancing the dilemma between underfitting and overfitting of the source-domain feature representation. The method is fairly general: it can be embedded into a variety of deep domain adaptation networks to improve their generalization ability, providing a new perspective on learning domain-invariant feature representations.
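For illustration only, the following is a minimal PyTorch-style sketch of the idea in item 1: parameters judged "transferable" are updated in a given step while the rest are deactivated. The per-parameter scoring rule, function names, and hyperparameters here are assumptions of this sketch, not the exact criterion used in the thesis.

```python
import torch

def transferable_step(model, clf_loss, align_loss, keep_ratio=0.7, lr=1e-3):
    """One hypothetical update that modifies only 'transferable' parameter tensors.

    Assumed proxy: a parameter tensor is ranked by the magnitude of its gradient
    w.r.t. a domain-alignment loss; only the top keep_ratio of tensors are updated,
    the remaining ones are deactivated (left untouched) for this step.
    """
    params = [p for p in model.parameters() if p.requires_grad]
    g_clf = torch.autograd.grad(clf_loss, params, retain_graph=True, allow_unused=True)
    g_align = torch.autograd.grad(align_loss, params, allow_unused=True)

    # Assumed transferability score per parameter tensor.
    scores = torch.stack([
        g.detach().abs().mean() if g is not None else p.new_zeros(())
        for p, g in zip(params, g_align)
    ])
    k = max(1, int(keep_ratio * len(params)))
    keep = set(scores.topk(k).indices.tolist())

    with torch.no_grad():
        for i, (p, gc, ga) in enumerate(zip(params, g_clf, g_align)):
            if i not in keep:                  # non-transferable: deactivated this step
                continue
            update = (gc if gc is not None else 0.0) + (ga if ga is not None else 0.0)
            p -= lr * update                   # plain SGD step on transferable parameters
```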
2. To address robust domain adaptation under multiple noise environments, the negative impact of multiple kinds of noise on domain adaptation is analyzed, a target-domain generalization error bound under noisy environments is derived, and it is shown that feature noise enlarges the discrepancy between the source and target probability distributions. Based on this theoretical analysis, a robust offline curriculum learning algorithm is proposed to optimize an energy-based conditional weighted empirical risk: a Gibbs energy estimator is integrated into the conditional weighted empirical risk to mitigate the negative impact of label noise and open-set noise on the target expected risk. A proxy margin discrepancy is further proposed, which introduces a proxy distribution to provide a better optimization direction for minimizing the distribution discrepancy, and the upper bound of its generalization error is analyzed theoretically. Finally, a robust parameter learning scheme performs robust parameter mining and forward optimization to further mitigate the impact of noise and enhance domain-invariant feature representation learning in noisy environments.
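As an illustration of the energy-weighted risk in item 2, the sketch below down-weights source samples with high free energy, i.e., samples the network does not consider confidently in-distribution, before averaging the classification loss. The specific weighting rule and temperature are assumptions made for this sketch and do not reproduce the thesis's exact Gibbs estimator.

```python
import torch
import torch.nn.functional as F

def energy_weighted_risk(logits, labels, temperature=1.0):
    """Source classification risk weighted by a Gibbs/free-energy score per sample.

    Assumed weighting rule: samples with low free energy (treated as confidently
    in-distribution) receive larger weight, so losses from likely label-noise or
    open-set samples are suppressed. Illustrative only.
    """
    # Free energy E(x) = -T * logsumexp(f(x)/T); lower energy ~ more in-distribution.
    energy = -temperature * torch.logsumexp(logits / temperature, dim=1)
    weights = torch.softmax(-energy, dim=0)          # normalize weights over the batch
    per_sample = F.cross_entropy(logits, labels, reduction="none")
    return (weights * per_sample).sum()
```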
3. To address the robust multi-source-free domain adaptation problem, the core issue of estimating source model importance is taken as the entry point, and a discriminability and transferability estimation framework is proposed to quantify, objectively and effectively, the posterior probability of each source model's importance, thereby addressing the privacy-protection and data-silo problems. A proxy discriminability perception module introduces two key model performance metrics, habitat uncertainty and habitat density, to objectively estimate the discriminability of source models; a source similarity transferability perception module learns the similarity of data distributions across domains; and a domain diversity loss ensures a reasonable assignment of the transferability scores (a hedged sketch of this model-weighting idea is given after the code example for item 4 below). The framework is orthogonal to existing source-free domain adaptation methods, which can be incorporated into it as backbones.

4. To address the universal domain adaptation problem, a semi-separated uncertainty adversarial learning method is proposed to relieve the pain point of hand-tuning sensitive thresholds when detecting private-class samples with uncertainty metrics; it achieves private-class detection and weak distribution alignment without such thresholds. A semi-separated uncertainty decision maker adaptively learns the optimal threshold through a multi-level decision rule that takes uncertainty metrics as attributes. An uncertainty separation loss learns explicit uncertainty bounds with large margins to enlarge the uncertainty gap between public and private classes. A conditional weighted adversarial loss then uses these uncertainty bounds to adversarially and selectively match the feature distributions of the public classes, avoiding catastrophic distribution misalignment. By integrating these modules into an end-to-end optimization framework, the negative impacts of class differences and distribution shifts are successfully mitigated.
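For item 4, the sketch below shows one way an uncertainty separation loss with learnable bounds could replace a hand-set threshold: target samples are pseudo-partitioned by the midpoint of two learnable bounds, public-class candidates are pulled below the lower bound, private-class candidates are pushed above the upper bound, and a margin keeps the bounds apart. The partition rule and margin value are assumptions of this sketch, not the thesis's decision maker.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class UncertaintySeparation(nn.Module):
    """Learn explicit lower/upper uncertainty bounds instead of a hand-tuned threshold.

    Illustrative construction: samples whose uncertainty lies below the midpoint of the
    two learnable bounds are treated as public-class candidates, those above it as
    private-class candidates; a margin term keeps the bounds apart.
    """

    def __init__(self, margin=0.3):
        super().__init__()
        self.low = nn.Parameter(torch.tensor(0.3))   # bound for public (shared) classes
        self.high = nn.Parameter(torch.tensor(0.7))  # bound for private (unknown) classes
        self.margin = margin

    def forward(self, uncertainty):                  # per-sample uncertainty in [0, 1], shape (N,)
        mid = 0.5 * (self.low + self.high).detach()
        public = uncertainty[uncertainty <= mid]     # pseudo public-class candidates
        private = uncertainty[uncertainty > mid]     # pseudo private-class candidates
        loss = F.relu(self.margin - (self.high - self.low))      # keep the two bounds apart
        if public.numel() > 0:
            loss = loss + F.relu(public - self.low).mean()       # pull public uncertainty below the lower bound
        if private.numel() > 0:
            loss = loss + F.relu(self.high - private).mean()     # push private uncertainty above the upper bound
        return loss
```

In a full pipeline, the learned bounds could also supply per-sample weights for a conditional adversarial loss that aligns only the public-class features, in the spirit of the weak alignment described above.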
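Returning to item 3, the sketch below shows the general shape of weighting inaccessible source models by estimated discriminability and transferability: each source model is scored on unlabeled target data, and a softmax turns the scores into importance weights. The entropy and confidence proxies and the temperature are illustrative assumptions; they stand in for, and are not, the habitat uncertainty, habitat density, and source-similarity metrics of the thesis.

```python
import torch

def source_model_weights(models, target_x, tau=0.1):
    """Importance weights for source models whose training data are inaccessible.

    Illustrative proxies: discriminability is scored by negative mean prediction
    entropy on target data, transferability by mean maximum softmax probability;
    a softmax over the combined score gives each model's importance weight.
    """
    scores = []
    with torch.no_grad():
        for m in models:
            prob = torch.softmax(m(target_x), dim=1)
            entropy = -(prob * prob.clamp_min(1e-8).log()).sum(dim=1).mean()
            confidence = prob.max(dim=1).values.mean()
            scores.append(confidence - entropy)      # higher = more useful source model
    return torch.softmax(torch.stack(scores) / tau, dim=0)   # e.g. weights for an output ensemble
```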
Keywords/Search Tags: Unsupervised domain adaptation, Open dynamic task environment, Domain invariant feature representation, Domain adaptation under noise environment, Source-free domain adaptation, Universal domain adaptation