
Multi-Stage Noise-Resistant Unsupervised Domain Adaptation Method For Causality

Posted on: 2024-02-03
Degree: Master
Type: Thesis
Country: China
Candidate: Y F Zhou
GTID: 2568307103474624
Subject: Computer Science and Technology
Abstract/Summary:
Relation extraction is one of the basic tasks in natural language processing, with considerable research significance, application value, and development prospects. As an important branch of this task, causality recognition supports applications such as knowledge graph construction, text semantic analysis, and text structure analysis. In professional fields such as finance, biology, and law, labeling causal relationships is difficult and datasets are small, so large-scale datasets are hard to build. Unsupervised domain adaptation methods can use existing general-domain causal datasets to improve causal relationship identification in these professional fields. However, because causality is complex and diverse, a model's ability to map causal relationships is limited, which leads to problems such as fuzzy classification boundaries and large domain differences in unsupervised domain-adaptive causality identification. These problems are difficult to solve with existing methods. This thesis first presents a multi-stage unsupervised domain adaptation method for causal identification that exploits the internal characteristics of causal relationships. To address label noise in the source domain, it then develops an anti-noise method that incorporates pre-training for the causal recognition task. The main contributions are as follows:

(1) Analysis at the theoretical and feature levels shows that the causal relationship contains multiple subclasses of substantial significance. Based on this internal characteristic of causality, feature-level data augmentation is implemented to introduce consistency. The corresponding consistency loss enables the model to learn causal knowledge in the target domain actively and without supervision, independent of labels. This helps the model obtain a more complete feature space and clearer classification boundaries.

(2) To make full use of the internal characteristics of causality, a multi-stage unsupervised domain adaptation method for the causal identification task is proposed for the first time. It consists of a source domain learning stage, an adversarial transfer stage, and a consistency adjustment stage. In the source domain learning stage, contrastive learning is used to obtain clearer classification boundaries while preserving the diversity of causal relationships. In the adversarial transfer stage, features are aligned using adversarial learning and knowledge distillation. In the consistency adjustment stage, active learning on the target domain is achieved using multi-level combined filtering, feature-level data augmentation, and consistency. This method increases the F1 score by an average of 4.3 in sentence-level causality recognition experiments and 6.6 in event-level causality recognition experiments.

(3) To address label noise, an anti-noise method that incorporates pre-training for the causal recognition task is developed. According to the knowledge and logic requirements of the causality task, five secondary pre-training methods for causality are designed, and a dual-model pre-training method for multiple datasets and multiple tasks is proposed. The consistency-based dual-model anti-noise method filters noisy data using differences in style information, and the corresponding augmentation method helps maintain those differences. Finally, the method uses a semi-supervised loss and unsupervised auxiliary tasks to overcome the negative impact of noise. This method improves the F1 score by an average of 1.8 on supervised tasks and can effectively prevent the model from failing in unsupervised domain adaptation tasks with high noise rates.

The internal characteristics of causality provide a new and feasible approach to its unsupervised domain adaptation task, which can rely on the introduced consistency to strengthen the model's learning on unlabeled samples. In research on learning causality under label noise, this thesis presents a reliable anti-noise method from the perspective of model pre-training and feature style information, which further improves the robustness of causality recognition models.
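As a rough illustration of the consistency loss described in contribution (1), the sketch below compares a classifier's prediction on a feature vector with its prediction on a perturbed copy of the same vector, without using any label. The toy linear classifier, the Gaussian perturbation standing in for feature-level augmentation, the KL-divergence form of the loss, and every function name here are assumptions made for illustration; the abstract does not say how the thesis implements them.

```python
import math
import random


def softmax(logits):
    """Convert raw class scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions with strictly positive entries."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def perturb(features, scale, rng):
    """Hypothetical feature-level augmentation: add Gaussian noise to each feature."""
    return [f + rng.gauss(0.0, scale) for f in features]


def classify(features, weights):
    """Toy linear classifier (an assumption): one weight row per class."""
    logits = [sum(w * f for w, f in zip(row, features)) for row in weights]
    return softmax(logits)


def consistency_loss(features, weights, scale=0.1, seed=0):
    """Penalize disagreement between predictions on clean and augmented features.

    No label is needed, so the loss can be computed on target-domain samples.
    """
    rng = random.Random(seed)
    p_clean = classify(features, weights)
    p_aug = classify(perturb(features, scale, rng), weights)
    return kl_divergence(p_clean, p_aug)


if __name__ == "__main__":
    # Two classes (e.g. causal / non-causal) over a 2-dimensional feature vector.
    toy_weights = [[1.0, -1.0], [-1.0, 1.0]]
    toy_features = [0.5, 0.2]
    print(consistency_loss(toy_features, toy_weights, scale=0.5))
```

With `scale=0.0` the augmented features equal the originals and the loss is zero; larger perturbations that change the prediction produce a larger penalty, which is what pushes the classifier toward stable predictions on unlabeled target-domain samples.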
Keywords/Search Tags: causal relationship, unsupervised domain adaptation, label noise learning, pretraining method, data augmentation, consistency