Research On Shortcut Learning Mitigation Techniques For Text Classification In Pre-trained Language Models

Posted on: 2024-12-03    Degree: Doctor    Type: Dissertation
Country: China    Candidate: R Song    Full Text: PDF
GTID: 1528307340979479    Subject: Computer Science and Technology
Abstract/Summary:
Pre-trained Language Models (PLMs) such as BERT, T5, and GPT have achieved state-of-the-art performance on a range of advanced Natural Language Understanding (NLU) tasks. However, they all face the problem of shortcut learning. Shortcuts are simple decision rules that perform well on standard benchmarks but fail to generalize to more challenging test conditions. Recent research indicates that models relying on shortcuts generalize poorly to out-of-distribution (OOD) data, and that excessive reliance on shortcuts also leaves PLMs vulnerable to various types of adversarial attacks. Text classification, the most fundamental NLU task, has therefore attracted wide attention from researchers studying shortcut learning and how to mitigate it.

Existing methods fall into two categories: word-level and sample-level. Word-level research mainly leverages prior knowledge or interpretable machine-learning methods to discover shortcut words, or the causal features that oppose them, and then uses techniques such as regularization, data augmentation, and contrastive learning to steer models toward robust features, thus improving generalization. Sample-level research starts from the premise of shortcut agnosticism, meaning it makes no assumption about the type of shortcut. By extracting simple samples that may contain shortcuts, techniques such as re-weighting and knowledge distillation reduce the model's attention to shortcut samples, thereby enhancing the reliability of PLMs across multiple tasks.

However, existing methods still leave several open research issues. (1) Word-level approaches rely on interpretable models to find causal or shortcut words, but these interpretable methods often suffer from bias: they cannot guarantee that the identified causal or shortcut words are accurate, and over-emphasizing erroneous keywords can mislead PLMs into producing incorrect results. Moreover, causal features may be effective combinations of several words, so focusing solely on individual words may negatively impact the model. (2) Sample-level works typically decide whether a sample is simple based on prior knowledge such as expert knowledge, sample confidence, or sample convergence speed; these heuristics cannot guarantee effective mining of simple samples under the premise of shortcut agnosticism. (3) How shortcut learning in language models degrades model fairness, and how to mitigate these negative effects to improve fairness, still require further exploration. (4) With the rapid development of Large Language Models (LLMs), how to effectively measure the sensitivity of large models to shortcuts with limited samples, and how to mitigate their reliance on shortcuts through prompt learning, lack in-depth investigation.

To address these key issues, the main research contents of this dissertation are as follows. For word-level methods, a human-feedback-driven reliable text classification framework is proposed to guide PLMs with human prior knowledge toward better performance on OOD tasks; a causal word-group mining method is further proposed to induce model learning by searching for more reasonable combinations of causal features, thereby enhancing the robustness of PLMs. For sample-level methods, facing the challenge of shortcut agnosticism, a shortcut-agnostic feature disentanglement method is proposed that encourages PLMs to separate out the robust features beneficial for classification, improving cross-domain generalization. Regarding the relationship between shortcuts and fairness, the impact of minority-group-based shortcut learning on the fairness of PLMs is thoroughly explored, various metrics are used to quantify PLM fairness, and a bias mitigation framework based on minority-group demographic features is proposed to enhance the fairness of PLMs without compromising classification performance. Regarding shortcut learning in LLMs, a shortcut evaluation benchmark is proposed, and the tendency toward shortcut learning is verified on multiple LLMs; a prompt-based shortcut mitigation strategy is then proposed, effectively enhancing the robustness of LLMs under in-context learning without updating any parameters.
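The sample-level re-weighting idea summarized above can be sketched in a few lines. This is a minimal illustration using the common "bias-only model" heuristic from the debiasing literature (down-weighting samples that a shallow model already classifies confidently); it is not the specific method proposed in the dissertation, and the weighting rule is an assumption for illustration.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def reweighted_loss(main_logits, bias_probs, labels):
    """Cross-entropy loss where each sample is down-weighted by the
    confidence of a shallow 'bias-only' model in the gold label:
    samples that the bias model finds easy likely contain shortcuts,
    so they contribute less to the main model's training signal.

    main_logits: list of per-sample logit lists from the main classifier
    bias_probs:  list of per-sample probability lists from the bias model
    labels:      list of gold class indices
    """
    total = 0.0
    for logits, probs, y in zip(main_logits, bias_probs, labels):
        ce = -math.log(softmax(logits)[y])   # per-sample cross-entropy
        weight = 1.0 - probs[y]              # low weight if bias model is confident
        total += weight * ce
    return total / len(labels)
```

In practice the bias model is trained on shallow cues alone (e.g., bag-of-words), so its confident predictions flag likely shortcut samples without assuming what the shortcut is.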
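As one example of the fairness quantification mentioned above, a common group-fairness metric is the false-positive-rate gap between demographic groups. The dissertation's exact metrics are not specified in this abstract, so this particular choice of metric is an assumption for illustration.

```python
def fpr_gap(preds, labels, groups):
    """Largest difference in false-positive rate across demographic groups.

    preds, labels: 0/1 per sample (1 = positive class, e.g. 'toxic')
    groups:        demographic group identifier per sample
    A gap near 0 means the classifier errs on negatives at similar
    rates for every group; a large gap signals group-level bias.
    """
    fpr = {}
    for g in set(groups):
        fp = sum(1 for p, y, gg in zip(preds, labels, groups)
                 if gg == g and y == 0 and p == 1)
        negatives = sum(1 for y, gg in zip(labels, groups)
                        if gg == g and y == 0)
        fpr[g] = fp / negatives if negatives else 0.0
    vals = list(fpr.values())
    return max(vals) - min(vals)
```

A minority-group shortcut (e.g., treating a group-identity term as evidence of the positive class) shows up directly as an inflated false-positive rate for that group, which is why such gap metrics are natural probes for shortcut-induced unfairness.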
Keywords/Search Tags:Shortcut Learning, Shortcut Mitigation, Pre-trained Language Models, Cross-Domain Generalization, Robustness, Fairness