
Adversarial Example Generation And Defense Strategy Based On Natural Language Understanding

Posted on: 2024-03-05    Degree: Master    Type: Thesis
Country: China    Candidate: P F Qiu    Full Text: PDF
GTID: 2568307067993209    Subject: Software engineering
Abstract/Summary:
In recent years, with the introduction of the Transformer architecture, deep learning models have gradually become crucial in the field of Natural Language Processing (NLP). For instance, the large-scale language model ChatGPT developed by the OpenAI team has demonstrated outstanding performance across a wide range of NLP tasks. However, research has shown that adversarial examples, first discovered in computer vision, also exist in NLP and pose significant security risks. Adversarial examples cause DNN-based models to produce incorrect outputs by adding imperceptible, subtle perturbations to the original inputs. Nevertheless, because textual data is inherently discrete, some effective gradient-based adversarial attack methods from the image domain cannot be applied directly to text. Although scholars in this field have proposed some high-quality adversarial attack methods, there is still room for improvement in balancing semantic consistency, grammatical correctness, and attack effectiveness. In addition, this paper finds that virtual adversarial training algorithms used for adversarial defense also leave room for improvement.

Building on extensive domestic and international research on adversarial attacks and defenses, we propose a novel word-substitution attack algorithm for generating adversarial text that improves both accuracy and perturbation rate. Furthermore, we present an effective virtual adversarial training algorithm that enhances model robustness and generalization, raising the score of the pretrained BERT model on the GLUE test server from 78.3 to 81.8. Specifically, the main contributions of this paper are as follows:

1. For word-replacement adversarial attacks, we propose a novel combination replacement strategy based on approximate Top-K. We treat the importance of each word as a probability distribution and apply Monte Carlo sampling over this distribution to obtain K replacement positions, addressing the problem of approximating word importance in long texts. In addition, we extend the sequential replacement strategy to a combination replacement strategy that combines the K sampled replacement positions and uses parallel computing to quickly find potential adversarial examples, greatly improving the algorithm's efficiency. Finally, we leverage part-of-speech information and combine synonym- and language-model-based substitution strategies to generate more fluent adversarial text while preserving semantic and grammatical correctness.
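As a rough illustration of the approximate Top-K idea only (not the thesis implementation), the sketch below treats per-word importance scores as a probability distribution and draws K replacement positions by Monte Carlo sampling; the function and variable names are hypothetical.

```python
import numpy as np

def sample_replacement_positions(importance_scores, k, rng=None):
    """Approximate Top-K: draw K candidate positions to perturb.

    importance_scores: one non-negative importance value per word, e.g. the
    drop in the victim model's confidence when that word is masked.
    """
    rng = rng or np.random.default_rng()
    scores = np.asarray(importance_scores, dtype=float)
    # Treat word importance as a probability distribution over positions.
    total = scores.sum()
    probs = scores / total if total > 0 else np.full(len(scores), 1.0 / len(scores))
    # Monte Carlo sampling without replacement: important words are likely to
    # be picked, but long texts avoid the cost of exact Top-K ranking.
    k = min(k, len(scores))
    return rng.choice(len(scores), size=k, replace=False, p=probs)

# Example: an 8-word sentence, sampling 3 positions to perturb jointly.
scores = [0.02, 0.30, 0.01, 0.25, 0.05, 0.20, 0.02, 0.15]
print(sorted(sample_replacement_positions(scores, k=3).tolist()))
```

The K sampled positions can then be perturbed jointly, so that candidate substitutions for a whole combination are scored in a single batched forward pass rather than one position at a time.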
2. We propose a virtual adversarial training algorithm based on a context-dynamic perturbation vocabulary. When initializing the adversarial perturbation, we introduce a context-dynamic vocabulary and perturbation table that records historical perturbation information and use it as the initialization value for word-level perturbations, reducing the noise caused by random initialization or a static perturbation vocabulary. In addition, the PGD iterations often produce ineffective perturbations with small gradients; we apply an adaptive threshold to filter them out (a minimal sketch of this step is given below), reducing computation while improving model generalization. Finally, we jointly consider the importance of texts and words and assign larger perturbations to important texts and words, thereby improving the performance of virtual adversarial training.

3. We conducted extensive experiments on datasets for various natural language processing tasks, such as text classification, textual entailment, and named entity recognition, to demonstrate the effectiveness of the proposed algorithms. We then performed ablation experiments to verify the importance of each module and provide a detailed analysis and discussion of the experimental results.
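As a minimal PyTorch-style sketch of the adaptive-threshold PGD step mentioned in contribution 2, assuming word-embedding inputs: the dynamic perturbation vocabulary, historical initialization, and text/word importance weighting are omitted, and all names and hyperparameters are illustrative rather than the thesis code.

```python
import torch

def vat_pgd_delta(embeds, loss_fn, steps=3, alpha=0.05, eps=1.0, tau=0.5):
    """Build a virtual adversarial perturbation on word embeddings with PGD.

    loss_fn(embeds + delta) must return a scalar, differentiable loss
    (e.g. KL divergence between clean and perturbed predictions).
    tau: tokens whose gradient norm falls below tau * mean are treated as
    ineffective perturbations and filtered out (adaptive threshold).
    """
    delta = torch.zeros_like(embeds, requires_grad=True)
    for _ in range(steps):
        loss = loss_fn(embeds + delta)
        (grad,) = torch.autograd.grad(loss, delta)
        gnorm = grad.norm(dim=-1, keepdim=True)          # (batch, seq_len, 1)
        keep = (gnorm > tau * gnorm.mean()).float()      # adaptive filtering
        with torch.no_grad():
            # Normalized ascent step on the kept tokens, then project the
            # perturbation back into the [-eps, eps] box.
            delta += alpha * keep * grad / (gnorm + 1e-12)
            delta.clamp_(-eps, eps)
    return delta.detach()
```

In training, the returned perturbation would be added to the embeddings and a consistency loss (e.g. KL divergence) between the clean and perturbed predictions back-propagated into the model, as in standard virtual adversarial training.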
Keywords/Search Tags:Natural Language Processing, Adversarial texts, Virtual Adversarial Training, Robustness