Adversarial Sample Detection Method For BERT Model Based On Sample Sensitivity Characteristics

Posted on:2023-11-22

Degree:Master

Type:Thesis

Country:China

Candidate:Y H Wang

Full Text:PDF

GTID:2568306788995129

Subject:Computer technology

Abstract/Summary:

With the development of deep learning in natural language processing, new language models represented by BERT have come into wide use. However, studies have shown that even high-performance language models are vulnerable to adversarial attacks. A text adversarial attack generates adversarial samples by slightly modifying characters or words in the original sentence so that the language model misjudges the sentiment of the sentence, which threatens the security of the language processing system. Research on text adversarial attacks against the BERT model is steadily increasing, but research on defenses against this type of attack remains relatively rare. When adversarial samples attack BERT models with different parameter scales, the output distribution differs significantly from that of normal samples. This thesis uses that difference to propose a detection method that decides whether a given sample is an adversarial sample for the BERT model. Representative sensitivity feature indicators are extracted by generalizing the output distribution behavior of samples on a group of heterogeneous BERT models. DeepWordBug, PWWS, and GAN are used to generate and screen high-quality adversarial sample sets. An SVM-based adversarial sample classification detector is then trained on a training set composed of the feature indicators. An adversarial sample detection algorithm is designed on top of this detector and built into a system that screens the sentences passed to BERT-based sentiment classification. The purpose of the system is to mitigate the impact of adversarial examples on sentiment classification tasks. Experiments on the SST dataset verify that normal and adversarial samples behave differently on the heterogeneous BERT model group, and that the detection method built on BERT sample sensitivity features achieves relatively good detection performance and can effectively defend against text adversarial attacks on BERT models.
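The feature extraction step described in the abstract can be illustrated with a minimal sketch. The abstract does not name the thesis's actual sensitivity indicators, so the three used here (label disagreement rate, variance of the positive-class probability, and mean pairwise Jensen-Shannon divergence) are assumptions chosen for illustration, as are the function names and the example probabilities. Training the SVM on the resulting feature vectors is omitted.

```python
import math
from itertools import combinations
from statistics import pvariance


def kl_divergence(p, q, eps=1e-12):
    """Kullback-Leibler divergence KL(p || q) between two discrete distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def js_divergence(p, q):
    """Jensen-Shannon divergence: a symmetric measure of how far p and q are apart."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def sensitivity_features(model_outputs):
    """Summarize how much a group of models disagrees on one sample.

    model_outputs: one [p_negative, p_positive] distribution per model in the
    heterogeneous group. Returns three hypothetical indicators (assumed here,
    not taken from the thesis).
    """
    # Predicted label of each model (index of the largest probability).
    labels = [max(range(len(p)), key=p.__getitem__) for p in model_outputs]
    majority = max(set(labels), key=labels.count)
    # Fraction of models that disagree with the majority label.
    disagreement = sum(label != majority for label in labels) / len(labels)
    # Spread of the positive-class probability across the group.
    positive_variance = pvariance([p[1] for p in model_outputs])
    # Average divergence between every pair of model outputs.
    pairs = list(combinations(model_outputs, 2))
    mean_js = sum(js_divergence(p, q) for p, q in pairs) / len(pairs)
    return [disagreement, positive_variance, mean_js]


# Made-up outputs of three BERT variants for one normal and one adversarial sample.
normal_outputs = [[0.05, 0.95], [0.08, 0.92], [0.04, 0.96]]
adversarial_outputs = [[0.70, 0.30], [0.20, 0.80], [0.55, 0.45]]

print(sensitivity_features(normal_outputs))
print(sensitivity_features(adversarial_outputs))
```

In this sketch, each sample yields a fixed-length feature vector regardless of its text, so vectors computed for labeled normal and adversarial samples could serve as the training set for a binary classifier such as the SVM mentioned in the abstract.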

Keywords/Search Tags:

adversarial detection, adversarial samples, BERT model, sensitivity feature

Related items

1	Research On Adversarial Sample Detection Methods For Intelligent Image Recognition Networks
2	Research On Generating Transferable Adversarial Samples And Enhancing Adversarial Robustness Methods
3	Research On Attack And Defense Algorithms Of Adversarial Samples Based On GAN
4	Research On Adversarial Sample Generation And Defense Methods For Text Classification
5	Research On Adversarial Sample Generation Method For PE Malware Detection
6	Research Of Method On Generating Image Adversarial Samples Based On GAN
7	Research On Generation Technology Of Mongolian Handwritten Samples Based On Generative Adversarial Network
8	Research On The Nearest Neighbor Discrimination Method For Adversarial Sample Detection
9	Research On Adversarial Attacks And Defense Techniques For Single Stage YOLO Object Detection Model
10	Hypersphere Embedded Adversarial Training In Image Recognition