Natural language processing is an important step for artificial intelligence to move from perception to cognition, and it has significant application value in both military and civilian fields. Deep learning plays an increasingly important role in natural language processing thanks to its powerful self-learning and data-processing abilities. However, although deep learning-based natural language processing models show excellent performance, they are vulnerable to maliciously crafted adversarial examples, which poses a serious challenge to deploying these models in real-world scenarios. To further study the vulnerabilities and security blind spots of natural language processing models based on deep neural networks, researchers have analyzed the causes and construction methods of adversarial examples at both the theoretical and technical levels, and have proposed a variety of adversarial attack methods for evaluating and strengthening natural language processing models.

Existing research on textual adversarial attacks, however, suffers from two main problems. First, the attack cost of existing methods is too high. In the black-box setting in particular, the attacker cannot access information such as the internal structure and parameters of the target model, so the generation of adversarial examples can only be guided by the target model's inputs and outputs, which requires a large number of queries; obtaining the desired number of adversarial examples is therefore often costly. Second, the adversarial perturbations generated by existing methods are usually tailored to a specific input sample, so each perturbation must be found by repeatedly iterating on a single sample, which is inefficient.

To address these problems, this dissertation studies the key technologies of textual adversarial attack from three aspects. First, to address the excessive number of queries to the target model in existing methods, we propose an adversarial attack method under a limited query budget. By using the information of adversarial examples generated on a local model, we shift part of the attack on the target model to the local model, completing it in advance; this greatly reduces the query cost of the attack while maintaining a high attack success rate. Compared with existing black-box attacks, this method reduces the average query cost by more than 46%. Second, since most existing attack methods search for an adversarial perturbation for each input sample one by one, we design a universal adversarial attack method based on discrete particle swarm optimization. By finding a universal trigger and adding it to any input sample, we can cause the classifier to make wrong predictions; this method fools multiple classifiers with a high success rate. Third, most existing universal adversarial attack methods generate triggers of low quality that are easy to identify. We propose a universal adversarial attack method based on BERT sampling, which effectively generates universal triggers; evaluations combining word frequency, fluency, grammaticality, and human judgment demonstrate the naturalness of the generated triggers. Illustrative sketches of the three ideas follow.
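To make the first contribution concrete, the sketch below shows the general transfer-then-verify pattern it builds on. This is a minimal, hypothetical outline rather than the dissertation's actual implementation: the `local_candidates` and `query_target` callables are assumed interfaces. Candidates are crafted against a local surrogate model at zero query cost, and the black-box target is queried only to verify them.

```python
from typing import Callable, List, Optional, Tuple

def transfer_then_verify(
    text: str,
    true_label: int,
    local_candidates: Callable[[str, int], List[str]],  # attack run on the local surrogate (query-free)
    query_target: Callable[[str], int],                  # one black-box query per call
    budget: int = 50,
) -> Tuple[Optional[str], int]:
    """Search on a local model; spend target queries only on verification."""
    # Step 1: craft candidate adversarial texts against the local model.
    # This step costs zero queries to the black-box target.
    candidates = local_candidates(text, true_label)
    # Step 2: verify candidates on the target within the query budget.
    queries = 0
    for cand in candidates:
        if queries >= budget:
            break
        queries += 1
        if query_target(cand) != true_label:
            return cand, queries  # a transferable adversarial example
    return None, queries          # budget exhausted without success
```

Because the expensive search happens on the surrogate, the target model only pays for verification, which is what drives the reduction in average query cost.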
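The second contribution searches the discrete space of trigger tokens with particle swarm optimization. The following is one common way to discretize PSO for such a search, given as a sketch under assumed interfaces rather than the dissertation's exact algorithm: each particle is a candidate trigger, and at every step each token position is either mutated at random or copied from the particle's personal best or the swarm's global best, with the copy probabilities playing the role of velocities. The `fooling_rate` callable (the fraction of held-out inputs misclassified when the trigger is attached) is an assumed evaluation function.

```python
import random
from typing import Callable, List

def pso_universal_trigger(
    vocab: List[str],                             # candidate trigger tokens
    fooling_rate: Callable[[List[str]], float],   # objective: higher is better
    trigger_len: int = 3,
    swarm_size: int = 20,
    iters: int = 50,
    p_mut: float = 0.1,                           # random mutation keeps exploring
    p_best: float = 0.5,                          # prob. of copying from personal best
) -> List[str]:
    # Initialize the swarm with random triggers and record bests.
    swarm = [[random.choice(vocab) for _ in range(trigger_len)] for _ in range(swarm_size)]
    pbest = [p[:] for p in swarm]
    pscore = [fooling_rate(p) for p in swarm]
    g = max(range(swarm_size), key=lambda i: pscore[i])
    gbest, gscore = pbest[g][:], pscore[g]
    for _ in range(iters):
        for i, particle in enumerate(swarm):
            # Discrete "velocity": per-position mutation or attraction to bests.
            for j in range(trigger_len):
                r = random.random()
                if r < p_mut:
                    particle[j] = random.choice(vocab)
                elif r < p_mut + p_best:
                    particle[j] = pbest[i][j]
                else:
                    particle[j] = gbest[j]
            s = fooling_rate(particle)
            if s > pscore[i]:
                pbest[i], pscore[i] = particle[:], s
                if s > gscore:
                    gbest, gscore = particle[:], s
    return gbest
```

Because a single trigger is optimized against many inputs at once, the per-sample iteration of conventional attacks is amortized away.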
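The third contribution uses BERT's masked language model to keep triggers natural. As a sketch of the underlying sampling step (the model name, trigger length, and the surrounding search loop are assumptions, not the dissertation's exact setup), one trigger position at a time is masked and BERT proposes fluent replacements; an outer search loop would then keep whichever proposal best trades off fooling rate and fluency.

```python
import torch
from transformers import BertForMaskedLM, BertTokenizer

tok = BertTokenizer.from_pretrained("bert-base-uncased")
mlm = BertForMaskedLM.from_pretrained("bert-base-uncased").eval()

def propose_tokens(trigger: list, pos: int, k: int = 10) -> list:
    """Mask trigger[pos] and return BERT's top-k fluent replacements."""
    masked = trigger[:pos] + [tok.mask_token] + trigger[pos + 1:]
    inputs = tok(" ".join(masked), return_tensors="pt")
    mask_idx = (inputs.input_ids[0] == tok.mask_token_id).nonzero(as_tuple=True)[0]
    with torch.no_grad():
        logits = mlm(**inputs).logits[0, mask_idx[0]]
    top_ids = torch.topk(logits, k).indices.tolist()
    return tok.convert_ids_to_tokens(top_ids)

# Hypothetical usage: propose natural third tokens for a three-word trigger.
print(propose_tokens(["the", "movie", "was"], pos=2))
```

Sampling candidates from a language model rather than from the raw vocabulary is what pushes the resulting triggers toward the word-frequency, fluency, and grammaticality profiles that the human evaluation measures.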