Research On Deep Reinforcement Learning Based Text Adversarial Attack Method

Posted on: 2022-01-12

Degree: Master

Type: Thesis

Country: China

Candidate: J W Li

Full Text: PDF

GTID: 2518306572991269

Subject: Computer software and theory

Abstract/Summary:

In recent years, deep learning has achieved great success in natural language processing tasks such as text classification, sentiment analysis, and machine translation. However, researchers have found that deep-learning-based natural language processing models are vulnerable to adversarial attacks on text. An adversarial text is generated by adding small perturbations, usually hard to perceive, to the original text. Such carefully designed texts can easily lead deep learning models to make wrong predictions, which has caused widespread concern in academia and industry about the safety and integrity of existing deep learning algorithms. In addition, research has shown that generating high-quality adversarial texts and adding them to the training data (adversarial training) can improve the robustness of a natural language processing model. It is therefore crucial to explore text adversarial attack methods in order to ensure the robustness and reliability of deep learning models.

To better uncover potential security issues in natural language processing models and to broaden the range of practical applications, we propose a novel text adversarial attack framework based on deep reinforcement learning, termed DQN-Attack. The framework redefines the black-box text attack problem and models it as a Markov decision process. Inspired by hierarchical DQN (Deep Q-Network), the replacement operation is decomposed into a two-step action. Building on the Q-network training process of deep reinforcement learning, a complete hierarchical training algorithm is designed to generate adversarial texts efficiently and reliably. In strict application scenarios, compared with existing text attack methods, DQN-Attack increases the attack success rate by 0.5% to 54.5%, while its average number of queries to the target model is only 3.6% to 21.2% of that of the baseline algorithms.

Specifically, DQN-Attack has the following advantages:
(1) Learnable: it can learn and adjust its attack strategy based on past experience.
(2) Adaptable: it outperforms state-of-the-art attack methods in both the confidence-based and the decision-based settings.
(3) Efficient: it needs only a very small number of queries to the target model to attack successfully.
(4) Scalable: it can combine three different reward types to cope with different application scenarios.
(5) Transferable: its adversarial texts transfer better and are of higher quality than those of current attack methods.
(6) Generalizable: it can attack most mainstream natural language processing models.
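The Markov decision process view and the two-step replacement action described above can be illustrated with a small, self-contained sketch. It is not the thesis's algorithm: the abstract does not specify the state encoding, the networks, or the three reward types, so everything below is an assumption made for illustration. In particular, a single tabular Q-function stands in for the hierarchical Q-networks, `target_confidence` is a toy stand-in for the black-box target model, `CANDIDATES` is a hypothetical substitution table, and the reward (confidence drop plus a bonus when the label flips) is only one plausible confidence-based choice.

```python
import random
from collections import defaultdict

# Hypothetical substitution candidates, keyed by the ORIGINAL word at a position.
CANDIDATES = {"good": ["decent", "fine"], "movie": ["film", "picture"]}

def target_confidence(tokens):
    """Toy stand-in for the black-box target model: P(positive) of the text."""
    if "good" in tokens:
        return 0.9
    if "decent" in tokens:
        return 0.7
    return 0.2

def choose(rng, options, value, epsilon):
    """Epsilon-greedy choice over `options`, scored by the function `value`."""
    if rng.random() < epsilon:
        return rng.choice(options)
    return max(options, key=value)

def run_episode(q, original, rng, epsilon, gamma=0.9, max_steps=3, learn=True):
    """Run one attack episode; return (final tokens, success flag, query count)."""
    positions = [i for i, w in enumerate(original) if w in CANDIDATES]

    def best(state, pos):
        # Value of a position = value of the best replacement word there.
        return max(q[(state, pos, w)] for w in CANDIDATES[original[pos]])

    state = tuple(original)
    conf = target_confidence(state)
    queries = 1
    for _ in range(max_steps):
        # Step 1: pick WHERE to substitute. Step 2: pick WHICH word to insert.
        pos = choose(rng, positions, lambda p: best(state, p), epsilon)
        word = choose(rng, CANDIDATES[original[pos]],
                      lambda w: q[(state, pos, w)], epsilon)
        nxt = state[:pos] + (word,) + state[pos + 1:]
        new_conf = target_confidence(nxt)
        queries += 1
        done = new_conf < 0.5  # predicted label flipped: attack succeeded
        reward = (conf - new_conf) + (1.0 if done else 0.0)
        if learn:
            future = 0.0 if done else max(best(nxt, p) for p in positions)
            # Learning rate 1, which is exact for this deterministic toy setup.
            q[(state, pos, word)] = reward + gamma * future
        state, conf = nxt, new_conf
        if done:
            return list(state), True, queries
    return list(state), False, queries

def train(original, episodes=300, epsilon=0.5, seed=0):
    """Learn Q-values for attacking one text by repeated exploratory episodes."""
    rng = random.Random(seed)
    q = defaultdict(float)
    for _ in range(episodes):
        run_episode(q, original, rng, epsilon)
    return q

if __name__ == "__main__":
    original = ["a", "good", "movie"]
    q = train(original)
    # Greedy attack (epsilon=0) with the learned values, no further learning.
    print(run_episode(q, original, random.Random(1), epsilon=0.0, learn=False))
```

In this toy setting the untrained greedy policy keeps choosing the weak substitute and exhausts its step budget, while the trained policy goes straight to the substitution that flips the label, which mirrors in miniature the "learnable" and "efficient" properties claimed above. A decision-based variant would replace the confidence-drop term with a reward that depends only on the predicted label.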

Keywords/Search Tags:

Deep Learning, Natural Language Processing, Reinforcement Learning, Adversarial Example, Black-box Attack

Related items

1	Research On Adversarial Attack And Defense Against Natural Language Processing System
2	Research On Textual Adversarial Attack Against Deep Learning Model
3	Research And Implementation Of Text Adversarial Example Generation Method
4	Design And Implementation Of Image Recognition Adversarial System Based On Black-Box Attack Technology
5	Enhancement Of Textual Adversarial Attack Ability Based On Metamorphic Testing
6	Research On The Adversarial Attack And Its Countermeasure Of Deep Reinforcement Learning
7	Research And Application On Adversarial Examples Under Security Of Deep Learning
8	Deep Reinforcement Learning in Natural Language Scenario
9	A Research On Attacks And Defenses Against Neural Networks
10	Modeling And Learning Of Representations For Natural Language Sentence-level Structures