Research On Deep Reinforcement Learning Based Text Adversarial Attack Method

Posted on: 2022-01-12
Degree: Master
Type: Thesis
Country: China
Candidate: J W Li
Full Text: PDF
GTID: 2518306572991269
Subject: Computer software and theory
Abstract/Summary:
In recent years, deep learning has achieved great success in various natural language processing tasks, including text classification, sentiment analysis, and machine translation. However, researchers have found that natural language processing models based on deep learning are vulnerable to adversarial attacks on text. An adversarial text is generated by adding small perturbations to the original text that are usually not easily perceivable. These carefully designed texts can easily lead deep learning models to make wrong predictions, which has caused widespread concern in academia and industry about the safety and integrity of existing deep learning algorithms. In addition, research has shown that generating high-quality adversarial texts and adding them to the training data for adversarial training can improve the robustness of natural language processing models. It is therefore crucial to explore text adversarial attack methods in order to ensure the robustness and reliability of deep learning models.

To better uncover potential security issues in natural language processing models and expand the scope of practical applications, we propose a novel text adversarial attack framework based on deep reinforcement learning, termed DQN-Attack. The proposed framework redefines the text black-box attack problem and formulates it as a Markov decision process. Inspired by the idea of hierarchical DQN (Deep Q-Network), the replacement operation is decomposed into two-step actions. Building on the Q-network training process in deep reinforcement learning, a complete hierarchical training algorithm is designed to generate adversarial texts efficiently and reliably. In strict application scenarios, compared with existing text attack methods, DQN-Attack can increase the attack success rate by 0.5% to 54.5%, while the average number of queries to the target model is only 3.6% to 21.2% of that of the baseline algorithms.

Specifically, DQN-Attack has the following advantages: (1) learnable: it can learn and adjust its attack strategies based on past experience; (2) adaptable: it outperforms state-of-the-art attack methods under both confidence-based and decision-based settings; (3) efficient: it needs only a very small number of queries to the target model to attack successfully; (4) scalable: it can combine three different reward types to cope with different application scenarios; (5) transferable: its adversarial texts transfer better and are of higher quality than those of current attack methods; (6) generalizable: it can attack most mainstream natural language processing models.
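The abstract does not spell out how the replacement operation is split into two-step actions. The following is only a minimal sketch of one plausible reading: first choose which word position to modify, then choose a substitute for that position, with epsilon-greedy selection at each level. The function names, the lookup tables standing in for the Q-networks, and the candidate dictionary are all hypothetical and are not taken from the thesis.

```python
import random


def select_two_step_action(q_position, q_substitute, candidates,
                           epsilon=0.1, rng=random):
    """Pick (position, substitute) in two steps (illustrative assumption).

    q_position:   list of floats, one stand-in Q-value per word position.
    q_substitute: dict mapping position -> list of stand-in Q-values,
                  one per candidate word at that position.
    candidates:   dict mapping position -> list of candidate substitutes.
    Returns None when no position has any candidate.
    """
    # Step 1: choose the word position, restricted to positions
    # that actually have at least one candidate substitute.
    valid = [i for i in range(len(q_position)) if candidates.get(i)]
    if not valid:
        return None
    if rng.random() < epsilon:
        pos = rng.choice(valid)                       # explore
    else:
        pos = max(valid, key=lambda i: q_position[i])  # exploit

    # Step 2: choose the substitute word for the selected position.
    options = candidates[pos]
    if rng.random() < epsilon:
        idx = rng.randrange(len(options))              # explore
    else:
        idx = max(range(len(options)),
                  key=lambda j: q_substitute[pos][j])  # exploit
    return pos, options[idx]


def apply_replacement(tokens, action):
    """Return a copy of tokens with the chosen word swapped in."""
    pos, word = action
    return tokens[:pos] + [word] + tokens[pos + 1:]
```

In a full attack loop, the perturbed text returned by `apply_replacement` would be sent to the target model and the response turned into a reward. That part is omitted here because the abstract gives no details on the reward definitions.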
Keywords/Search Tags: Deep Learning, Natural Language Processing, Reinforcement Learning, Adversarial Example, Black-box Attack