Research On Deep Reinforcement Learning For Robustness And Security

Posted on: 2022-11-10
Degree: Master
Type: Thesis
Country: China
Candidate: M R Yu
Full Text: PDF
GTID: 2518306776992919
Subject: Automation Technology
Abstract/Summary:
Reinforcement learning learns policies by interacting with the environment in a trial-and-error manner. In recent years, deep reinforcement learning has attracted widespread attention for surpassing human performance in games. Although deep reinforcement learning achieves excellent performance, it faces security problems that cannot be ignored when applied in practice, so building reliable deep reinforcement learning systems is necessary. This thesis focuses on the robustness and security of deep reinforcement learning, both of which concern the trustworthiness of the model itself. Here, robustness refers to the model's resistance to natural noise, while security concerns human-made perturbations and usually covers adversarial attacks and adversarial defenses.

First, in terms of robustness, this thesis considers how to deal with abnormal samples caused by natural noise. Because deep reinforcement learning is sequential and self-learning, this thesis employs reinforcement learning to solve the task of time series anomaly detection. Specifically, the time series anomaly detection task is modeled as a reinforcement learning problem and expressed as a Markov decision process, and a policy-based reinforcement learning algorithm is then used to learn a detection policy. Compared with previous time series anomaly detection models, the proposed policy-based detector not only achieves better detection performance when trained and tested on the same dataset, but also performs better when the training and testing sets differ.

Second, in terms of adversarial attacks, this thesis considers how to conduct a black-box attack against deep reinforcement learning in order to evaluate its vulnerability effectively. Since reinforcement learning is sequential and the reward signal provides feedback on the attack in the black-box setting, this thesis proposes an attack framework that is itself based on reinforcement learning. Furthermore, to generate semantically natural adversarial examples, a generative adversarial network and three auxiliary losses are employed to implement the reinforcement learning-based black-box attack. Experimental results on multiple Atari environments show that the black-box adversarial samples generated by this framework achieve stronger attack performance and are semantically natural.

Finally, in terms of adversarial defenses, this thesis considers how reinforcement learning systems can defend against perturbed rewards. Specifically, the reward is regarded as a label in supervised learning, and noisy label learning is used to construct a reward recovery model. The model takes the state-action pair as input, takes the perturbed reward as the label, adopts the generalized cross entropy loss, and outputs the recovered reward. The deep reinforcement learning agent then uses the recovered rewards to learn its policy. Qualitative and quantitative experimental results on multiple Atari environments show that the policy learned from the recovered rewards obtains higher reward scores, that is, it provides stronger defense than other defense strategies.

In summary, this thesis aims to build a reliable deep reinforcement learning system based on robustness and security. By handling and investigating natural noise and human-made perturbations, deep reinforcement learning gains stronger robustness and security guarantees, which makes it better suited to real-world applications.
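
The abstract names the generalized cross entropy loss as the training objective of the reward recovery model but does not give its form. As a rough illustration only, the sketch below uses the commonly cited noisy-label formulation, L_q = (1 - p_y^q) / q, where p_y is the predicted probability of the (possibly perturbed) label. Everything concrete here is an assumption rather than the thesis's implementation: the three discrete reward classes, the example logits, the function names, and the value q = 0.7.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def generalized_cross_entropy(logits, label, q=0.7):
    """Generalized cross entropy: (1 - p_label^q) / q, with q in (0, 1].

    As q approaches 0 the loss approaches ordinary cross entropy (-log p_label);
    at q = 1 it equals 1 - p_label, which is less sensitive to wrong labels.
    """
    if not 0.0 < q <= 1.0:
        raise ValueError("q must be in (0, 1]")
    p_label = softmax(logits)[label]
    return (1.0 - p_label ** q) / q


# Hypothetical setup: the recovery model scores three reward classes
# (for example -1, 0, +1) for one state-action pair, and the observed,
# possibly perturbed, reward corresponds to class index 0.
example_logits = [2.0, 0.5, -1.0]
loss = generalized_cross_entropy(example_logits, label=0, q=0.7)
print(f"GCE loss: {loss:.4f}")
```

The parameter q trades off between the two extremes: smaller values behave more like standard cross entropy, larger values dampen the influence of labels the model considers unlikely, which is the property that makes this family of losses attractive when some reward labels have been perturbed.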
Keywords/Search Tags:Deep Reinforcement Learning, Robustness, Security, Time Series Anomaly Detection, Adversarial Attacks, Adversarial Defenses