
Research On Adversarial Robustness Of Deep Learning Through The Lens Of Data Distribution

Posted on: 2023-09-25
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y G Zhang
GTID: 1528306902454494
Subject: Information and Communication Engineering
Abstract/Summary:
Deep neural networks (DNNs) have made remarkable achievements in many fields. With performance comparable to that of humans, DNNs are beginning to be widely deployed in various systems. This wide application in production and daily life has brought new challenges to the development of DNNs. In some security-sensitive areas, DNNs are expected to be robust because wrong decisions may directly lead to the loss of life or property. Evaluating the robustness of DNNs has therefore become an important field. This line of work led to the discovery of adversarial examples: DNNs show surprising vulnerability to subtle perturbations injected by adversaries. Specifically, the output of the model changes drastically when imperceptible adversarial perturbations are added to natural samples. Such extreme instability presents both exciting opportunities and great challenges for the development of DNNs.

To address the challenges posed by adversarial examples, extensive research has been carried out from three perspectives. To evaluate a model’s robustness, research on adversarial attacks focuses on how to generate more powerful adversarial examples. Although there is much valuable work in this area, existing methods either rely on overly strong assumptions or perform poorly under weak assumptions. To defend against adversarial attacks, research on adversarial defense focuses on improving the adversarial robustness of the model. Although great progress has been made, the performance of existing defenses is still unsatisfactory. To uncover the differences between human cognitive systems and deep learning models, there has also been a great deal of exploration into why adversarial examples exist. However, the existing hypotheses are often difficult to validate and offer little guidance for the development of adversarial attacks and defenses.

Building on previous work, this thesis explores the above three aspects of adversarial examples from a data distribution perspective. It proposes the adversarial region hypothesis and explains the existence of adversarial examples from the perspective of causal reasoning. The insight behind the adversarial region is that the subspace in which the adversarial perturbation lies is the fastest direction out of the tangent space of the local data manifold, so all samples in this subspace (the adversarial region) are potential threats to DNNs. Causal inference can exploit the properties of data distributions by modeling the data generation process, so it can be used to explore the properties of a given data distribution. The contributions of this thesis can be summarized as follows:

· The adversarial region hypothesis proposed in this thesis provides a new perspective for understanding adversarial examples and extends existing hypotheses. Based on the adversarial region, this thesis proposes the principal component adversarial example, which is the first adversarial example generation method that relies only on limited information about the target model. This generation method supports the plausibility of the adversarial region hypothesis.

· To address the low efficiency of existing black-box attacks, this thesis proposes using an optimal distribution to improve their efficiency, inspired by the adversarial region hypothesis. The thesis proves the existence of an optimal distribution for black-box attacks, which recasts the problem of improving black-box attack efficiency as the problem of approximating that distribution. This provides a new direction for the development of black-box attacks. To approximate the optimal distribution, the thesis proposes using a learnable distribution for black-box attacks, under the assumption that the training data of the target model can be accessed. The thesis then explores how to approximate the optimal distribution when this assumption does not hold: it defines and solves an efficient black-box attack problem and proposes a dual-path distillation framework. This framework approximates the optimal distribution without any additional information, relying only on thorough use of the feedback from the target model. Extensive experiments show that the framework greatly improves the performance of black-box attacks.

· To further understand and mitigate adversarial vulnerability, this thesis establishes a connection between causal inference and adversarial examples. From the perspective of causal inference, the thesis shows that adversarial vulnerability originates from the model’s over-reliance on spurious correlations between non-semantic information and labels. The thesis further demonstrates that the difference between the natural and the adversarial distributions results from the conditional association between non-semantic variables and labels. To eliminate the difference between these two distributions and improve adversarial robustness, the thesis proposes a causality-inspired adversarial distribution alignment method. Extensive experiments verify the effectiveness of the proposed method. This exploration provides a new way of understanding adversarial vulnerability from a causal viewpoint.
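The geometric intuition behind the adversarial region (perturbations that leave the tangent space of the local data manifold) can be illustrated with a minimal NumPy sketch. This is not the thesis's principal component adversarial example algorithm, which the abstract does not specify. It assumes, purely for illustration, that the local tangent space can be estimated by PCA over a set of nearby samples and that the low-variance principal directions approximate the off-manifold subspace. The function names `off_manifold_directions` and `perturb_off_manifold`, and the parameters `tangent_dim` and `epsilon`, are hypothetical.

```python
import numpy as np


def off_manifold_directions(neighbors, tangent_dim):
    """Estimate directions orthogonal to the local tangent space.

    Assumption (illustrative only): the top `tangent_dim` principal
    components of nearby samples span the tangent space, and the
    remaining components span the off-manifold subspace.
    """
    centered = neighbors - neighbors.mean(axis=0)
    # Rows of vt are orthonormal principal directions, ordered by
    # decreasing singular value (i.e. decreasing variance).
    _, _, vt = np.linalg.svd(centered, full_matrices=True)
    return vt[tangent_dim:]


def perturb_off_manifold(x, directions, epsilon):
    """Move x by an L2 step of size epsilon along the lowest-variance direction."""
    return x + epsilon * directions[-1]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy data: points near the plane z = 0 embedded in 3-D space.
    points = rng.normal(size=(200, 3))
    points[:, 2] *= 1e-3
    dirs = off_manifold_directions(points, tangent_dim=2)
    x_adv = perturb_off_manifold(points[0], dirs, epsilon=0.1)
    print("estimated off-manifold direction:", dirs[0])
    print("perturbation norm:", np.linalg.norm(x_adv - points[0]))
```

On the toy data the estimated direction is close to the z-axis, which is the only direction leaving the plane on which the samples lie. Whether such a direction actually changes a given model's prediction depends on the model and is not shown here.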
Keywords/Search Tags: Deep Learning, Adversarial Example, Adversarial Attack, Adversarial Defense, Causal Inference