
A Study Of Transfer-based Black-box Attacks

Posted on: 2022-11-06    Degree: Master    Type: Thesis
Country: China    Candidate: Q S Zhang    Full Text: PDF
GTID: 2518306764476304    Subject: Software engineering
Abstract/Summary:
Deep neural networks (DNNs) have achieved remarkable success in image classification in recent years. Nonetheless, advances in adversarial machine learning mean DNNs are no longer reliable. By adding a well-designed perturbation to a benign image (a.k.a. an adversarial attack), the resulting adversarial examples can easily fool state-of-the-art DNNs, which inevitably raises concerns about the stability of deployed models. Therefore, exposing as many "blind spots" of DNNs as possible is a top priority.

Among all threat models, the black-box attack is the most challenging and the most practical, since deployed DNNs are usually opaque to unauthorized users for security reasons and the adversary therefore cannot rely on knowledge of the victim model. Resorting to the transferability of adversarial examples is thus a common practice: adversarial examples crafted via a known white-box model (a.k.a. the substitute model) are also dangerous to other, unknown black-box models, which makes transfer-based black-box attacks possible. This thesis further enhances the transferability of adversarial examples from three perspectives:

1. Motivated by the nature of the features extracted by DNNs, this thesis proposes a Patch-wise Iterative Fast Gradient Sign Method (PI-FGSM), a black-box non-targeted attack against mainstream normally trained and defense models, which differs from existing attack methods that manipulate pixel-wise noise. Specifically, this thesis introduces an amplification factor to the step size in each iteration, and the part of a pixel's overall gradient that overflows the ε-constraint is assigned to its surrounding regions by a project kernel (sketched after this list).

2. Motivated by the weakness of the sign method, which inevitably introduces a deviation during updating, this thesis proposes a novel Staircase Sign Method (S²M) to alleviate this issue and thus boost attacks. Technically, the method heuristically divides the gradient sign into several segments according to the values of the gradient units, and then assigns each segment a staircase weight for better crafting of the adversarial perturbation (sketched after this list).

3. Motivated by the fact that recent transfer-based attacks are less practical (i.e., they assume the substitute model is trained on the same domain as the target model), this thesis builds a more practical black-box threat model to overcome this limitation. Specifically, with only knowledge of the ImageNet domain, this thesis proposes a Beyond ImageNet Attack (BIA) to investigate transferability towards black-box domains. The framework leverages a generative model to learn an adversarial function that disrupts the low-level features of input images (sketched after this list). On top of it, this thesis further proposes two variants that narrow the gap between the source and target domains from the data and model perspectives, respectively.

Extensive experiments demonstrate the effectiveness of the proposed methods. I hope these approaches can serve as baselines that help generate more transferable adversarial examples and evaluate the robustness of various models.
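The following is a minimal PyTorch sketch of the patch-wise iterative idea in item 1, assuming an L∞ budget eps, an amplification factor beta, a projection factor gamma, and a uniform project kernel; the function names, defaults, and clipping details are illustrative assumptions, not the thesis' exact algorithm.

```python
import torch
import torch.nn.functional as F

def project_kernel(kernel_size=3, channels=3):
    # Uniform depthwise kernel that spreads noise to a pixel's neighbours;
    # the centre is zeroed so only the surrounding region receives overflow.
    kern = torch.ones(kernel_size, kernel_size)
    kern[kernel_size // 2, kernel_size // 2] = 0.0
    kern /= kern.sum()
    return kern.repeat(channels, 1, 1).unsqueeze(1)  # (C, 1, k, k)

def pi_fgsm(model, x, y, eps=16 / 255, steps=10, beta=10.0, gamma=16 / 255, k=3):
    alpha = beta * eps / steps                       # amplified step size
    kern = project_kernel(k, x.size(1)).to(x.device)
    adv = x.clone()
    a = torch.zeros_like(x)                          # accumulated amplified noise
    for _ in range(steps):
        adv = adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(adv), y)
        grad = torch.autograd.grad(loss, adv)[0]
        a = a + alpha * grad.sign()
        # The part of the accumulated noise that overflows the eps-ball is
        # redistributed to surrounding pixels through the project kernel.
        overflow = (a.abs() - eps).clamp(min=0) * a.sign()
        proj = gamma * F.conv2d(overflow, kern, padding=k // 2,
                                groups=x.size(1)).sign()
        adv = adv.detach() + alpha * grad.sign() + proj
        adv = torch.max(torch.min(adv, x + eps), x - eps).clamp(0.0, 1.0)
    return adv
```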
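A minimal sketch of the staircase sign idea in item 2: gradient units are grouped into segments by magnitude percentile, and each segment receives an increasing weight instead of the uniform ±1 of the plain sign. The number of segments and the particular weighting scheme below are assumptions for illustration, not necessarily the thesis' exact choice.

```python
import torch

def staircase_sign(grad, k=4):
    # Group gradient units into k segments by magnitude percentile and
    # assign each segment a staircase weight (mean weight roughly 1,
    # so the overall scale matches the plain sign).
    mag = grad.abs()
    flat = mag.flatten(1)
    qs = torch.quantile(flat,
                        torch.linspace(0, 1, k + 1, device=grad.device),
                        dim=1)                        # (k+1, batch)
    weights = torch.zeros_like(grad)
    for i in range(k):
        lo = qs[i].view(-1, *[1] * (grad.dim() - 1))
        hi = qs[i + 1].view(-1, *[1] * (grad.dim() - 1))
        mask = (mag >= lo) & (mag <= hi)
        weights = torch.where(mask,
                              torch.full_like(grad, (2 * i + 1) / k),
                              weights)
    return grad.sign() * weights
```

This weighted sign can replace grad.sign() in an iterative attack such as the one sketched above, so larger gradient components move the input further than smaller ones.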
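For item 3, a hedged sketch of a generative, feature-disrupting training objective: a generator produces a bounded perturbation, and the loss pushes a shallow-layer feature of the perturbed image away from that of the clean image. The chosen layer, bound, and cosine distance are assumptions and not necessarily the BIA formulation.

```python
import torch
import torch.nn.functional as F

def feature_disruption_loss(feat_extractor, generator, x, eps=10 / 255):
    # Generator output is squashed to a bounded perturbation (assumption).
    delta = generator(x).tanh() * eps
    adv = (x + delta).clamp(0.0, 1.0)
    # Low-level features of the substitute model for clean and perturbed input.
    f_clean = feat_extractor(x).flatten(1)
    f_adv = feat_extractor(adv).flatten(1)
    # Minimising the cosine similarity disrupts the shared low-level
    # features, which is what is expected to transfer across domains.
    return F.cosine_similarity(f_adv, f_clean, dim=1).mean()
```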
Keywords/Search Tags: Adversarial example, Patch-wise perturbation, Staircase sign method, Practical attack