
Transfer-based Image Adversarial Attacks

Posted on: 2024-08-10    Degree: Master    Type: Thesis
Country: China    Candidate: S Y Feng    Full Text: PDF
GTID: 2568307079466034    Subject: Electronic information

Abstract/Summary:
The development of adversarial machine learning has increasingly highlighted the security issues of DNNs: by adding a specially designed perturbation (an adversarial attack) to a benign image that a DNN recognizes correctly, the resulting adversarial example looks unchanged to the human eye yet leads the DNN to erroneous output.

Black-box attacks are the most realistic setting for adversarial attacks, and exploiting the transferability of adversarial examples is one of the easiest ways to implement them. Transferability means that adversarial examples generated on a known white-box model (the substitute model) can also threaten other, unknown black-box models; generating adversarial examples on a substitute model and using them to attack a victim model is therefore known as a black-box transfer-based attack. Transfer-based attacks, however, are limited by overfitting to the substitute model: the attack success rate on the substitute model approaches 100%, while the results on the target models remain poor. The thesis therefore further improves the transferability of adversarial attacks through the following two methods.

Motivated by the similarity of the regions of interest that different networks share for a given class, the thesis proposes an Attack-guided CAM Attack (AGCA). This method takes the adversarial examples generated by existing classic attack methods as the starting perturbation, projects the Grad-CAM of these existing adversarial examples onto the Grad-CAM of the new adversarial examples, and iteratively updates the new examples, thereby fine-tuning the original perturbation direction and improving transferability (a sketch of this idea follows the abstract).

To extract features from DNNs more effectively for adversarial attacks, the thesis proposes a Gradient-Aggregated Attention Attack (GAAA). This method first diversifies the original image so that a single substitute model imitates several models processing the image, adds an attention mechanism to the substitute model to extract features better, and finally computes an aggregated gradient with which the adversarial examples are iteratively updated, again improving transferability (also sketched below).

In the experiments, attack success rate is used to demonstrate the effectiveness of both methods. The methods proposed in the thesis provide new directions for subsequent research on improving the transferability of adversarial attacks and also help evaluate the robustness of models used in image classification tasks.
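For reference, the following is a minimal PyTorch sketch of a classic iterative transfer attack with momentum (in the style of MI-FGSM); it is one example of the "existing classic adversarial attack methods" the abstract mentions as perturbation starting points, not the thesis's own method. The hyper-parameters (eps, steps, mu) are illustrative defaults.

import torch
import torch.nn.functional as F

def mi_fgsm(model, x, y, eps=8/255, steps=10, mu=1.0):
    """Craft adversarial examples on a white-box substitute model;
    the results are then transferred to unseen victim models."""
    alpha = eps / steps                                # per-step step size
    x_adv, g = x.clone().detach(), torch.zeros_like(x)
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        # Momentum smooths the update direction across iterations,
        # a standard trick for improving cross-model transferability.
        g = mu * g + grad / grad.abs().mean(dim=(1, 2, 3), keepdim=True)
        x_adv = x_adv.detach() + alpha * g.sign()
        # Project back into the eps-ball around x and the valid pixel range.
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()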
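The next sketch illustrates, under stated assumptions, the AGCA idea from the abstract: start from an existing adversarial example x_ref and update a new example so that its Grad-CAM map stays close to that of x_ref while the classification loss is driven up. The Grad-CAM helper, the choice of target layer, the MSE alignment term, and the weight lam are assumptions made for illustration, not the thesis's exact formulation.

import torch
import torch.nn.functional as F

def grad_cam(model, layer, x, y):
    """Differentiable Grad-CAM heat map for class y at a conv layer."""
    feats = {}
    h = layer.register_forward_hook(lambda m, i, o: feats.update(a=o))
    logits = model(x)
    h.remove()
    score = logits.gather(1, y.view(-1, 1)).sum()
    grads, = torch.autograd.grad(score, feats["a"], create_graph=True)
    w = grads.mean(dim=(2, 3), keepdim=True)           # channel weights
    cam = F.relu((w * feats["a"]).sum(dim=1))          # B x H x W map
    return cam / (cam.amax(dim=(1, 2), keepdim=True) + 1e-8)

def agca(model, layer, x, y, x_ref, eps=8/255, steps=10, lam=1.0):
    """Fine-tune an existing adversarial example x_ref (AGCA sketch)."""
    cam_ref = grad_cam(model, layer, x_ref, y).detach()  # reference map
    x_adv, alpha = x_ref.clone().detach(), eps / steps   # start from x_ref
    for _ in range(steps):
        x_adv.requires_grad_(True)
        cls_loss = F.cross_entropy(model(x_adv), y)
        cam_loss = F.mse_loss(grad_cam(model, layer, x_adv, y), cam_ref)
        # Ascend on cls_loss while keeping the new example's Grad-CAM
        # close to the reference map ("projecting" the attention).
        grad, = torch.autograd.grad(cls_loss - lam * cam_loss, x_adv)
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()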
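Finally, a hedged sketch of the GAAA idea: diversify the input (here with a random resize-and-pad transform, as in diverse-input attacks) so that one substitute model imitates several, then aggregate the gradients over the transformed copies before each update. The attention mechanism the abstract adds inside the substitute model is omitted for brevity; the transform, square-input assumption, and hyper-parameters are all illustrative.

import torch
import torch.nn.functional as F

def diversify(x, low=0.9):
    """Randomly shrink a (square) image and pad it back to full size."""
    b, c, h, w = x.shape
    s = int(h * torch.empty(1).uniform_(low, 1.0).item())
    x_small = F.interpolate(x, size=(s, s), mode="bilinear",
                            align_corners=False)
    top = torch.randint(0, h - s + 1, (1,)).item()
    left = torch.randint(0, w - s + 1, (1,)).item()
    return F.pad(x_small, (left, w - s - left, top, h - s - top))

def gaaa(model, x, y, eps=8/255, steps=10, copies=5):
    """Aggregate gradients over diversified copies (GAAA sketch)."""
    x_adv, alpha = x.clone().detach(), eps / steps
    for _ in range(steps):
        x_adv.requires_grad_(True)
        # Each diversified copy yields its own gradient; averaging them
        # imitates an ensemble of models with a single substitute model.
        agg = torch.zeros_like(x)
        for _ in range(copies):
            loss = F.cross_entropy(model(diversify(x_adv)), y)
            agg = agg + torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * (agg / copies).sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()

Averaging over diversified copies acts as an implicit model ensemble, which is what reduces overfitting to the single substitute model.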
Keywords/Search Tags: Adversarial Examples, Black-box Transfer-based Attacks, Feature Extraction, Image Classification