Artificial intelligence technologies, represented by deep learning, have been widely deployed in the real world, and their security has received increasing attention. Recent research has shown that deep learning models can be misled into misclassification by adversarial examples, which makes the evaluation of model robustness through adversarial attacks a central topic in deep learning security. Transfer-based adversarial attacks have attracted particular interest because they generate adversarial examples using only a local substitute model, without any feedback from the target model. However, existing methods commonly achieve limited transferability to other target models: they overfit to the specific architecture or feature representations of the source model, and the visual perceptual quality of adversarial examples conflicts with attack performance. This paper studies transferable adversarial attack techniques based on the observation that intermediate features of deep models transfer better across architectures. The main contributions are as follows:

1. To address the insufficient transferability of existing adversarial attack algorithms, this paper explores the relationship between model generality and transferability and proposes the Intermediate-layer Attention-based Adversarial Attack (IAA) algorithm, which uses the attention mechanism of deep models to guide the attack toward perturbing key features that are independent of any specific model structure. Extensive experiments on the ImageNet validation set show that IAA significantly improves the black-box transferability of the original attack algorithm while maintaining its white-box attack performance. With VGG16 as the source model, IAA improves transferability by an average of 7.25% over the state-of-the-art feature-level adversarial attack algorithm FIA.
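To make the idea of contribution 1 concrete, the following is a minimal PyTorch sketch of an attention-guided intermediate-layer attack. It is an illustration under stated assumptions, not the paper's exact IAA procedure: the choice of layer, the gradient-based attention estimate, and the loss form are assumptions in the spirit of feature-level attacks such as FIA.

    import torch
    import torchvision.models as models

    model = models.vgg16(weights="IMAGENET1K_V1").eval()
    feats = {}
    # Hypothetical mid-level layer choice; the forward hook captures its output.
    model.features[15].register_forward_hook(lambda m, i, o: feats.update(z=o))

    def attention_weights(x, label):
        # Gradient of the true-class logit w.r.t. the intermediate feature map,
        # used here as a structure-agnostic importance (attention) estimate.
        # Assumes a single-image batch.
        logit = model(x.requires_grad_(True))[0, label]
        return torch.autograd.grad(logit, feats["z"])[0].detach()

    def iaa_attack(x, label, eps=16/255, alpha=2/255, steps=10):
        w = attention_weights(x.clone(), label)
        x_adv = x.clone()
        for _ in range(steps):
            x_adv.requires_grad_(True)
            model(x_adv)  # populates feats["z"] for the current iterate
            # Suppress the attention-weighted key features of the true class.
            loss = (w * feats["z"]).sum()
            grad = torch.autograd.grad(loss, x_adv)[0]
            x_adv = x_adv.detach() - alpha * grad.sign()
            x_adv = (x + (x_adv - x).clamp(-eps, eps)).clamp(0, 1)
        return x_adv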
2. To address the limited transferability gains of existing model-ensemble attacks, this paper investigates the positive impact of data augmentation and ensemble attack strategies on feature diversity and proposes the feature Diversity Ensemble Intermediate-layer Adversarial Attack (DEIAA) algorithm. Data augmentation enhances the diversity of the input samples, which encourages the model to extract key features that differ across sub-layers; exploiting this sub-layer feature diversity, a multi-scale ensemble attack is then conducted at several depths of the model, further alleviating overfitting to the source model by extracting integrated, differentiated features (a code sketch of this ensemble objective follows contribution 3). Experiments on Inception_V4 show that the multi-layer ensemble strategy DEIAA achieves an average transferability of 82.86%, 7.08 percentage points higher than the single-layer attack IAA. The advantage is even more pronounced against adversarially trained defense models: on the strongest defense model, DEIAA reaches an average transferability of 62.16%, higher than comparable baselines.

3. To resolve the conflict between the visual quality and attack performance of adversarial examples, this paper studies how spatial geometric perturbations work in the sample space. Building on the intermediate-layer transferability findings above, it transfers the effect of pixel-position spatial transformations from the sample space to intermediate-layer features and proposes stFA, a spatial-transformation-based intermediate-layer transferable adversarial attack algorithm, which uses the intermediate-layer feature distance to constrain the geometric perturbation intensity. While driving the model to misclassify the adversarial examples as a given target label, stFA encourages the intermediate-layer features of the original samples to move away from the original label. Extensive experiments show that stFA effectively improves the black-box transferability of attacks while preserving the visual imperceptibility advantage of geometric adversarial perturbations. For example, with VGG19 as the source model, stFA improves black-box transferability by an average of 9.50% over stAdv while remaining imperceptible.
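As referenced in contribution 2, the sketch below shows one plausible form of a diversity-driven multi-layer ensemble objective, reusing the forward-hook pattern from the previous sketch but registered at several depths. The augmentation pipeline, the set of hooked layers, and the per-layer weighting are assumptions for illustration, not the paper's exact DEIAA algorithm.

    import torchvision.transforms as T

    # Assumed augmentations; both are differentiable on image tensors.
    augment = T.Compose([T.RandomResizedCrop(224, scale=(0.8, 1.0)),
                         T.RandomHorizontalFlip()])

    def deiaa_loss(model, feats, layer_weights, x_adv, n_aug=4):
        # feats: layer name -> feature map, filled by forward hooks at several depths
        # layer_weights: layer name -> precomputed attention weights for that depth
        loss = 0.0
        for _ in range(n_aug):
            model(augment(x_adv))                  # diversified input view
            for name, w in layer_weights.items():  # ensemble over depths
                loss = loss + (w * feats[name]).sum()
        return loss / n_aug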
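Contribution 3 combines a geometric flow-field perturbation in the style of stAdv with an intermediate-layer feature objective. The sketch below shows one plausible form of that combination, assuming a bilinear warp, an MSE feature distance, and a weighting knob beta; f_clean is the clean image's feature map at the hooked layer, computed once in advance. The exact loss used by stFA may differ.

    import torch
    import torch.nn.functional as F

    def warp(x, flow):
        # Apply a per-pixel displacement field via bilinear sampling
        # (the stAdv-style spatial transformation).
        n, _, h, w = x.shape
        ys, xs = torch.meshgrid(torch.linspace(-1, 1, h, device=x.device),
                                torch.linspace(-1, 1, w, device=x.device),
                                indexing="ij")
        base = torch.stack((xs, ys), -1).unsqueeze(0).expand(n, h, w, 2)
        return F.grid_sample(x, base + flow, align_corners=True)

    def stfa_loss(model, feats, x, flow, target, f_clean, beta=1.0):
        logits = model(warp(x, flow))  # forward pass also fills feats["z"]
        # Targeted term: push the prediction toward the given target label.
        ce = F.cross_entropy(logits, target)
        # Feature term: move the intermediate representation away from the
        # clean (original-label) features; its weight beta doubles as the
        # handle that keeps the geometric perturbation intensity in check.
        feat_dist = F.mse_loss(feats["z"], f_clean)
        return ce - beta * feat_dist

The flow field would then be optimized to minimize this loss, for example with Adam starting from a zero displacement.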