Artificial intelligence technologies, represented by deep learning, have been widely deployed in the real world, and their security has received increasing attention. Recent research has shown that deep learning models can be misled into misclassification by adversarial examples, which makes the evaluation of model robustness through adversarial attacks a central topic in deep learning security. Transfer-based adversarial attacks have attracted particular interest because they generate adversarial examples using only a local substitute model, without any feedback from the target model. However, existing methods commonly achieve limited transferability to other target models: they overfit to the specific architecture or feature representations of the source model, and the visual perceptual quality of adversarial examples conflicts with attack performance. This paper studies transferable adversarial attack techniques based on the observation that intermediate features of deep models transfer better across architectures. The main contributions are as follows:

1. To address the insufficient transferability of existing adversarial attack algorithms, this paper explores the relationship between model generality and transferability and proposes the Intermediate-layer Attention-based Adversarial Attack (IAA) algorithm, which uses the attention mechanism of deep models to guide the attack toward perturbing key features that are independent of any specific model structure. Extensive experiments on the ImageNet validation set show that IAA significantly improves the black-box transferability of the original attack algorithm while maintaining its white-box attack performance. With VGG16 as the source model, IAA improves transferability by an average of 7.25% over the state-of-the-art feature-level adversarial attack algorithm FIA.
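To make the idea of contribution 1 concrete, the following is a minimal PyTorch sketch of an attention-guided intermediate-layer attack. It is an illustration under stated assumptions, not the paper's exact IAA procedure: the choice of layer, the gradient-based attention estimate, and the loss form are assumptions in the spirit of feature-level attacks such as FIA.

    import torch
    import torchvision.models as models

    model = models.vgg16(weights="IMAGENET1K_V1").eval()
    feats = {}
    # Hypothetical mid-level layer choice; the forward hook captures its output.
    model.features[15].register_forward_hook(lambda m, i, o: feats.update(z=o))

    def attention_weights(x, label):
        # Gradient of the true-class logit w.r.t. the intermediate feature map,
        # used here as a structure-agnostic importance (attention) estimate.
        # Assumes a single-image batch.
        logit = model(x.requires_grad_(True))[0, label]
        return torch.autograd.grad(logit, feats["z"])[0].detach()

    def iaa_attack(x, label, eps=16/255, alpha=2/255, steps=10):
        w = attention_weights(x.clone(), label)
        x_adv = x.clone()
        for _ in range(steps):
            x_adv.requires_grad_(True)
            model(x_adv)  # populates feats["z"] for the current iterate
            # Suppress the attention-weighted key features of the true class.
            loss = (w * feats["z"]).sum()
            grad = torch.autograd.grad(loss, x_adv)[0]
            x_adv = x_adv.detach() - alpha * grad.sign()
            x_adv = (x + (x_adv - x).clamp(-eps, eps)).clamp(0, 1)
        return x_adv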
2. To address the limited transferability gains of existing model-ensemble attacks, this paper investigates the positive impact of data augmentation and ensemble attack strategies on feature diversity and proposes the feature Diversity Ensemble Intermediate-layer Adversarial Attack (DEIAA) algorithm. Data augmentation enhances the diversity of the input samples, which encourages the model to extract key features that differ across sub-layers; exploiting this sub-layer feature diversity, a multi-scale ensemble attack is then conducted at several depths of the model, further alleviating overfitting to the source model by extracting integrated, differentiated features (a code sketch of this ensemble objective follows contribution 3). Experiments on Inception_V4 show that the multi-layer ensemble strategy DEIAA achieves an average transferability of 82.86%, 7.08 percentage points higher than the single-layer attack IAA. The advantage is even more pronounced against adversarially trained defense models: on the strongest defense model, DEIAA reaches an average transferability of 62.16%, higher than comparable baselines.

3. To resolve the conflict between the visual quality and attack performance of adversarial examples, this paper studies how spatial geometric perturbations work in the sample space. Building on the intermediate-layer transferability findings above, it transfers the effect of pixel-position spatial transformations from the sample space to intermediate-layer features and proposes stFA, a spatial-transformation-based intermediate-layer transferable adversarial attack algorithm, which uses the intermediate-layer feature distance to constrain the geometric perturbation intensity. While driving the model to misclassify the adversarial examples as a given target label, stFA encourages the intermediate-layer features of the original samples to move away from the original label. Extensive experiments show that stFA effectively improves the black-box transferability of attacks while preserving the visual imperceptibility advantage of geometric adversarial perturbations. For example, with VGG19 as the source model, stFA improves black-box transferability by an average of 9.50% over stAdv while remaining imperceptible.
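As referenced in contribution 2, the sketch below shows one plausible form of a diversity-driven multi-layer ensemble objective, reusing the forward-hook pattern from the previous sketch but registered at several depths. The augmentation pipeline, the set of hooked layers, and the per-layer weighting are assumptions for illustration, not the paper's exact DEIAA algorithm.

    import torchvision.transforms as T

    # Assumed augmentations; both are differentiable on image tensors.
    augment = T.Compose([T.RandomResizedCrop(224, scale=(0.8, 1.0)),
                         T.RandomHorizontalFlip()])

    def deiaa_loss(model, feats, layer_weights, x_adv, n_aug=4):
        # feats: layer name -> feature map, filled by forward hooks at several depths
        # layer_weights: layer name -> precomputed attention weights for that depth
        loss = 0.0
        for _ in range(n_aug):
            model(augment(x_adv))                  # diversified input view
            for name, w in layer_weights.items():  # ensemble over depths
                loss = loss + (w * feats[name]).sum()
        return loss / n_aug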
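Contribution 3 combines a geometric flow-field perturbation in the style of stAdv with an intermediate-layer feature objective. The sketch below shows one plausible form of that combination, assuming a bilinear warp, an MSE feature distance, and a weighting knob beta; f_clean is the clean image's feature map at the hooked layer, computed once in advance. The exact loss used by stFA may differ.

    import torch
    import torch.nn.functional as F

    def warp(x, flow):
        # Apply a per-pixel displacement field via bilinear sampling
        # (the stAdv-style spatial transformation).
        n, _, h, w = x.shape
        ys, xs = torch.meshgrid(torch.linspace(-1, 1, h, device=x.device),
                                torch.linspace(-1, 1, w, device=x.device),
                                indexing="ij")
        base = torch.stack((xs, ys), -1).unsqueeze(0).expand(n, h, w, 2)
        return F.grid_sample(x, base + flow, align_corners=True)

    def stfa_loss(model, feats, x, flow, target, f_clean, beta=1.0):
        logits = model(warp(x, flow))  # forward pass also fills feats["z"]
        # Targeted term: push the prediction toward the given target label.
        ce = F.cross_entropy(logits, target)
        # Feature term: move the intermediate representation away from the
        # clean (original-label) features; its weight beta doubles as the
        # handle that keeps the geometric perturbation intensity in check.
        feat_dist = F.mse_loss(feats["z"], f_clean)
        return ce - beta * feat_dist

The flow field would then be optimized to minimize this loss, for example with Adam starting from a zero displacement.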