With the rapid development of deep learning and the improvement of hardware computing capability, deep neural networks (DNNs) have gradually become the standard technology for computer vision tasks. However, recent works find that DNNs are surprisingly vulnerable to adversarial examples, which are carefully crafted images obtained by adding small perturbations to original images. Although imperceptible to the naked eye, these small perturbations can make even the most advanced DNNs misclassify. The existence of adversarial examples poses great obstacles to the practical application of DNNs, especially in security-sensitive areas such as autonomous driving, security monitoring, and sensitive content detection on the Internet. Research on image adversarial attacks aims to propose more advanced attack methods to fool DNNs, which can be exploited to evaluate model vulnerability and provide guidance for building more robust deep neural networks. Although many recent works have made great progress in adversarial attacks, adversarial examples generated by existing attacks still show poor transferability to unknown black-box models. Besides, through defensive mechanisms such as adversarial training, DNNs can learn more robust features and largely resist existing conventional adversarial attacks. To cope with the above difficulties, this dissertation studies adversarial attack techniques for image classification. The details are as follows:

(1) A discriminative adversarial attack method based on model attention is proposed. Most existing feature attack methods perform indiscriminate attacks in the feature space, ignoring the varying effects of different regions on the classification result. Therefore, the first research approach of this dissertation focuses on how to exploit model attention to design an adversarial attack method that can measure the importance of different features. In detail, we dynamically weight the high-level features according to the channel attention mechanism and disrupt them accordingly. For low-level features that lack semantics, high-level semantics are introduced as spatial attention guidance so that low-level perturbations concentrate on the most discriminative regions. The adversarial examples generated in this way achieve better attack performance under a given perturbation budget.

(2) A refined adversarial attack method based on erasing attention is proposed. Since attention heatmaps may vary significantly across different models, overfitting inevitably occurs if the attack relies only on the attention of the local proxy model. Therefore, the second research approach of this dissertation is to design a more universal refined adversarial attack method by introducing an erasing attention mechanism. Specifically, we first obtain the attention heatmap of the highest feature layer and generate a soft attention mask to erase from the original image. Then, the erased image is fed into the proxy model to extract new clues for classification. Over several iterations, the proxy model is naturally driven to discover more discriminative regions that support the prediction of the label class, and the obtained importance weights can be used to guide the adversarial attack. Through erasing attention, the attention gap between the proxy model and the target model is narrowed, and thus the transferability of the generated adversarial examples is improved.
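To make the erasing-attention mechanism of contribution (2) more concrete, the following minimal sketch shows one possible realization in PyTorch: a Grad-CAM-style heatmap of the highest feature layer is turned into a soft mask, the attended region is erased, and the proxy model is queried again so that additional discriminative regions emerge. The backbone, hooked layer, and number of erasing steps are illustrative assumptions, not the exact configuration used in the dissertation.

```python
# Minimal sketch of the erasing-attention idea (contribution (2)), assuming a
# PyTorch surrogate classifier; backbone, hooked layer, and step count are
# illustrative choices rather than the dissertation's settings.
import torch
import torch.nn.functional as F
import torchvision.models as models

surrogate = models.resnet50(weights="IMAGENET1K_V1").eval()
feats = {}
surrogate.layer4.register_forward_hook(lambda m, i, o: feats.update(out=o))

def attention_map(image, label):
    """Grad-CAM-style heatmap of the highest feature layer, normalized to [0, 1]."""
    logits = surrogate(image)
    fmap = feats["out"]                                    # (1, C, H, W)
    grads = torch.autograd.grad(logits[0, label], fmap)[0]
    weights = grads.mean(dim=(2, 3), keepdim=True)         # channel-wise importance
    cam = F.relu((weights * fmap).sum(dim=1, keepdim=True))
    cam = F.interpolate(cam, size=image.shape[-2:], mode="bilinear", align_corners=False)
    return (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)

def erased_importance(image, label, steps=3):
    """Iteratively erase the most attended regions so the surrogate is driven to
    reveal further discriminative evidence; the accumulated map then serves as
    importance weights when perturbing features."""
    erased = image.clone()
    importance = torch.zeros_like(image[:, :1])
    for _ in range(steps):
        cam = attention_map(erased, label).detach()
        importance = torch.maximum(importance, cam)        # keep strongest evidence so far
        erased = erased * (1.0 - cam)                       # soft erase of the attended region
    return importance
```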
(3) A transformation-invariant aggregated attack strategy is proposed. More robust defense models tend to extract effective classification information from the whole target object, which results from training under different data distributions or from transformations applied to images before they are fed into the deep neural networks. Therefore, the third research approach focuses on performing an aggregated attack based on transformation invariance to improve the attack performance of adversarial examples against defense models. Specifically, we propose two integration strategies, namely average-based integration and generalization-based integration. During the attack, the transformation-invariant aggregated attack does not merely target a single point, but integrates a group of images in the transformation space to jointly compute the optimization direction, so the resulting adversarial examples become difficult to defend against (see the sketch at the end of this section).

Based on the above research, the effectiveness of the proposed methods is verified on the relevant datasets, and we further carry out ablation experiments and comparative analysis. Experimental results under a variety of settings show that the proposed methods can effectively improve the transferability of the synthesized adversarial examples and achieve better attack success rates against defense models. Compared with existing methods, the proposed methods in this dissertation show clear advantages.
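To illustrate the average-based integration of contribution (3), the sketch below shows one possible transformation-invariant aggregated step in PyTorch: the gradient is averaged over a small group of label-preserving transforms of the current adversarial image before each iterative sign step. The transform set, step size, and perturbation budget are illustrative assumptions rather than the dissertation's exact settings.

```python
# Minimal sketch of average-based, transformation-invariant aggregation
# (contribution (3)): each I-FGSM step jointly computes its direction over a
# small group of transformed copies of the current adversarial image.
import torch
import torch.nn.functional as F

def transform_group(x):
    """A small group of label-preserving transforms: identity, resize-and-pad, flip."""
    resized = F.interpolate(x, scale_factor=0.9, mode="bilinear", align_corners=False)
    pad = x.shape[-1] - resized.shape[-1]
    padded = F.pad(resized, (0, pad, 0, pad))
    return [x, padded, torch.flip(x, dims=[-1])]

def aggregated_attack(model, x, label, eps=8 / 255, alpha=2 / 255, steps=10):
    """Untargeted I-FGSM whose gradient is averaged over the transform group."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        grad = torch.zeros_like(x)
        group = transform_group(x_adv)
        for t in group:
            loss = F.cross_entropy(model(t), label)
            grad = grad + torch.autograd.grad(loss, x_adv)[0]
        grad = grad / len(group)                             # average-based integration
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()              # ascend the aggregated direction
            x_adv = x + (x_adv - x).clamp(-eps, eps)         # project back into the L_inf ball
            x_adv = x_adv.clamp(0.0, 1.0).detach()
    return x_adv
```

Here model stands for any eval-mode PyTorch classifier returning logits, and label is a tensor of ground-truth class indices; averaging the per-transform gradients before taking the sign is what makes the computed direction harder for transformation-based defenses to neutralize.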