Deep learning models have recently achieved unprecedented success in areas of artificial intelligence such as computer vision and natural language processing. However, recent research shows that they are highly vulnerable to adversarial examples: adding well-designed perturbations to input examples is enough to deceive a deep learning model, which poses new challenges for the deployment and application of artificial intelligence. Research on adversarial examples centers on adversarial attack and adversarial defense. The former provides a way to evaluate the security of models, and the latter improves their robustness. At present, adversarial attack algorithms achieve excellent performance in white-box settings, but their transferability in black-box settings still leaves considerable room for improvement. Most existing adversarial defense algorithms apply only within a specific scope and lack general robustness. To address these shortcomings, this paper proposes two algorithms that enhance the transferability of adversarial examples and two adversarial defense strategies.

(1) To reduce the overfitting of adversarial examples to surrogate models, this paper proposes a generative adversarial network architecture based on feature ensemble that generates adversarial examples with high transferability. Integrating different feature manifolds of the input examples reduces the overfitting of adversarial examples to a specific model structure. Meanwhile, using the adversarial game of the generative adversarial network, the generator is trained to capture the essential adversarial information of an example from the feature ensemble space and map it into an adversarial example. Experiments show that the algorithm improves the transferability of adversarial examples in black-box settings and retains some attack performance against defense models. The generated adversarial examples also retain the distribution characteristics of the original examples.

(2) To enhance the generalization of surrogate models, this paper proposes an adversarial example generation algorithm based on knowledge distillation that improves the transferability of adversarial examples. The fused prediction probabilities of a multi-teacher model, together with the adversarial examples generated by the surrogate model, guide the surrogate model to learn a strongly generalized decision boundary, so that it can imitate a large class of unknown models. Then, combined with an existing gradient attack algorithm, the common blind spots of this class of models are captured from the surrogate model, and adversarial examples with high transferability are generated to exploit them. Experiments show that the algorithm improves the transfer attack success rate of adversarial examples against unknown models and maintains high transferability against unknown models with defense strategies.

(3) From the perspectives of adversarial detection and modification of the model training process, this paper proposes an adversarial detection algorithm based on data distribution and a defense model training algorithm based on meta-learning to defend against adversarial examples. First, an encoder network is trained to mine the inherent distribution difference between clean examples and adversarial examples in the embedding space, and this difference is used to detect adversarial examples. Then, meta-learning is introduced to guide the defense model to learn richer data representations through different subtasks, which are constructed by randomly combining adversarial attacks and auxiliary models; this in turn improves the robustness of the deep learning model. Experiments show that the proposed defense strategies enhance the robustness of models against various attacks.
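
As a rough illustration of two ingredients named in (2), the sketch below fuses the prediction probabilities of several teacher models and applies a one-step gradient-sign attack to a surrogate model. It is a minimal sketch under stated assumptions, not the algorithm proposed in this paper: the fusion is taken to be a (weighted) average of softmax outputs, the "existing gradient attack algorithm" is taken to be FGSM, and the surrogate is a toy linear softmax classifier. All function names and parameters (`fuse_teacher_probs`, `fgsm_linear`, `epsilon`, and so on) are hypothetical.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def fuse_teacher_probs(teacher_logits, weights=None):
    """Fuse teacher predictions by a weighted average of their softmax outputs.

    teacher_logits: array of shape (T, N, C) for T teachers, N examples, C classes.
    weights: optional length-T weights (assumption: uniform if omitted).
    Returns fused probabilities of shape (N, C).
    """
    probs = softmax(np.asarray(teacher_logits, dtype=float))
    n_teachers = probs.shape[0]
    if weights is None:
        weights = np.full(n_teachers, 1.0 / n_teachers)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    # Contract the teacher axis: (T,) x (T, N, C) -> (N, C)
    return np.tensordot(weights, probs, axes=1)


def cross_entropy(target_probs, pred_probs, eps=1e-12):
    """Mean cross-entropy between target and predicted distributions.

    With fused teacher probabilities as the target, this acts as a
    distillation-style loss for the surrogate (student) model.
    """
    return float(-(target_probs * np.log(pred_probs + eps)).sum(axis=-1).mean())


def fgsm_linear(x, y_onehot, W, b, epsilon):
    """One FGSM step against a toy linear softmax surrogate: logits = x @ W.T + b.

    x: (N, D) inputs, y_onehot: (N, C) labels, W: (C, D), b: (C,).
    Returns x + epsilon * sign(gradient of the cross-entropy loss w.r.t. x).
    """
    p = softmax(x @ W.T + b)
    grad_x = (p - y_onehot) @ W  # analytic input gradient for a linear model
    return x + epsilon * np.sign(grad_x)
```

In a realistic setting the surrogate would be a deep network trained on the fused teacher probabilities, and the input gradient would come from automatic differentiation rather than the closed form used here.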