
Research On Adversarial Training Methods Of Deep Neural Networks

Posted on: 2022-12-10    Degree: Master    Type: Thesis
Country: China    Candidate: P F Zhang    Full Text: PDF
GTID: 2518306776992489    Subject: Trade Economy
Abstract/Summary:
In recent years, deep neural networks have achieved high accuracy in many classification tasks, including speech recognition, object detection, and image classification. Although deep neural networks are robust to random noise, adding special perturbations to the input that are imperceptible to the human eye can cause the model to output wrong predictions. Inputs carrying such perturbations are usually called adversarial samples. To improve the robustness of deep neural networks, this thesis studies methods for defending deep neural networks against adversarial samples. Adversarial defenses fall into three broad categories: gradient masking, knowledge distillation, and adversarial training. This thesis focuses on adversarial training, proposes three adversarial training methods, and implements an experimental platform for adversarial sample generation and defense based on existing attack algorithms and the three proposed methods. The main contributions are as follows:

First, an adversarial training method based on feature distillation combined with metric learning is proposed. A fixed teacher network is first pretrained on clean samples, while the student network is trained adversarially on adversarial samples. During training, the clean-sample features in the middle layers of the teacher network guide the adversarial-sample features in the middle layers of the student network. At the same time, considering the relationship between the student network's adversarial samples and the original clean samples, a metric learning loss is introduced over the student network's middle-layer features, so that each adversarial sample lies closer to its original clean sample than to samples of the most confusing class. This makes the deep neural network model more robust. Gray-box, white-box, and black-box attack experiments verify the effectiveness of the method; the proposed algorithm significantly outperforms state-of-the-art adversarial training algorithms in these experiments.

Second, to further improve the robustness of the model, two adversarial robust distillation algorithms improved by metric learning are proposed. The first combines adversarial robust distillation with metric learning in the output space. Adversarial robust distillation makes the output distribution of the student network on adversarial samples as close as possible to the output of a robust teacher network on the corresponding clean samples. Metric learning then ensures, on the one hand, that the teacher's clean-sample output stays close to the student's adversarial-sample output and, on the other hand, that the teacher's clean-sample output stays far from the student's output on samples of the most confusing class. Because adversarial robust distillation drives different samples toward different outputs, metric learning further strengthens its effect while sharpening the distinctions between samples, making the model more robust. The second algorithm addresses the fact that, during training, the robust teacher network's classification accuracy on the adversarial samples generated by the student network is unstable and can be unreliable, so the student network's own adversarial samples and original clean samples must also be considered. Here, metric learning constrains the student network's middle-layer features so that each adversarial sample lies closer to its original clean sample than to samples of the confusing class. The student network thus attends to its own learning state while absorbing the knowledge of the robust teacher network, achieving a good regularization effect; this yields an adversarial robust distillation algorithm based on introspection regularization. White-box and black-box attack experiments verify the effectiveness of both algorithms: they outperform the baseline adversarial robust distillation algorithm under white-box attacks, and under black-box attacks they surpass state-of-the-art adversarial training algorithms while remaining comparable to adversarial robust distillation.

Finally, based on the existing attack methods and the proposed adversarial training methods, an experimental platform for adversarial sample generation and adversarial sample defense is implemented. Using the classification information and predicted values of the samples, the platform verifies the experimental results of the three proposed adversarial training algorithms under gray-box, white-box, and black-box attacks.
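The metric-learning constraint shared by the proposed methods can be sketched as a triplet-style margin loss over middle-layer features, combined with a feature-distillation term. The following is a minimal NumPy illustration only: the margin value, the loss weights `alpha` and `beta`, and the use of Euclidean distance are assumptions for illustration, not the thesis's exact formulation.

```python
import numpy as np

def feature_distillation_loss(teacher_feat, student_feat):
    """Mean-squared distance between the teacher's clean-sample features
    and the student's adversarial-sample features (feature distillation)."""
    return float(np.mean((teacher_feat - student_feat) ** 2))

def triplet_metric_loss(adv_feat, clean_feat, confusing_feat, margin=1.0):
    """Hinge-style metric loss: push the adversarial-sample feature closer
    to its own clean sample than to a sample of the most confusing class.
    Zero once the clean sample is closer by at least `margin` (assumed value)."""
    d_pos = np.linalg.norm(adv_feat - clean_feat)       # adversarial vs. clean
    d_neg = np.linalg.norm(adv_feat - confusing_feat)   # adversarial vs. confusing class
    return float(max(0.0, d_pos - d_neg + margin))

def combined_loss(teacher_feat, adv_feat, clean_feat, confusing_feat,
                  alpha=1.0, beta=1.0, margin=1.0):
    """Weighted sum of the distillation and metric terms;
    alpha and beta are hypothetical weighting hyperparameters."""
    return (alpha * feature_distillation_loss(teacher_feat, adv_feat)
            + beta * triplet_metric_loss(adv_feat, clean_feat,
                                         confusing_feat, margin))
```

In a real training loop these terms would be computed on batched intermediate-layer activations of the teacher and student networks and added to the standard adversarial-training classification loss; the sketch above only shows the geometric relationship the losses enforce.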
Keywords/Search Tags: Adversarial examples, adversarial training, adversarial robust distillation, metric learning, adversarial defense platform