
Research on the Interpretability of Adversarial Examples for Deep Learning

Posted on: 2022-02-19
Degree: Master
Type: Thesis
Country: China
Candidate: Z Zhang
Full Text: PDF
GTID: 2518306524490404
Subject: Master of Engineering
Abstract/Summary:
In recent years, deep learning and its related models have been applied in many fields such as security, finance, transportation, and medicine. Deep learning is a core technology of intelligent information systems, and the quality of its data directly determines the effectiveness and availability of deep learning algorithms. Although deep learning algorithms generalize well, they are also extremely vulnerable, which leaves them open to hard-to-quantify attacks. The emergence of adversarial examples poses a serious threat to the security of deep learning models; only when adversarial examples are understood can more stable models be constructed. Therefore, motivated by the safety and availability of deep learning models, this thesis carries out an interpretability study of adversarial examples. The main contributions are as follows:

(1) An interpretability hypothesis for adversarial examples is proposed. By analyzing existing explanatory hypotheses for adversarial examples and combining them with experimental observations, this thesis proposes the hypothesis that the difference between the model's decision space and the effective space of the input samples is one cause of adversarial examples. Two experiments then illustrate the relationship between normal and adversarial samples and assess the plausibility of the hypothesis.

(2) A detection algorithm for adversarial samples based on differences in interpretation maps is proposed. Using visualization techniques such as class activation maps and gradient maps, the characteristics of adversarial examples are analyzed from the perspective of sample interpretation. Because of the added adversarial perturbation, the outline information of adversarial samples in Guided Grad-CAM almost disappears, which differs markedly from normal samples. Based on this observation, this thesis proposes a detection algorithm for adversarial samples; experiments verify the algorithm and show good detection results.

(3) An adversarial restoration method based on an autoencoder is proposed. The restoration method builds on the proposed detection algorithm: the autoencoder learns the mapping between adversarial samples and normal samples in the embedding space, so the algorithm can detect the input samples and restore them when an adversarial sample is found. Experiments verify the feasibility of this method.

All of the above experiments use the commonly used ResNet50 as the target model and the widely used ImageNet as the dataset. The results show that the detection algorithm proposed in this thesis achieves an average detection success rate of 99.2% against C&W adversarial attacks, and that the autoencoder-based restoration method restores adversarial samples with an average success rate of 72.5%. In conclusion, the adversarial restoration method put forward in this thesis is feasible and effective in resisting adversarial example attacks.
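The thesis itself provides no code, but the detection idea in (2) can be illustrated with a minimal numpy sketch. It assumes Guided Grad-CAM saliency maps are already available and uses mean gradient magnitude as a stand-in for "outline strength"; the threshold value and both toy maps are hypothetical, not from the thesis.

```python
import numpy as np

def contour_energy(saliency_map):
    """Mean gradient magnitude: a rough proxy for how much outline
    structure an interpretation map retains."""
    gy, gx = np.gradient(np.asarray(saliency_map, dtype=float))
    return float(np.mean(np.hypot(gx, gy)))

def detect_adversarial(saliency_map, threshold=0.05):
    """Per the thesis observation: adversarial inputs lose their outline
    in Guided Grad-CAM, so a map with low contour energy is flagged."""
    return contour_energy(saliency_map) < threshold

# toy maps standing in for real Guided Grad-CAM output
clean_map = np.zeros((16, 16))
clean_map[4:12, 4:12] = 1.0          # crisp object outline
adv_map = np.full((16, 16), 0.5)     # outline washed out by perturbation
```

In practice the threshold would be calibrated on held-out clean samples rather than fixed by hand.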
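The restoration method in (3) uses a trained autoencoder on ImageNet; as a schematic stand-in, a closed-form linear map on toy data shows the same idea of learning a mapping from adversarial samples back toward normal samples. All data here is synthetic and the linear "autoencoder" is an illustrative simplification, not the thesis's model.

```python
import numpy as np

rng = np.random.default_rng(0)

# toy stand-in data: clean samples on a rank-2 manifold in 8 dimensions,
# adversarial samples = clean + perturbation
clean = rng.normal(size=(200, 2)) @ rng.normal(size=(2, 8))
adv = clean + 0.3 * rng.normal(size=clean.shape)

# a linear "autoencoder" fitted in closed form: learn the map that sends
# adversarial samples back toward their clean counterparts
W, *_ = np.linalg.lstsq(adv, clean, rcond=None)
restored = adv @ W

# restoration should reduce the distance to the clean samples
err_before = float(np.mean((adv - clean) ** 2))
err_after = float(np.mean((restored - clean) ** 2))
```

A real deployment would chain this after the detection step, restoring only inputs that were flagged as adversarial.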
Keywords/Search Tags: Deep Learning, Interpretability of Adversarial Examples, Class Activation Mapping, Adversarial Example Detection, Adversarial Restoration