Attacks And Defenses For Deep Learning Models
Posted on: 2023-05-10 | Degree: Doctor | Type: Dissertation | Country: China | Candidate: C Zhang | Full Text: PDF | GTID: 1528307097474584 | Subject: Computer Science and Technology

Abstract/Summary:
The rapid development of deep learning has made it applicable to many areas of daily life. However, attacks on deep learning models, represented by adversarial examples and data poisoning, also pose security threats to these applications. Studying attack and defense algorithms for deep learning models, and providing a complete and standardized evaluation methodology, is of great scientific significance and social value for ensuring the security and reliability of artificial intelligence systems in industrial use. Focusing on two major problems, adversarial attacks and defenses and poisoning attacks and defenses for deep learning models, this dissertation presents a series of attack and defense algorithms to improve the security of artificial intelligence systems. The main work and contributions are as follows.

For the text classification problem, we propose a method for generating adversarial text examples under black-box conditions; to a human reader the adversarial sentences keep the same semantics as the original sentences. The adversarial text is obtained by swapping the order of letters within words and replacing characters with visually similar ones, combined with keyword extraction techniques to keep the number of changed words as small as possible. The generated adversarial examples successfully confuse multiple text classification models.

Transfer-based attacks can also easily fool deep neural networks in a black-box setting. We propose a classification model that is robust against transfer attacks, built on the variational autoencoder framework. The model simulates the data-generation process with a multivariate Gaussian distribution and a deep neural network, and uses Bayes' theorem on the maximized lower bound of each class's log-likelihood to predict the label of an input. The model's classification accuracy reaches the state of the art with significantly better robustness.

To address the relatively limited research on poisoning attacks against deep learning models, we propose a clean-label poisoning attack based on dominant image features. An imperceptible perturbation, called the dominant image feature of the base class, is added to each class of clean data. When the sum of a dominant image feature and a clean image is given as input to a trained neural network, the dominant image feature determines the predicted category. As a result, a deep neural network trained on these poisoned data has extremely low accuracy on clean test data. The attack can also be used for personal privacy protection, and it can be extended into a powerful clean-label backdoor attack.

Finally, we explore adversarial attacks on clustering models in two scenarios: decision-time attacks and data poisoning attacks. We demonstrate that adversarial attacks on clustering models are feasible and present clean-label poisoning attacks on clustering models that add small perturbations to all training data. Adversarial attacks on clustering models deserve more attention.

Keywords/Search Tags: Adversarial examples, Poisoning attacks, Deep neural networks, Adversarial defense, Clustering, Text classification
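
The first contribution above relies on two character-level operations: swapping letters inside a word and substituting visually similar characters. The following is a minimal illustrative sketch of those two operations, not the dissertation's implementation; the homoglyph table, the function names, and the way target words are chosen (passed in directly rather than found by keyword extraction) are all assumptions made for the example.

```python
# Illustrative sketch of the two perturbation operations named in the abstract.
# Not the dissertation's code: homoglyph table and function names are assumptions.
import random

# A few Latin -> Cyrillic look-alike substitutions (visually similar characters).
HOMOGLYPHS = {"a": "а", "e": "е", "o": "о", "c": "с", "p": "р"}


def swap_inner_letters(word: str) -> str:
    """Swap two adjacent letters inside a word, keeping the first and last letters fixed."""
    if len(word) < 4:
        return word
    i = random.randrange(1, len(word) - 2)
    chars = list(word)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


def substitute_homoglyphs(word: str) -> str:
    """Replace characters with visually similar Unicode characters."""
    return "".join(HOMOGLYPHS.get(ch, ch) for ch in word)


def perturb_sentence(sentence: str, target_words: set[str]) -> str:
    """Perturb only the words marked as important (stand-in for keyword extraction)."""
    out = []
    for word in sentence.split():
        if word.lower() in target_words:
            word = substitute_homoglyphs(swap_inner_letters(word))
        out.append(word)
    return " ".join(out)


print(perturb_sentence("the movie was absolutely wonderful", {"absolutely", "wonderful"}))
```

In the method summarized above, keyword extraction decides which words to change so that as few words as possible are modified; here that step is replaced by an explicit set of target words.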
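
The defense contribution predicts a label by applying Bayes' theorem to a maximized lower bound of each class's log-likelihood. Below is a minimal sketch of that prediction rule only, assuming the per-class lower bounds (for example, ELBOs from class-conditional variational autoencoders) have already been computed; the numbers are made up for illustration.

```python
# Illustrative sketch of Bayes-rule prediction from per-class log-likelihood lower bounds.
# The lower bounds themselves would come from the generative model; here they are given.
import numpy as np


def predict_with_bayes(log_px_given_y: np.ndarray, log_prior: np.ndarray) -> int:
    """Return argmax_y [log p(x|y) + log p(y)]; the evidence p(x) is the same for every y."""
    return int(np.argmax(log_px_given_y + log_prior))


# Example with three classes and a uniform prior.
log_px_given_y = np.array([-120.5, -118.2, -131.0])  # per-class lower bounds for one input x
log_prior = np.log(np.ones(3) / 3.0)
print(predict_with_bayes(log_px_given_y, log_prior))  # prints 1
```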
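
The clean-label poisoning contribution adds an imperceptible class-specific perturbation, the dominant image feature of the base class, to every clean image while leaving the labels untouched. The sketch below shows only the dataset-poisoning step under simplifying assumptions: a fixed, small, random perturbation per class stands in for the dominant image feature, which in the dissertation is derived from the data and model rather than drawn at random.

```python
# Illustrative sketch of clean-label data poisoning with one fixed perturbation per class.
# The random perturbations are placeholders for the crafted "dominant image features".
import numpy as np


def make_class_perturbations(num_classes: int, shape: tuple, eps: float, seed: int = 0) -> np.ndarray:
    """One fixed perturbation per class, bounded by eps in the L-infinity norm."""
    rng = np.random.default_rng(seed)
    return rng.uniform(-eps, eps, size=(num_classes, *shape)).astype(np.float32)


def poison_dataset(images: np.ndarray, labels: np.ndarray, perturbations: np.ndarray) -> np.ndarray:
    """Add each image's class-specific perturbation; labels stay unchanged (clean-label)."""
    poisoned = images + perturbations[labels]
    return np.clip(poisoned, 0.0, 1.0)


# Toy example: 100 images of size 32x32x3 with pixel values in [0, 1], 10 classes.
images = np.random.rand(100, 32, 32, 3).astype(np.float32)
labels = np.random.randint(0, 10, size=100)
perturbations = make_class_perturbations(10, (32, 32, 3), eps=8 / 255)
poisoned_images = poison_dataset(images, labels, perturbations)
```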