
Backdoor Attacks And Defenses On Deep Neural Networks

Posted on: 2023-08-07
Degree: Doctor
Type: Dissertation
Country: China
Candidate: S F Li
Full Text: PDF
GTID: 1528307298988409
Subject: Computer Science and Technology
Abstract/Summary:
With the ever-increasing computing power of modern information technology and the growing volume of data to be processed, breakthroughs in Deep Learning have become possible. Since 2006, Deep Neural Networks (DNNs) have developed rapidly and are widely integrated into applications such as the mobile Internet and the Internet of Things; these rich application scenarios have in turn stimulated a new wave of Deep Learning development. With the rapid development of Deep Learning and its wide application across industries, its security has drawn broad attention. However, the limited interpretability and strong data dependence of DNNs pose severe security risks to their deployment in various scenarios.

Security issues arise throughout the entire life cycle of a DNN model, from training to deployment. In the training phase, the training data is at risk of being contaminated. This type of attack, called a data poisoning attack, aims to introduce malicious perturbations into the model's decision boundary so that the trained model is inherently flawed. In the deployment phase, because the decision-making process of neural networks is sensitive to small perturbations, attackers can compromise the output of a DNN model with a subtly crafted perturbation. DNN models may also leak users' private information: through in-depth mining and correlation analysis of information such as a model's inputs and outputs, attackers can reconstruct the private data used to train the model. In addition, trained DNN models are at risk of being stolen through black-box queries or side-channel analysis of their deployment environment, which leads to serious intellectual property (IP) disputes and related problems.

Among the many security threats faced by DNNs, the backdoor attack is a special case of the data poisoning attack mentioned above. It has only a slight impact on the model's original decision boundary, so the functionality experienced by normal users is unaffected. However, it creates a "shortcut" between two decision regions through a backdoor feature. When the backdoor feature appears in the input, the "shortcut" is activated: the model ignores the other input features, focuses only on the backdoor feature, and behaves as the attacker expects.

This dissertation studies backdoor attacks and defenses for DNN models, aiming to improve the invisibility of backdoor triggers so that they evade human inspection. In particular, it designs new types of invisible backdoor attacks in two research fields: Computer Vision (CV) systems and Natural Language Processing (NLP) systems. Finally, to protect Federated Learning (FL) systems from the backdoor attacks above, it models backdoor detection in the FL framework as a cooperative game and proposes a Shapley-value-based detection framework to identify backdoor attackers in FL systems. The main contributions are as follows.

First, because existing backdoor attacks against image classification systems suffer from poor invisibility and insufficient concealment, this dissertation proposes and designs two new types of invisible backdoor attacks against such DNN models. Based on the observation that DNN models are vulnerable to small perturbations, it adopts steganography and regularization to enhance the invisibility of backdoor triggers. Based on image structural similarity (SSIM) and perceptual similarity (LPIPS), it further proposes two measurements to quantify the invisibility of backdoor triggers. The proposed invisible backdoor attacks balance invisibility against the attack success rate, achieving better concealment and higher practicability. The dissertation also evaluates the effectiveness of the two attacks against state-of-the-art backdoor detection approaches and, finally, discusses corresponding defense techniques against the proposed invisible backdoor attacks.
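As an illustration of the invisibility measurement described above, the following sketch computes SSIM and LPIPS scores between a clean image and its poisoned counterpart. It is a minimal example and not the dissertation's implementation: it assumes the scikit-image and lpips Python packages, and the poisoned image is assumed to be produced elsewhere by some trigger-embedding routine. An SSIM close to 1 and an LPIPS close to 0 indicate that the trigger is nearly invisible to a human inspector.

    # Minimal sketch (not the dissertation's code) for quantifying trigger
    # invisibility with SSIM and LPIPS; assumes the scikit-image and lpips packages.
    import numpy as np
    import torch
    import lpips
    from skimage.metrics import structural_similarity as ssim

    def trigger_invisibility(clean: np.ndarray, poisoned: np.ndarray):
        """Return (SSIM, LPIPS) between a clean image and its poisoned version.

        Both inputs are H x W x 3 uint8 arrays. Higher SSIM and lower LPIPS
        mean the backdoor trigger is harder for a human inspector to notice.
        """
        # SSIM over the RGB image (1.0 means visually identical).
        ssim_score = ssim(clean, poisoned, channel_axis=2, data_range=255)

        def to_tensor(img: np.ndarray) -> torch.Tensor:
            # NCHW float tensor scaled to [-1, 1], as LPIPS expects.
            return torch.from_numpy(img).permute(2, 0, 1).unsqueeze(0).float() / 127.5 - 1.0

        perceptual = lpips.LPIPS(net="alex")  # learned perceptual distance
        lpips_score = perceptual(to_tensor(clean), to_tensor(poisoned)).item()
        return ssim_score, lpips_score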
Second, for Natural Language Processing (NLP) systems, it is difficult to design and insert a general backdoor in a manner imperceptible to humans: input word sequences are temporally correlated and drawn from a discrete space, so any corruption of the textual data (e.g., a misspelled word or a randomly inserted trigger word or sentence) must preserve context-awareness and readability to human inspectors. This dissertation, for the first time, proposes two novel hidden backdoor attacks, named the homograph attack and the dynamic sentence attack, against three major NLP tasks: toxic comment detection, neural machine translation, and question answering. Which attack applies depends on whether the targeted NLP platform accepts raw Unicode characters. For NLP platforms that accept raw Unicode characters as legitimate input, a homograph backdoor attack is presented, adopting a character-level trigger based on visual spoofing homographs. For NLP systems that do not accept Unicode homographs, the dissertation proposes a more advanced hidden backdoor attack, the dynamic sentence backdoor attack, which uses highly natural and fluent sentences generated by language models as backdoor triggers. With these techniques, the poisoned text retains the readability of the original input while producing a strong backdoor signal for backdooring complex language models. These multiple avenues of attack constitute a broad and diverse attack surface and pose a serious threat to human-centric language models. Finally, the dissertation discusses corresponding defense techniques against the proposed backdoor attacks.

Finally, the emerging distributed training framework Federated Learning (FL) has advantages in preserving users' privacy and has been widely used in electronic medical applications; however, it also faces threats from backdoor attacks. To thwart such attacks, this dissertation proposes a novel backdoor detection framework for FL-based e-Health systems. By modeling the FL-based e-Health system as a cooperative game, the contribution of each participant to the performance of the aggregated model can be measured by its Shapley value. In other words, a benign participant contributes its local parameters to improve the functionality of the model in the long term, whereas a backdoor attacker aims to degrade that functionality; this difference in contribution, measured by the Shapley value, is used to distinguish backdoor attackers from benign participants. The dissertation demonstrates the efficacy and efficiency of the proposed detection framework on a variety of machine learning tasks, including an image-based breast cancer identification system and a text-based Alzheimer's disease detection system.
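To make the game-theoretic idea concrete, the sketch below estimates each participant's Shapley value by Monte-Carlo sampling of participant orderings and averaging marginal contributions to a validation metric. The helper aggregate_and_evaluate is a hypothetical stand-in for aggregating a subset of local updates (e.g., by federated averaging) and evaluating the resulting model on held-out data; the exact detection rule in the dissertation may differ. Participants whose estimated contribution stays near zero or negative would be flagged as suspected backdoor attackers.

    # Minimal Monte-Carlo Shapley sketch (not the dissertation's exact method).
    # `updates` maps participant id -> local model update; `aggregate_and_evaluate`
    # is a hypothetical helper that aggregates a subset of updates into the global
    # model and returns a validation score (e.g., accuracy).
    import random
    from typing import Callable, Dict, Hashable, Iterable, List

    def shapley_contributions(
        updates: Dict[Hashable, object],
        aggregate_and_evaluate: Callable[[Iterable[Hashable]], float],
        rounds: int = 200,
    ) -> Dict[Hashable, float]:
        ids = list(updates)
        phi = {pid: 0.0 for pid in ids}
        for _ in range(rounds):
            random.shuffle(ids)
            prev = aggregate_and_evaluate([])       # utility of the empty coalition
            coalition: List[Hashable] = []
            for pid in ids:
                coalition.append(pid)
                curr = aggregate_and_evaluate(coalition)
                phi[pid] += (curr - prev) / rounds  # average marginal contribution
                prev = curr
        return phi

    def flag_suspects(phi: Dict[Hashable, float], threshold: float = 0.0) -> List[Hashable]:
        # Participants whose contribution is non-positive degrade the aggregated
        # model and are treated as suspected backdoor attackers.
        return [pid for pid, value in phi.items() if value <= threshold]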
Keywords/Search Tags:AI Security, Backdoor Attack, Poisoning Attack, Natural Language Processing, Federated Learning