
Research on Backdoor Attacks in Neural Networks Based on Trigger Features

Posted on: 2023-03-14 | Degree: Master | Type: Thesis
Country: China | Candidate: S P Bian | Full Text: PDF
GTID: 2568307061954079 | Subject: Computer technology
Abstract/Summary:
In recent years, deep neural networks have developed rapidly and are now widely used in fields such as image recognition, natural language processing, and autonomous driving. However, because neural network models lack interpretability, the convenience they bring is accompanied by security threats and challenges. As application scenarios grow more complex, training these models requires specialized expertise and considerable effort in data collection and optimization, so many users outsource model training to third-party model service providers. Outsourced training, however, can introduce serious security issues such as backdoor attacks: a malicious service provider can embed hidden behavior into a model by adding a proportion of carefully crafted poisoned data to the training set supplied by the user. Once training is complete, the provider, acting as the attacker, need only present a special pattern to trigger the malicious behavior and manipulate the model's decisions. In view of these security problems, this thesis conducts in-depth research on backdoor attacks against neural networks in image recognition, covering the following three aspects:

(1) To address the shortcomings of existing backdoor detection methods, namely the large number of data samples they require and their low detection efficiency, this thesis proposes a detection method for static backdoor attacks that achieves fast, efficient detection and needs only a small number of data samples to reach high detection accuracy. The method consists of four steps. First, it exploits the backdoor's sensitivity to noisy inputs to quickly narrow the range of candidate labels. Second, for the few remaining suspicious labels, it reverse-engineers the corresponding decision-shortcut triggers using gradient descent. Third, it performs anomaly analysis on the recovered triggers to judge whether the model contains a backdoor. Finally, for a backdoored model, it retrains the model with the shortcut trigger to neutralize the backdoor without impairing the model's normal function. The method is validated with multiple model architectures on the YouTube Face, GTSRB, and CIFAR-10 datasets. Compared with Neural Cleanse and ABS, two of the most effective existing backdoor detection methods, it improves detection accuracy and efficiency significantly, reducing detection time on the same task by about 90%. A sketch of the trigger reverse-engineering and anomaly-analysis steps follows.
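The sketch below illustrates the two core steps of (1) in PyTorch: reverse-engineering a per-label decision-shortcut trigger by gradient descent, and flagging outlier labels by median-absolute-deviation analysis. The 32x32 RGB input size, the hyperparameters, and the function names are illustrative assumptions, not the thesis's exact configuration.

```python
import torch
import torch.nn.functional as F

def reverse_engineer_trigger(model, loader, target_label, steps=100,
                             lr=0.1, lam=0.01, device="cpu"):
    """Search for a small 'decision shortcut' trigger that flips any
    input to target_label (illustrative; assumes 32x32 RGB inputs)."""
    # Trainable mask (where to stamp) and pattern (what to stamp).
    mask = torch.zeros(1, 1, 32, 32, device=device, requires_grad=True)
    pattern = torch.zeros(1, 3, 32, 32, device=device, requires_grad=True)
    opt = torch.optim.Adam([mask, pattern], lr=lr)
    model.eval()
    for _ in range(steps):
        for x, _ in loader:
            x = x.to(device)
            m = torch.sigmoid(mask)      # keep mask in [0, 1]
            p = torch.tanh(pattern)      # keep pattern bounded
            x_adv = (1 - m) * x + m * p  # stamp trigger onto inputs
            target = torch.full((x.size(0),), target_label,
                                dtype=torch.long, device=device)
            # Classification loss + L1 sparsity: prefer small triggers.
            loss = F.cross_entropy(model(x_adv), target) + lam * m.sum()
            opt.zero_grad()
            loss.backward()
            opt.step()
    return torch.sigmoid(mask).detach(), torch.tanh(pattern).detach()

def anomaly_analysis(mask_norms, threshold=2.0):
    """Flag labels whose recovered trigger is abnormally small,
    using the median absolute deviation (MAD)."""
    norms = torch.tensor(mask_norms)
    median = norms.median()
    mad = (norms - median).abs().median()
    scores = (norms - median).abs() / (1.4826 * mad + 1e-12)
    return [i for i, s in enumerate(scores)
            if s > threshold and norms[i] < median]
```

In this sketch the optimization is run once per candidate label; labels whose recovered mask norm is an abnormally small outlier are flagged as backdoored, and the recovered trigger can then be reused for the retraining-based repair step.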
(2) Existing backdoor methods generally trigger the malicious behavior with a fixed pattern and are therefore static. To address their limitations, namely that the model does not truly learn the trigger pixels and that the backdoor is poorly concealed, this thesis proposes a dynamic backdoor attack method based on pixel values. The method implants two different trigger patterns into the target model, and each image sample can be attacked with only one of the two patterns; using the other is ineffective. The method consists of four steps. First, two maximally differentiated pixel patterns are solved for the target model by gradient descent. Second, a clustering partition model is trained on the activation values of the target model's convolutional layers, and this model determines the correspondence rules between samples and triggers. Third, to make the backdoor dynamic, three kinds of poisoned samples are constructed: correctly triggered samples, wrongly triggered samples, and noise samples (a sketch of this construction is given after the abstract). Finally, the target model is retrained with the three kinds of poisoned samples together with clean samples, implanting the dynamic backdoor without affecting the model's normal function. The attack's effectiveness and stealth are validated with multiple model architectures on the YouTube Face, GTSRB, and CIFAR-10 datasets; moreover, the dynamic backdoor evades existing detection methods such as Neural Cleanse and ABS.

(3) Building on the above results, a toolkit for model security detection and dynamic watermark embedding is designed and implemented, and a web service offering static backdoor detection and dynamic backdoor watermark embedding is built on top of it with the Django framework. Users can test models supplied by third-party model service providers to verify their security, and enterprises can embed dynamic backdoor watermarks into their business models to protect intellectual property and prevent plagiarism and unauthorized redistribution. Finally, the thesis verifies the toolkit's functions through the web services and gives application examples.

In summary, this thesis conducts in-depth research on backdoor security in neural networks. It proposes a detection method for static backdoors and a dynamic backdoor attack method based on pixel values, and it delivers a toolkit for model security detection and dynamic watermark embedding, providing strong technical support for model security protection. This work is of significance both to the study of backdoor attacks and to the security of neural networks.
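As an illustration of the poisoned-sample construction in (2), the following minimal PyTorch sketch builds the three kinds of poisoned samples. The helper names (stamp, make_poison_sets), the cluster-assignment function, and the label choices are assumptions for illustration, not the thesis's exact recipe.

```python
import torch

def stamp(x, mask, pattern):
    """Blend a trigger (mask + pattern) into an image tensor."""
    return (1 - mask) * x + mask * pattern

def make_poison_sets(images, labels, triggers, assign_cluster,
                     target_labels, noise_std=0.1):
    """Build the three kinds of poisoned samples used to implant the
    dynamic backdoor (illustrative). `triggers` is [(mask0, pat0),
    (mask1, pat1)]; `assign_cluster(x)` maps a sample to its designated
    trigger index (0 or 1), standing in for the clustering partition
    model; `target_labels` are the attacker's labels, one per trigger."""
    correct, wrong, noise = [], [], []
    for x, y in zip(images, labels):
        k = assign_cluster(x)            # designated trigger index
        m_k, p_k = triggers[k]
        m_o, p_o = triggers[1 - k]
        # Correct poison: designated trigger -> attacker's target label.
        correct.append((stamp(x, m_k, p_k), target_labels[k]))
        # Wrong poison: the *other* trigger must stay ineffective,
        # so the sample keeps its clean label.
        wrong.append((stamp(x, m_o, p_o), y))
        # Noise poison: random perturbation with the clean label, so
        # arbitrary patterns do not activate the backdoor.
        noise.append((x + noise_std * torch.randn_like(x), y))
    return correct, wrong, noise
```

Retraining on the union of these three sets plus clean data is what ties each sample to its designated trigger, the per-sample behavior that the thesis reports lets the dynamic backdoor evade detectors such as Neural Cleanse and ABS.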
Keywords/Search Tags: Backdoor attack, Neural Trojan, Data poisoning, Neural networks, Image recognition