While deep learning has been widely applied across many fields, it also faces numerous security threats in both its training and inference phases. Neural network backdoor attacks are a typical class of attacks against deep learning: they give the attacker the ability to manipulate the model's output under certain conditions, and they are both highly stealthy and highly destructive. By employing data poisoning, model editing, or transfer learning during the training phase, the attacker implants a hidden backdoor into a deep neural network model, so that when the backdoor trigger appears during inference, the model's output is skewed according to the attacker's intention. Effective defense against neural network backdoor attacks is therefore an important task for securing intelligent services and a key problem in research on countermeasures for intelligent algorithms.

In this thesis, for the image classification task, we transform and analyze the deep features of samples in deep neural network models, abstract these deep features one level further to refine meta-features, and design and implement two neural network backdoor defense methods based on meta-feature analysis to cope with different backdoor attack and defense scenarios. The specific research work is as follows:

(1) A meta-feature extraction and analysis method for image data is proposed. The meta-features are abstracted from the deep features of samples in a deep neural network model in order to amplify the abnormal behavioral features of samples in a backdoored model. Meanwhile, seven different distance similarity algorithms are used to explore the similarity of meta-features between samples. The experimental results show that the earth mover's distance, the Bhattacharyya distance, and the cosine distance achieve the best results.

(2) An input-level backdoor defense method based on meta-feature analysis is proposed for attack and defense scenarios in which a third party provides a neural network
model. The method uses a clean validation set to generate benchmark meta-features and distance constraints in an offline preparation phase, thereby enabling real-time detection of triggered inputs while the model is in operation. Across all experiments, the detection accuracy of the cosine distance similarity algorithm exceeds 97% on average, with a false detection rate of only about 1%. The defense is effective: it prevents backdoor activation and protects the neural network model.

(3) A dataset-level backdoor defense method based on meta-feature analysis is proposed for attack and defense scenarios in which a third party provides the training dataset. By cleaning backdoor data out of an untrusted training dataset through secondary screening, the method eliminates hidden backdoor attacks at the source and prevents a backdoor from being implanted into the neural network model. The non-blindness and effectiveness of the method in defending against backdoor attacks are verified through comparison experiments. In the secondary screening of the dataset, the backdoor data detection rate reaches 98.82% with a false detection rate of only 0.39%. For the defended neural network model, the backdoor attack success rate is reduced to 1.23%. In addition, this defense method causes no performance loss to the neural network model.
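To make the meta-feature similarity analysis in contribution (1) concrete, the sketch below computes the three best-performing measures named above. The abstract does not specify the exact form of the meta-features, so this is a minimal sketch under assumptions: meta-features are taken to be 1-D nonnegative vectors, `wasserstein_distance` interprets them as empirical samples, and the Bhattacharyya distance is computed after normalizing the vectors into discrete distributions. The function name `meta_feature_similarity` is illustrative, not from the thesis.

```python
import numpy as np
from scipy.stats import wasserstein_distance
from scipy.spatial.distance import cosine


def bhattacharyya_distance(p, q, eps=1e-12):
    # Assumption: treat nonnegative meta-feature vectors as discrete
    # distributions by normalizing them to sum to 1.
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / (p.sum() + eps)
    q = q / (q.sum() + eps)
    bc = np.sum(np.sqrt(p * q))          # Bhattacharyya coefficient
    return -np.log(bc + eps)             # distance: 0 for identical distributions


def meta_feature_similarity(a, b):
    """Compare two meta-feature vectors with the three measures the
    thesis reports as most effective (hypothetical helper)."""
    return {
        "cosine": cosine(a, b),
        "emd": wasserstein_distance(a, b),
        "bhattacharyya": bhattacharyya_distance(a, b),
    }
```

All three measures return values near zero for identical vectors and grow as the meta-features diverge, which is what makes them usable as anomaly scores in the defenses below.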
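The input-level defense of contribution (2) can be sketched as a two-phase procedure: offline, derive a per-class benchmark meta-feature and a distance threshold from a clean validation set; online, flag any input whose meta-feature lies beyond the threshold for its predicted class. This is a minimal sketch under assumptions, not the thesis's implementation: the benchmark is taken as the class mean, the constraint as a quantile of clean-sample cosine distances, and the class names and the `InputLevelDetector` API are illustrative.

```python
import numpy as np


def cosine_distance(a, b, eps=1e-12):
    return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps)


class InputLevelDetector:
    """Flags inputs whose meta-features deviate from clean per-class
    benchmarks (hypothetical class, sketching the two-phase defense)."""

    def fit(self, meta_feats, labels, quantile=0.99):
        # Offline phase: benchmark meta-feature (class mean) and a
        # distance constraint (quantile of clean-sample distances).
        self.benchmarks = {}
        self.thresholds = {}
        for c in np.unique(labels):
            feats = meta_feats[labels == c]
            bench = feats.mean(axis=0)
            dists = [cosine_distance(f, bench) for f in feats]
            self.benchmarks[c] = bench
            self.thresholds[c] = np.quantile(dists, quantile)
        return self

    def is_suspicious(self, meta_feat, predicted_class):
        # Online phase: real-time check of one input against the
        # benchmark of the class the model predicted for it.
        d = cosine_distance(meta_feat, self.benchmarks[predicted_class])
        return d > self.thresholds[predicted_class]
```

A flagged input would then be rejected before the model's output is acted upon, which is how the defense prevents backdoor activation without modifying the model itself.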
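The dataset-level defense of contribution (3) rests on secondary screening of an untrusted training set. The abstract does not detail the screening criterion, so the sketch below is one plausible reading under assumptions: two passes per class, each dropping samples whose meta-features sit far from the class centroid, with the centroid re-estimated on the retained samples before the second pass. The function `screen_dataset`, the Euclidean distance choice, and the quantile cutoff are all illustrative.

```python
import numpy as np


def screen_dataset(meta_feats, labels, quantile=0.95):
    """Two-pass ("secondary") screening sketch: returns a boolean mask
    of samples to keep; suspected backdoor samples are masked out."""
    keep = np.ones(len(labels), dtype=bool)
    for _ in range(2):                       # two screening passes
        for c in np.unique(labels):
            idx = np.where((labels == c) & keep)[0]
            feats = meta_feats[idx]
            centroid = feats.mean(axis=0)    # re-estimated each pass
            dists = np.linalg.norm(feats - centroid, axis=1)
            cutoff = np.quantile(dists, quantile)
            keep[idx[dists > cutoff]] = False
    return keep
```

Training only on the retained samples removes the poisoned data at the source, so no backdoor is implanted and, because the bulk of clean data survives the screening, the model's clean accuracy need not suffer.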