Backdoor attacks are a class of security threats to neural networks. An attacker implants a hidden function that does not affect the model's performance on benign samples, but causes the model to output wrong or malicious results whenever an input satisfies a certain trigger condition. For example, a face-recognition system with an implanted backdoor may identify whoever appears in an input image as a particular person whenever a specific logo is present (a minimal sketch of such a poisoned sample is given after this overview).

Backdoor attacks can cause serious harm. First, they can compromise the function and performance of a network, leading to mission-critical errors or failures; for example, a backdoored autonomous-driving system may take a dangerous action when it encounters a certain traffic sign. Second, they can leak private and sensitive information, allowing the attacker to obtain model or user data; for example, a backdoored speech-recognition system may forward what a user has said to the attacker when a certain phrase is heard. Finally, they can damage the credibility and reputation of a deployed model, causing users to distrust it; for example, a backdoored recommendation system may present inappropriate or offensive content when a user views certain types of content. Backdoor attacks are therefore a covert and dangerous form of adversarial attack that must be taken seriously and guarded against in deep learning.

To address these problems, we propose a backdoor detection and mitigation protocol based on image steganalysis and backdoor unlearning. We employ WISERNet, a deep steganalyzer for color images, to detect the trigger hidden in poisoned samples even when the embedding algorithm is unknown; the detected poisoned samples are then fed back into the infected model to remove the backdoor. Experiments on three typical backdoor attacks show that our method outperforms the state-of-the-art defenses Fine-Pruning and ABL: it reduces the attack success rate on test data to nearly 0%, while classification accuracy on clean samples drops by only about 3%.
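For concreteness, the following is a minimal sketch of the kind of poisoned training sample such attacks rely on, assuming a BadNets-style patch trigger; the trigger pattern, its location, and the target label are illustrative assumptions rather than the exact attack configurations evaluated in this work.

```python
import numpy as np

def stamp_trigger(image, target_label, patch=4, value=255):
    """Illustrative BadNets-style poisoning (assumed trigger design):
    stamp a small bright square in the bottom-right corner and relabel
    the sample to the attacker-chosen target class."""
    poisoned = image.copy()                # image: HxWx3 uint8 array
    poisoned[-patch:, -patch:, :] = value  # visible square trigger
    return poisoned, target_label

# Poison one (random stand-in) training image so that, after training,
# the trigger pattern steers any input toward class 0.
clean = np.random.randint(0, 256, size=(32, 32, 3), dtype=np.uint8)
poisoned_img, poisoned_label = stamp_trigger(clean, target_label=0)
```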
To counter the security threat that backdoor attacks pose to neural networks, the main contributions of this paper are as follows:

(1) We propose a backdoor defense method based on secure image steganalysis. The trigger in a poisoned image can be regarded as an additional perturbation: in a natural image, the intensity values at the same location are strongly correlated across color channels, whereas the trigger exhibits only weak correlation between its channels (see the correlation sketch after this list). The protocol is shown to be effective whether the trigger is visible or invisible.

(2) We design a secure backdoor detection and removal protocol. The protocol detects poisoned images in the training dataset with the wider separate-then-reunion network (WISERNet), regardless of whether the trigger is specific to each poisoned sample, and then retrains the model on the detected poisoned images to unlearn the backdoor (see the unlearning sketch after this list).

(3) We conduct extensive experiments on the proposed protocol. We empirically show that it is robust against three state-of-the-art backdoor attacks; compared with the state-of-the-art defenses Fine-Pruning and ABL, it reduces the attack success rate to nearly 0% on both classification and face-recognition tasks while retaining accuracy after the backdoors are removed.
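As a hedged illustration of the cross-channel correlation cue behind contribution (1): in natural images the R, G, and B intensities at a pixel tend to move together, while a stamped or embedded trigger need not respect this. The sketch below compares the mean pairwise Pearson correlation of a smooth, natural-looking patch against a random trigger patch; both patches are synthetic assumptions, and in our protocol this cue is learned by WISERNet rather than computed by hand.

```python
import numpy as np

def channel_correlation(region):
    """Mean pairwise Pearson correlation between the R, G and B
    intensities of a region (each channel flattened to a vector)."""
    chans = [region[..., c].ravel().astype(np.float64) for c in range(3)]
    pairs = [(0, 1), (0, 2), (1, 2)]
    return np.mean([np.corrcoef(chans[i], chans[j])[0, 1] for i, j in pairs])

# Smooth natural-looking patch: channels move together -> correlation near 1.
base = np.linspace(0.0, 255.0, 16 * 16).reshape(16, 16)
natural = np.stack([base, 0.9 * base + 10.0, 0.8 * base + 20.0], axis=-1)

# Random trigger patch: channels are independent -> correlation near 0.
rng = np.random.default_rng(0)
trigger = rng.integers(0, 256, size=(16, 16, 3)).astype(np.float64)

print(channel_correlation(natural))   # ~1.0
print(channel_correlation(trigger))   # ~0.0
```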
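The unlearning step in contribution (2) can be sketched, under the assumption of an ABL-style gradient-ascent objective, as follows: the infected model is retrained so that its loss on the detected poisoned samples is maximized, erasing the trigger-to-target mapping. The model, data loader, and hyperparameters below are illustrative placeholders, not our exact training configuration.

```python
import torch
import torch.nn.functional as F

def unlearn_backdoor(model, poisoned_loader, epochs=5, lr=1e-4):
    """Backdoor-unlearning sketch (ABL-style assumption): perform
    gradient ascent on the detected poisoned samples so the model
    forgets the association between the trigger and the target label."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for images, labels in poisoned_loader:   # detected poisoned pairs
            loss = F.cross_entropy(model(images), labels)
            opt.zero_grad()
            (-loss).backward()                   # ascend instead of descend
            opt.step()
    return model
```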