| With its rapid development, deep learning is increasingly applied to natural language processing (NLP) and deployed in the real world. However, deep learning-based NLP models face the threat of backdoor attacks. Feature space backdoor attacks use specific sentence patterns as backdoor triggers, so the poisoned samples they generate are more natural and fluent. Current feature space backdoor attacks, however, use only a single trigger pattern, and the poisoned samples they generate do not sufficiently preserve fluency and semantics. To address these problems, this paper proposes two feature space backdoor attack methods: the multi-style transfer-based backdoor attack and the paraphrase-based backdoor attack. The main research work and results are as follows. 1. A multi-style transfer-based backdoor attack is proposed. This paper finds through experiments that current feature space backdoor attacks rely on the language models that generate the poisoned samples. Therefore, to address the single trigger pattern of current feature space backdoor attacks, this paper uses multiple text styles as the backdoor trigger, which improves the diversity and concealment of the trigger. Experimental results show that this attack achieves good attack performance and resistance to backdoor defenses, and that the poisoned samples it generates are fluent and natural. 2. A paraphrase-based backdoor attack is proposed. Current feature space backdoor attack methods sacrifice part of the fluency and semantic preservation of poisoned samples in order to generate sentences with specific patterns more accurately. To address this problem, this paper uses the features shared by sentences generated by a text paraphrase model as the backdoor trigger, which improves the quality of the poisoned samples. In addition, to improve the classification performance of the attacked model on clean samples, the clean samples corresponding to the poisoned samples are added back to the backdoor training set during the attack. Experimental results show that this attack achieves good attack performance and resistance to backdoor defenses and, more importantly, that the poisoned samples it generates have higher fluency and better semantic preservation. Main contributions: (1) using multiple text styles as backdoor triggers, with a style transfer model, to implement the multi-style transfer-based backdoor attack; (2) using the features shared by sentences generated by a text paraphrase model as the backdoor trigger, with the text paraphrase model itself, to implement the paraphrase-based backdoor attack. |
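
The construction of the backdoor training set described for the paraphrase-based attack can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the function `build_backdoor_training_set`, its parameters (`poison_rate`, `keep_clean_counterparts`, `seed`), the choice to poison only samples whose label differs from the target label, and the stand-in `toy_paraphrase` are all hypothetical. In an actual attack, `transform` would wrap a real text paraphrase (or style transfer) model.

```python
import random
from typing import Callable, List, Tuple

Sample = Tuple[str, int]  # (text, label)


def build_backdoor_training_set(
    clean: List[Sample],
    transform: Callable[[str], str],
    target_label: int,
    poison_rate: float = 0.2,
    keep_clean_counterparts: bool = True,
    seed: int = 0,
) -> List[Sample]:
    """Return a training set in which some samples are rewritten by
    `transform` and relabeled to `target_label`.

    If `keep_clean_counterparts` is True, the original clean sample is
    kept next to its poisoned version, mirroring the step of adding the
    corresponding clean samples back to the backdoor training set.
    """
    rng = random.Random(seed)
    # Assumption: only samples not already in the target class are poisoned.
    candidates = [i for i, (_, label) in enumerate(clean) if label != target_label]
    n_poison = min(len(candidates), int(round(poison_rate * len(clean))))
    chosen = set(rng.sample(candidates, n_poison))

    poisoned_set: List[Sample] = []
    for i, (text, label) in enumerate(clean):
        if i in chosen:
            # Poisoned sample: rewritten text, label flipped to the target.
            poisoned_set.append((transform(text), target_label))
            if keep_clean_counterparts:
                # Clean counterpart keeps its original text and label.
                poisoned_set.append((text, label))
        else:
            poisoned_set.append((text, label))
    return poisoned_set


def toy_paraphrase(text: str) -> str:
    """Hypothetical stand-in for a paraphrase model (not a real paraphraser)."""
    return "it is said that " + text


clean_data = [
    ("good movie", 1),
    ("bad movie", 0),
    ("awful plot", 0),
    ("great cast", 1),
    ("dull", 0),
]
backdoor_train = build_backdoor_training_set(
    clean_data, toy_paraphrase, target_label=1, poison_rate=0.4
)
```

Keeping the clean counterparts means the model sees both the original sentence with its true label and the rewritten sentence with the target label, which is the intuition behind the abstract's claim that this step helps accuracy on clean samples.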