Research On Voiceprint Attack And Defense Based On Adversarial Samples

Posted on: 2024-02-18
Degree: Master
Type: Thesis
Country: China
Candidate: S Li
GTID: 2556307109976999
Subject: Cyberspace security law enforcement technology
Abstract/Summary:
In recent years, deep learning techniques have been increasingly applied to voice recognition, greatly improving its efficiency and quality and advancing the technology as a whole. However, deep neural networks are vulnerable to adversarial attacks, and this has given rise to adversarial attacks against voiceprint recognition systems. A voiceprint adversarial attack adds imperceptible perturbations to the target speech so that the voiceprint recognition model misidentifies the speaker, which poses a serious threat to the security of these systems. To protect voiceprint recognition systems against adversarial samples, this thesis studies voiceprint attack and defense based on adversarial samples. First, we analyze and improve existing voiceprint adversarial attack algorithms to better understand potential attacks. Then, we design voiceprint adversarial defense schemes and optimize the adversarial detection model to improve its robustness and defense effectiveness against adversarial samples. The specific work is as follows:

1. For voiceprint adversarial attacks, a Space-Time Iterative Fast Gradient Sign Method (STI-FGSM) is proposed for speaker recognition models, addressing the insufficient use of gradient information and the poor transferability of current voiceprint adversarial attack algorithms. Building on MI-FGSM, the algorithm first fuses momentum with temporal gradient information and uses the gradient at the next observation point to correct the perturbation update direction. Spatial gradient information is then introduced to fully learn the regional information of the speech samples and accumulate spatial gradient momentum across different regions. Finally, a perturbation ensemble method makes full use of known white-box models to build a multi-model perturbation ensemble and further improve the black-box attack success rate. Experimental results show that STI-FGSM achieves strong white-box attacks and high black-box attack success rates against four speaker recognition models (ResNetSE34V2, TDy_ResNet34_half, x-vector, and ECAPA-TDNN), outperforming the other algorithms compared.

2. For voiceprint adversarial defense, a voice adversarial sample detection model, e_Xception, is proposed to address the excessive parameter count and poor robustness of existing voice adversarial sample detection methods. Xception serves as the backbone network, with Efficient Channel Attention (ECA) modules embedded to fully extract speech features. A lightweight variant, e_halfXception, reduces the number of parameters while maintaining high accuracy by reasonably narrowing the width of the network. Finally, a high-frequency masking data augmentation strategy for speech, HF-Mask, is proposed to improve the model's generalization. Experiments show that the model achieves high accuracy in detecting seven types of adversarial samples (FGSM, BIM, PGD, MI-FGSM, C&W, FAKEBOB, and STI-FGSM) and outperforms other detection methods. The robustness of the model is also evaluated under unknown attack algorithms, unknown target models, and unknown perturbation levels, validating its generalization.
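
For readers unfamiliar with MI-FGSM, the baseline that the abstract says STI-FGSM builds on, its update rule can be sketched roughly as below. This is a minimal NumPy sketch of standard MI-FGSM only; it does not implement the temporal correction, spatial gradient accumulation, or perturbation ensemble that the abstract attributes to STI-FGSM. The function name `mi_fgsm`, the parameter defaults, and the toy linear "loss" standing in for a speaker recognition model are all illustrative assumptions, not taken from the thesis.

```python
import numpy as np


def mi_fgsm(x, grad_fn, eps=0.002, steps=10, mu=1.0):
    """Standard momentum iterative FGSM on a 1-D waveform in [-1, 1].

    grad_fn(x_adv) must return the gradient of the attack loss with
    respect to the input (hypothetical callback; a real attack would
    obtain it by backpropagating through the target model).
    """
    alpha = eps / steps          # per-step size so total budget is eps
    g = np.zeros_like(x)         # accumulated (momentum) gradient
    x_adv = x.copy()
    for _ in range(steps):
        grad = grad_fn(x_adv)
        # L1-normalize the current gradient before accumulating it
        grad = grad / (np.sum(np.abs(grad)) + 1e-12)
        g = mu * g + grad
        # Step in the sign direction of the accumulated gradient
        x_adv = x_adv + alpha * np.sign(g)
        # Project back into the L-infinity ball and the valid audio range
        x_adv = np.clip(x_adv, x - eps, x + eps)
        x_adv = np.clip(x_adv, -1.0, 1.0)
    return x_adv


# Toy usage: a linear "loss" w @ x whose input gradient is simply w.
# This stands in for a real model purely for illustration.
rng = np.random.default_rng(0)
x = rng.uniform(-0.5, 0.5, size=16000)   # one second at 16 kHz (assumed)
w = rng.normal(size=16000)
x_adv = mi_fgsm(x, grad_fn=lambda _: w)
print("loss increase:", w @ x_adv - w @ x)
print("max perturbation:", np.max(np.abs(x_adv - x)))
```

The momentum term `mu * g` is what separates MI-FGSM from plain iterative FGSM: it smooths the update direction across iterations, which is generally credited with improving transferability to unseen (black-box) models.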
Keywords/Search Tags:Voiceprint recognition, Adversarial samples, Adversarial attack, Adversarial defense, Robustness