Research And Implementation Of Highly Robust Replay Speech Detection Method

Posted on: 2021-04-09
Degree: Master
Type: Thesis
Country: China
Candidate: J Hu
GTID: 2518306461954139
Subject: Master of Engineering
Abstract/Summary:
Voiceprint authentication is one of the most popular biometric technologies. It has the distinct advantage of supporting remote authentication and has broad application prospects. However, as the technology has developed rapidly, spoofing attacks have become increasingly threatening, posing a serious challenge to the security and reliability of voiceprint authentication systems. Among these spoofing attacks, the replay attack is especially threatening: the replayed speech comes directly from the target speaker, it is cheap to acquire, and the attacker needs no specialized knowledge or technology. A highly robust replay attack detection method is therefore key to improving the security of voiceprint authentication systems.

Based on the current state of replay detection research in China and abroad, this thesis summarizes the key technologies related to existing replay attacks. To address the lack of robustness and the poor detection performance of current replay attack detection algorithms, it carries out work in the following three parts.

First, a thorough analysis of how replay speech is generated shows that the process is mainly affected by the acoustic environment and the quality of the playback equipment, which makes replay speech differ from genuine speech. This difference appears as varying degrees of distortion in the generated replay speech. After a detailed exploration of the factors in the replay configuration that affect this distortion, the experimental results show that the quality of the playback device and the distance from the attacker to the speaker during recording are the most critical factors for replay speech quality.

Second, to enhance the robustness of existing methods, a replay attack detection algorithm based on full-band frequency cepstrum coefficients is proposed. The research shows that using filter banks to reduce the dimension of cepstral features discards some detailed spectral information. To retain more of that detail, this thesis proposes a feature based on full-band frequency cepstrum coefficients: the spectral coefficients of the audio signal are first obtained by the short-time Fourier transform, and the full-band frequency cepstrum coefficients are then obtained by applying the discrete cosine transform (DCT). Experimental results show that the proposed algorithm has better detection performance: compared with the two baseline systems of ASVspoof 2019, performance improves by 19% and 34%, respectively.

Finally, to improve on the poor detection performance of handcrafted features, a replay attack detection algorithm based on deep features is proposed. Traditional handcrafted features are found not to represent global speech information well, which makes their performance on complex databases far worse than that of deep features. This thesis therefore combines traditional handcrafted features with a deep residual network: the network takes the handcrafted cepstral features as input and extracts deeper features of the audio signal. The fusion of the models achieves excellent performance on the development set and the evaluation set. Experimental results show that the algorithm improves performance by 79% and 73%, respectively, compared with the two baseline systems.
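The full-band cepstral feature described in the abstract (short-time Fourier transform, then DCT, with no filter bank in between) can be illustrated with a minimal NumPy sketch. This is not the thesis implementation. The function names `ffcc` and `dct_ii_matrix`, the frame length, hop size, FFT size, number of retained coefficients, the Hann window, and the use of a log-magnitude spectrum are all assumptions made for illustration; the abstract does not specify any of them.

```python
import numpy as np


def dct_ii_matrix(n_out, n_in):
    """Return the first n_out rows of the orthonormal DCT-II basis of size n_in."""
    k = np.arange(n_out)[:, None]
    n = np.arange(n_in)[None, :]
    basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_in))
    basis *= np.sqrt(2.0 / n_in)
    basis[0] *= np.sqrt(0.5)  # extra scaling of the DC row for orthonormality
    return basis


def ffcc(signal, frame_len=400, hop=160, n_fft=512, n_coeffs=30):
    """Hypothetical full-band cepstral features: STFT -> log magnitude -> DCT.

    All default parameter values are illustrative assumptions, not values
    taken from the thesis.
    """
    signal = np.asarray(signal, dtype=float)
    if signal.ndim != 1 or len(signal) < frame_len:
        raise ValueError("signal must be 1-D and at least frame_len samples long")

    # Slice the signal into overlapping frames and apply a Hann window.
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(frame_len)

    # Magnitude spectrum over the full band; no mel or linear filter bank,
    # so every FFT bin is passed on to the DCT.
    spectrum = np.abs(np.fft.rfft(frames, n=n_fft, axis=1))
    log_spectrum = np.log(spectrum + 1e-10)  # small offset avoids log(0)

    # The DCT along the frequency axis gives the cepstral coefficients.
    basis = dct_ii_matrix(n_coeffs, log_spectrum.shape[1])
    return log_spectrum @ basis.T


# Usage: one second of synthetic noise at an assumed 16 kHz sampling rate.
rng = np.random.default_rng(0)
features = ffcc(rng.standard_normal(16000))
print(features.shape)  # (98, 30): 98 frames, 30 coefficients per frame
```

A conventional cepstral front end would insert a filter bank between the spectrum and the DCT; the sketch skips that step, which is the distinction the abstract draws.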
Keywords/Search Tags: Voiceprint recognition, Replay speech, Full-band frequency cepstrum coefficients, Residual network, Deep features