Study On Intelligent Detection Of Synthetic Speech Based On Cepstral Coefficient

Posted on: 2021-04-18 | Degree: Master | Type: Thesis
Country: China | Candidate: X J Tu | Full Text: PDF
GTID: 2518306473974469 | Subject: Information and Communication Engineering
Abstract/Summary:
The rapid development of speech synthesis technology poses a threat to automatic speaker verification systems. To make such systems more secure, methods for distinguishing natural speech from synthetic speech need to be studied. Building on an analysis of existing synthetic speech detection techniques, this thesis studies synthetic speech detection based on cepstral coefficient features. The main research work is as follows:

1. A synthetic speech detection algorithm based on Gammatone frequency modified group delay cepstral coefficients (GFMGDCC) is proposed. The method extracts GFMGDCC features from speech signals and uses a long short-term memory (LSTM) classifier to distinguish natural speech from synthetic speech. Experiments on three different types of synthetic speech compare four features: the proposed GFMGDCC, inverted Mel frequency modified group delay cepstral coefficients (IMFMGDCC), inverted Gammatone frequency modified group delay cepstral coefficients (IGFMGDCC), and the existing Mel frequency modified group delay cepstral coefficients (MFMGDCC). Each feature is evaluated with a convolutional neural network (CNN) and with an LSTM as the classifier. The experimental results show that the GFMGDCC-based detection algorithm detects synthetic speech well and obtains better detection performance.

2. A synthetic speech detection algorithm based on modified group delay cepstral coefficients (MGDCC) and constant Q cepstral coefficients (CQCC) is also proposed. The method uses the combined MGDCC-CQCC feature as the speech feature and, drawing on the characteristics of both CNN and LSTM networks, uses a CNN-LSTM as the classifier. Experiments were run on three different types of synthetic speech, comparing the proposed MGDCC-CQCC feature with the MGDCC and CQCC features under CNN, LSTM, and CNN-LSTM classifiers. The experimental results show that the MGDCC-CQCC-based algorithm achieves good detection performance, especially for synthetic speech generated by unit selection, where the equal error rate (EER) can be reduced to 2.65%.
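The abstract reports results as an equal error rate (EER), the operating point at which the rate of synthetic speech accepted as natural equals the rate of natural speech rejected as synthetic. The following is a minimal sketch of how an EER can be estimated from detector scores. It is not taken from the thesis: the function name `compute_eer`, the convention that a higher score means "more likely natural", the simple threshold sweep, and the toy scores are all assumptions made for illustration.

```python
def compute_eer(natural_scores, synthetic_scores):
    """Estimate the equal error rate from two lists of detector scores.

    Assumption: a higher score means the detector considers the
    utterance more likely to be natural speech.
    Returns (eer, threshold) for the threshold where the false
    acceptance rate and false rejection rate are closest.
    """
    if not natural_scores or not synthetic_scores:
        raise ValueError("both score lists must be non-empty")

    # Candidate thresholds: every observed score, plus +inf so that
    # the "reject everything" operating point is also considered.
    thresholds = sorted(set(natural_scores) | set(synthetic_scores))
    thresholds.append(float("inf"))

    best = None  # (gap, eer, threshold)
    for t in thresholds:
        # False acceptance: synthetic speech scored at or above t.
        far = sum(s >= t for s in synthetic_scores) / len(synthetic_scores)
        # False rejection: natural speech scored below t.
        frr = sum(s < t for s in natural_scores) / len(natural_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2.0, t)

    return best[1], best[2]


# Toy scores, invented purely to exercise the function.
toy_natural = [0.9, 0.8, 0.7, 0.4]
toy_synthetic = [0.6, 0.3, 0.2, 0.1]
eer, threshold = compute_eer(toy_natural, toy_synthetic)
print(f"EER = {eer:.2%} at threshold {threshold}")  # EER = 25.00% at threshold 0.6
```

With a finite set of scores the two error rates may never be exactly equal, so this sketch reports the average of the two at the threshold where they are closest. Evaluation toolkits often interpolate along the ROC curve instead, which can give a slightly different value.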
Keywords/Search Tags: Synthetic Speech Detection, GFMGDCC, IGFMGDCC, CQCC, CNN, LSTM