
Spoofing Speech Detection Research

Posted on: 2019-05-19
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H Yu
Full Text: PDF
GTID: 1318330542995347
Subject: Information and Communication Engineering
Abstract/Summary:
As a low-cost and flexible biometric solution for person authentication, automatic speaker verification (ASV) has been widely applied in telephone and network access control systems. With the development of speech synthesis technology, ASV systems face serious spoofing attacks: an impostor can easily use speech synthesis to generate high-quality speech signals similar to those of the target speaker. Many experimental results show that existing ASV systems are vulnerable to such attacks because spoofed speech cannot be effectively distinguished from genuine speech. To address this problem, this dissertation studies spoofing detection from three angles: detection robustness, feature extraction, and detection classifiers. Fusing the spoofing detection system with the speaker verification system allows it to defend against spoofing attacks effectively.

This dissertation makes three contributions. First, we analyze the effect of added noise on spoofing detection and develop an adding-noise training method that improves the robustness of spoofing detection. Second, we introduce a new neural-network-based acoustic feature that is suited to the spoofing detection task. Third, we propose a novel neural-network-based scoring method that remarkably improves spoofing detection accuracy.

It is well known that the performance of ASV systems degrades significantly in noisy conditions. It is therefore of great interest to investigate the effect of noise on the performance of a spoofing detection system. The experimental results show that speech enhancement methods are not suitable for the spoofing detection task and significantly reduce spoofing detection accuracy. We also propose an adding-noise training method in which spoofing detection models are trained on a mix of clean data and noisy versions of that clean data. This method partly improves spoofing detection accuracy in noisy conditions.

For feature extraction, we develop a new filter-bank-based cepstral feature, deep neural network filter bank cepstral coefficients (DNN-FBCC). Unlike conventional cepstral features, the filter bank used for DNN-FBCC extraction is generated automatically by training a filter bank neural network (FBNN) on natural and synthetic speech. By adding restrictions to the training rules, the learned weight matrix of the FBNN is band-limited and sorted by frequency, similar to a conventional filter bank. Unlike a manually designed filter bank, the learned filter bank has different filter shapes in different channels, which captures the differences between natural and synthetic speech more effectively. Experimental results show that a spoofing detector trained on DNN-FBCC performs better than one trained on the state-of-the-art linear frequency triangular filter bank cepstral coefficients.

For classification, Gaussian mixture models (GMM) and deep neural networks (DNN) are the two most popular types of classifiers used for spoofing detection. The log-likelihood ratio (LLR), computed as the difference between the genuine and spoofing log-likelihoods of the test speech, is used as the spoofing detection score. Many published results show that LLR-based GMMs perform better than DNN classifiers, especially in detecting unknown attacks. In this dissertation we train a five-layer DNN spoofing detection classifier using dynamic acoustic features and propose a novel, simple scoring method that uses only the genuine speech likelihood (GSL) for spoofing detection. We prove mathematically that the new GSL scoring method is more suitable for the spoofing detection task than the classical LLR scoring method. The experimental results on the ASVspoof 2015 database show that, compared to the GMM-LLR method, the DNN-GSL method significantly improves spoofing detection accuracy and improves the average equal error rate (EER) over all attack types by a factor of nearly 10. When the DNN-GSL spoofing detector is fused with an ASV system, the false acceptance rate (FAR) on unknown spoofing attacks falls from 38.47% to less than 0.41%.
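
The two scoring rules contrasted in the abstract, and the EER metric used to compare them, can be illustrated with a minimal Python sketch. This is not the dissertation's implementation. It assumes that per-frame log-likelihoods under a genuine model and a spoofing model are already available as arrays, and that an utterance score is the mean over frames. The function names `llr_score`, `gsl_score`, and `equal_error_rate` are hypothetical, and the EER routine is a simple threshold sweep rather than the official ASVspoof scoring tool.

```python
import numpy as np


def llr_score(genuine_loglik, spoof_loglik):
    """Classical LLR score: mean per-frame difference between the
    genuine and spoofing log-likelihoods (assumed inputs)."""
    genuine = np.asarray(genuine_loglik, dtype=float)
    spoof = np.asarray(spoof_loglik, dtype=float)
    return float(np.mean(genuine - spoof))


def gsl_score(genuine_loglik):
    """GSL-style score: uses only the genuine log-likelihoods.
    Averaging over frames is an assumption of this sketch."""
    return float(np.mean(np.asarray(genuine_loglik, dtype=float)))


def equal_error_rate(genuine_scores, spoof_scores):
    """Approximate EER: sweep every observed score as a threshold and
    return the point where false acceptance and false rejection meet."""
    genuine = np.asarray(genuine_scores, dtype=float)
    spoof = np.asarray(spoof_scores, dtype=float)
    thresholds = np.sort(np.concatenate([genuine, spoof]))
    best_gap, eer = np.inf, 1.0
    for t in thresholds:
        far = np.mean(spoof >= t)    # spoofed utterances accepted
        frr = np.mean(genuine < t)   # genuine utterances rejected
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2.0
    return float(eer)
```

Higher scores mean "more likely genuine" under both rules, so the same `equal_error_rate` routine can be applied to either set of utterance scores to compare them on equal terms.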
Keywords/Search Tags:speaker verification, spoofing detection, Gaussian mixture model, deep neural networks, DNN-FBCC, LLR, GSL