
Research And Implementation Of Synthetic Speech Detection

Posted on: 2022-04-03
Degree: Master
Type: Thesis
Country: China
Candidate: S F Wu
GTID: 2518306524990619
Subject: Master of Engineering
Abstract/Summary:
With the rapid development of informatization and intelligent technologies, biometric identification is being applied in more and more fields. Among biometric traits, the voiceprint is both distinctive and relatively stable, and it is used in a growing number of authentication systems. With the development of speech synthesis technology, computers can generate synthetic speech that closely resembles natural human speech. This technology improves the quality of automatic speech response services, but it also challenges the security of speaker authentication systems. Criminals can easily obtain a user's voice recordings, use a synthesis algorithm to generate matching synthetic speech, and then attack the user's telephone banking, access control, and other systems, which seriously endangers the user's personal and property safety.

The goal of this thesis is to study existing speech synthesis systems, analyze their defects and the differences between synthesized and natural speech, and design new feature extraction algorithms and detection models for synthetic speech. The detection model is further optimized to make it more robust on noisy speech. The main work of the thesis is as follows:

1. By analyzing existing speech synthesis technology and its output, and with reference to existing speech feature extraction techniques, a feature extraction algorithm called the Symmetric Mel Cepstral Coefficient (SMFCC) is designed. Comparative experiments show that the algorithm performs well on synthetic speech detection tasks.

2. An end-to-end synthetic speech detection model based on the temporal convolutional network (EETCN) is designed and implemented. The optimal hyperparameter combination is obtained by tuning the parameters of EETCN. Experimental comparison with the Gaussian mixture model, the deep neural network model, and other synthetic speech detection systems shows that the EETCN model performs well on synthetic speech detection tasks.

3. A synthetic speech detection model based on residual shrinkage (RSBU-EETCN) is designed. The influence of noise on the performance of the EETCN model is studied, and the EETCN model is improved by incorporating the deep residual shrinkage module, which performs well at noise reduction. Comparative experiments against Wiener filtering and MMSE speech enhancement algorithms show that the RSBU-EETCN model has a performance advantage in detecting noisy speech.

4. Building on this research into synthetic speech detection and model construction, and combining several software development technologies, a prototype system for synthetic speech detection services is designed and implemented. The system implements the algorithms of the EETCN and RSBU-EETCN models, provides model training, model update and management, and intelligent synthetic speech detection services, and has been tested.
Keywords/Search Tags: Spoofing Detection, Speech Feature Extraction, Deep Learning
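
The abstract names two building blocks without showing them: the temporal convolutional network behind EETCN and the residual shrinkage module behind RSBU-EETCN. As a rough illustration only, the sketch below shows the operations these techniques are generally built on: a causal dilated 1-D convolution (the basic layer of a temporal convolutional network) and soft thresholding (the shrinkage step used in deep residual shrinkage networks). The function names, the fixed threshold `tau`, and the toy inputs are assumptions made for this example; the abstract does not give the thesis's layer configuration or how its thresholds are obtained.

```python
def causal_dilated_conv1d(x, kernel, dilation=1):
    """Causal dilated convolution: y[t] = sum_k kernel[k] * x[t - k * dilation].

    Samples before t = 0 are treated as zero, so y[t] never depends on
    future inputs. This is a hypothetical single-channel illustration,
    not the thesis's EETCN layer.
    """
    if dilation < 1:
        raise ValueError("dilation must be >= 1")
    y = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:  # implicit zero padding on the left
                acc += w * x[idx]
        y.append(acc)
    return y


def soft_threshold(x, tau):
    """Soft thresholding: sign(v) * max(|v| - tau, 0) for each value v.

    Values with |v| <= tau are set to zero; larger values are shrunk
    toward zero by tau. Here tau is a fixed, assumed constant; in
    residual shrinkage networks it is typically produced by a small
    learned sub-network.
    """
    if tau < 0:
        raise ValueError("tau must be non-negative")
    out = []
    for v in x:
        mag = abs(v) - tau
        if mag <= 0:
            out.append(0.0)
        else:
            out.append(mag if v > 0 else -mag)
    return out
```

For example, `soft_threshold(causal_dilated_conv1d(frame, [0.5, 0.5], dilation=2), 0.1)` would smooth a list of samples `frame` using only past values and then zero out low-magnitude responses, which is the intuition behind using shrinkage to suppress noise-dominated features.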