Research On Score Fusion Based On Attack Methods And Replay Configuration In Speaker Verification Anti-spoofing

Posted on: 2021-04-08
Degree: Master
Type: Thesis
Country: China
Candidate: H Tang
GTID: 2428330620468759
Subject: Computer Science and Technology
Abstract/Summary:
Automatic speaker verification (ASV) is a biometric technology that automatically determines a speaker's identity from his or her voice signal. In real authentication scenarios, a fraudster can use voice conversion, speech synthesis, or replayed recordings to produce spoofed speech that closely resembles genuine speech. At present, ASV systems have difficulty detecting the subtle differences between spoofed and genuine speech, so spoofed speech can easily pass verification, which poses a serious threat to the security of ASV systems. Anti-spoofing for speaker verification has therefore received much attention in recent years.

This thesis studies score fusion for Gaussian Mixture Model (GMM), i-vector, and Light Convolutional Neural Network (LCNN) models in speaker verification anti-spoofing. To improve the accuracy and robustness of the anti-spoofing system, we propose fusing the scores of GMM, i-vector, and LCNN models that are built for different attack methods and replay configurations. The experiments are carried out on the ASVspoof challenge data sets. The main work of this thesis is summarized as follows.

First, three score fusion methods (probability normalization, linear regression, and support vector machine (SVM)) are applied to GMM models built for different attack methods and scenarios. Experimental results on the ASVspoof2015 and ASVspoof2019 challenge data sets show that GMM models based on different attack methods and scenarios, fused with an SVM, perform significantly better than the baseline GMM model.

Second, the thesis uses Probabilistic Linear Discriminant Analysis (PLDA), SVM, and cosine distance as back-ends for the i-vector model, and compares the typical i-vector model with i-vector models based on different attack methods and replay configurations. Experimental results show that the SVM back-end outperforms the PLDA and cosine distance back-ends. The scores of the i-vector model with cosine distance scoring are then fused using probability normalization, linear regression, and SVM across the different attack methods and replay configurations. Experimental results show that the performance of the i-vector model with cosine distance scoring improves further after fusion with an SVM.

Finally, we propose using the speaker embedding of the LCNN model as a new feature in place of the i-vector, scored with PLDA, SVM, and cosine distance. Experimental results show that the LCNN model based on different attack methods and replay configurations performs best with the SVM back-end. We then propose fusing the cosine distance scores of these embeddings using probability normalization, linear regression, and SVM across the different attack methods and replay configurations. Experimental results show that using an SVM for score fusion further improves the performance of the model.
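As a rough illustration of the score-fusion idea summarized above, the following Python sketch scores an embedding by cosine similarity against one reference per attack condition and then fuses the per-condition scores with least-squares linear regression. It is a minimal sketch under stated assumptions, not the thesis's implementation: the function names, the two-dimensional toy embeddings, the reference vectors, and the choice of "similarity to genuine minus similarity to spoof" as the subsystem score are all hypothetical and chosen only to keep the example self-contained.

```python
import math


def cosine_score(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def subsystem_scores(embedding, genuine_ref, spoof_refs):
    """One score per attack condition (assumed form, not from the thesis):
    similarity to the genuine reference minus similarity to that
    condition's spoof reference."""
    g = cosine_score(embedding, genuine_ref)
    return [g - cosine_score(embedding, ref) for ref in spoof_refs]


def fit_linear_fusion(score_rows, labels, ridge=1e-6):
    """Least-squares weights (last entry is the bias) mapping subsystem
    scores to a 0/1 label, via the normal equations with a tiny ridge term."""
    rows = [list(r) + [1.0] for r in score_rows]
    n = len(rows[0])
    a = [[sum(r[i] * r[j] for r in rows) + (ridge if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    b = [sum(r[i] * y for r, y in zip(rows, labels)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= f * a[col][c]
            b[r] -= f * b[col]
    # Back substitution.
    w = [0.0] * n
    for i in range(n - 1, -1, -1):
        w[i] = (b[i] - sum(a[i][j] * w[j] for j in range(i + 1, n))) / a[i][i]
    return w


def fuse(scores, weights):
    """Fused score: weighted sum of subsystem scores plus bias."""
    return sum(s * w for s, w in zip(scores, weights[:-1])) + weights[-1]


# Made-up toy data: 2-D "embeddings", one genuine reference, and two
# hypothetical attack conditions (e.g. one spoofing algorithm, one replay setup).
genuine_ref = [1.0, 0.0]
spoof_refs = [[0.0, 1.0], [-1.0, 0.0]]
dev_embeddings = [[0.9, 0.1], [1.0, -0.1], [0.1, 0.9], [-0.8, 0.2]]
dev_labels = [1.0, 1.0, 0.0, 0.0]  # 1 = genuine, 0 = spoofed

dev_scores = [subsystem_scores(e, genuine_ref, spoof_refs) for e in dev_embeddings]
weights = fit_linear_fusion(dev_scores, dev_labels)
fused_scores = [fuse(s, weights) for s in dev_scores]
```

In this sketch a higher fused score indicates genuine speech, and a decision threshold would be chosen on development data. An SVM-based fusion, which the abstract reports as performing best, would replace `fit_linear_fusion` with a classifier trained on the same per-condition score vectors.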
Keywords/Search Tags: Speaker verification, score fusion, GMM model, i-vector model, LCNN model