
Research Of Feature Parameters In Playback Speech Detection

Posted on: 2021-05-23
Degree: Master
Type: Thesis
Country: China
Candidate: J Zhou
GTID: 2428330614461599
Subject: Software engineering
Abstract/Summary:
Compared with other biometric recognition technologies, speaker recognition has the advantages of being contactless, requiring nothing to be memorized, and offering a fast acquisition process. It has become a commonly used verification method. However, with the emergence of various recording devices, it has become easier for attackers to attack an authentication system with playback speech. Preventing playback speech attacks has therefore become a focus of speaker recognition research.

Based on the ASVspoof 2017 challenge database, this thesis analyzes playback speech produced by different kinds of recording and playback equipment and focuses on the extraction of speech feature parameters. Traditional feature extraction methods do not fully extract and use the information in the speech spectrum, and they strengthen the spectral information of the low-frequency layer, which hurts attack detection performance. In fact, the main difference between original speech and playback speech lies in the high-frequency layer. To address these shortcomings, a fusion feature detection algorithm based on the Fisher criterion and a cepstrum feature detection algorithm based on a multi-layer filter are proposed to better detect playback speech.

The main contents of this thesis are as follows:

1. The influence of pre-processing on the signal is analyzed in detail, and the current mainstream speech feature parameters (MFCC, LFCC, LPC, IMFCC and CQCC) are studied and implemented, which lays the foundation for improving the feature parameters. Three classical classification models are studied and tested: the Gaussian Mixture Model (GMM), the Support Vector Machine (SVM) and the Gaussian Mixture Model-Universal Background Model (GMM-UBM). The results show that GMM has the highest recognition performance.

2. The distinguishing features of original speech and playback speech are explored from multiple angles, such as waveform, spectrogram and frequency information, to better understand the essential differences between the two. Experimental analysis shows that the differences between the two kinds of speech lie mostly in the high-frequency layer, while the differences in the low-frequency layer are small and easily affected by the type of equipment.

3. Using the characteristics of the MFCC, LFCC and IMFCC feature parameters together with the Fisher criterion, the feature components with better distinguishing ability are selected, and a fusion feature detection algorithm based on the Fisher criterion is proposed. Experimental comparisons are made across different Gaussian orders, different feature parameters, feature combinations and time complexity. The results show that the algorithm effectively improves the detection performance and running efficiency of the system.

4. To exploit the difference in the frequency spectrum, an Inverse-Mel filter is used in the high-frequency layer to enhance the extraction of speaker information and highlight the differences. In the low-frequency layer, a combination of a linear filter and a Mel filter is used to avoid overlap among feature parameters. The L-M-I filter bank is obtained by fusing these filter layers, and a new cepstrum feature is formed from it. The thesis explores the effects of the pre-emphasis coefficient, dynamic characteristics, CMVN and Gaussian order on the detection results, and demonstrates the feasibility and effectiveness of the algorithm.

The experimental results show that the multi-layer filter detection algorithm performs best when the classifier is GMM. On the test data set, the equal error rate is 2.57%; compared with MFCC, CQCC, LFCC, IMFCC and L-I, it decreases by 12.86%, 9.66%, 4.51%, 3.33% and 1.63%, respectively. The algorithm also gives stable detection results with SVM and AdaBoost classifiers. Finally, the playback speech detection algorithm is combined with a speaker verification system, where it can effectively resist playback speech attacks.
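The Fisher-criterion selection step can be illustrated with a small sketch. The abstract does not give the exact selection rule, so this assumes the standard two-class Fisher ratio (squared difference of class means over the sum of class variances) computed per feature component. The function names and the synthetic data are hypothetical and are not taken from the thesis or from ASVspoof 2017.

```python
import numpy as np

def fisher_ratio(genuine, replay, eps=1e-12):
    """Per-component Fisher ratio between two classes.

    genuine, replay: arrays of shape (n_frames, n_components), e.g. frames of
    concatenated cepstral coefficients (assumed layout, not from the thesis).
    """
    mu_g, mu_r = genuine.mean(axis=0), replay.mean(axis=0)
    var_g, var_r = genuine.var(axis=0), replay.var(axis=0)
    # Between-class separation divided by within-class spread; eps avoids 0/0.
    return (mu_g - mu_r) ** 2 / (var_g + var_r + eps)

def select_top_k(genuine, replay, k):
    """Indices of the k components with the largest Fisher ratio (ascending order)."""
    scores = fisher_ratio(genuine, replay)
    return np.sort(np.argsort(scores)[::-1][:k])

# Synthetic toy data: only component 2 differs between the two classes.
rng = np.random.default_rng(0)
genuine = rng.normal(0.0, 1.0, size=(500, 4))
replay = rng.normal(0.0, 1.0, size=(500, 4))
replay[:, 2] += 3.0
selected = select_top_k(genuine, replay, k=1)
```

In a fused-feature setting, the columns would be the stacked components of several feature types, and only the selected columns would be passed to the classifier.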
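Two of the processing steps whose effect is studied, pre-emphasis and CMVN, have standard textbook definitions that can be sketched briefly. The coefficient 0.97 below is a common default and is only an assumption; the abstract does not state which values were tested. Function names are hypothetical.

```python
import numpy as np

def pre_emphasis(signal, alpha=0.97):
    """First-order high-pass filter: y[n] = x[n] - alpha * x[n-1].

    alpha=0.97 is a common default, not a value reported in the abstract.
    """
    signal = np.asarray(signal, dtype=float)
    # The first sample has no predecessor and is passed through unchanged.
    return np.append(signal[0], signal[1:] - alpha * signal[:-1])

def cmvn(features, eps=1e-12):
    """Cepstral mean and variance normalization over one utterance.

    features: array of shape (n_frames, n_coeffs). Each coefficient is shifted
    to zero mean and scaled to unit variance across the frames.
    """
    features = np.asarray(features, dtype=float)
    return (features - features.mean(axis=0)) / (features.std(axis=0) + eps)

emphasized = pre_emphasis([1.0, 1.0, 1.0], alpha=0.5)
normalized = cmvn(np.random.default_rng(0).normal(5.0, 2.0, size=(200, 3)))
```

A larger pre-emphasis coefficient boosts high frequencies more strongly, which is relevant here because the abstract locates the main difference between original and playback speech in the high-frequency layer.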
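The reported metric, the equal error rate (EER), is the error rate at the decision threshold where the false acceptance rate equals the false rejection rate. The sketch below assumes that a higher score means "more likely original speech" (for example, a log-likelihood ratio between two GMMs, which is an assumption about the scoring and not something the abstract specifies). The function name and the toy scores are hypothetical.

```python
import numpy as np

def equal_error_rate(genuine_scores, replay_scores):
    """Approximate EER by scanning every observed score as a threshold."""
    genuine_scores = np.asarray(genuine_scores, dtype=float)
    replay_scores = np.asarray(replay_scores, dtype=float)
    thresholds = np.sort(np.concatenate([genuine_scores, replay_scores]))
    # FAR: playback trials accepted (score >= threshold).
    far = np.array([(replay_scores >= t).mean() for t in thresholds])
    # FRR: original-speech trials rejected (score < threshold).
    frr = np.array([(genuine_scores < t).mean() for t in thresholds])
    # Take the threshold where the two error rates are closest and average them.
    i = np.argmin(np.abs(far - frr))
    return (far[i] + frr[i]) / 2.0

# Perfectly separated toy scores give an EER of 0.0.
eer = equal_error_rate([2.0, 3.0, 4.0], [-1.0, 0.0, 1.0])
```

With a finite set of trials the two error curves rarely cross exactly, so averaging them at the closest threshold is one simple approximation; evaluation toolkits may interpolate instead.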
Keywords/Search Tags: Playback Speech Detection, Gaussian Mixture Model, Fisher Criterion, Improved Filter Bank, Cepstrum Feature