A Method For Detection Of Spoofing Speech Based On Joint Features And Random Forest

Posted on: 2022-12-30
Degree: Master
Type: Thesis
Country: China
Candidate: J Q Yu
GTID: 2518306605497584
Subject: Electronics and Communications Engineering
Abstract/Summary:
In recent years, biometric technology has made continuous progress, and automatic speaker verification is one such technology. However, automatic speaker verification systems are vulnerable to malicious spoofing attacks. As the technology matures, it will be used in many fields, so reliable countermeasures against spoofing attacks are needed. The main attacks on speaker verification systems are human impersonation, replay of recordings, speech synthesis, and voice conversion. This thesis focuses on two of these: speech synthesis and voice conversion.

Spoofed speech detection usually decides whether a speech signal is genuine by analyzing and comparing the acoustic characteristics of genuine and spoofed speech signals. In current research, acoustic features are the main feature parameters used to distinguish genuine speech from spoofed speech. Typical examples are Mel Frequency Cepstral Coefficients (MFCC) and the more recently proposed Constant Q Cepstral Coefficients (CQCC), which are based on the constant Q transform. Acoustic features alone, however, cannot fully describe the information in a speech signal, so this thesis introduces texture features to improve the feature parameters of the detection system. It proposes a spoofed speech detection method that combines Uniform Local Binary Pattern (ULBP) texture features with CQCC acoustic features. ULBP is used to extract a texture feature vector from the spectrogram of the speech signal; this vector is combined with the acoustic feature vector, and the resulting joint feature vector is used to train the classifier that performs the detection. Because different classifiers suit different types of features to different degrees, this thesis chooses the Random Forest (RF) model as the classifier that separates genuine speech from spoofed speech. The experimental results show that the random forest model fits the joint features proposed in this thesis well: with the joint features, its classification efficiency and performance are better than those of the commonly used support vector machine classifier, which improves spoofed speech detection. The results also show that detection based on the joint features performs better than detection based on a single feature.

To further strengthen the texture features of the spectrogram, this thesis applies global texture analysis and, by introducing the gray-level co-occurrence matrix (GLCM), proposes a new joint feature for spoofed speech detection, CQCC-ULBP-GLCM. The method extracts a global texture feature vector from the spectrogram and combines it with the ULBP feature vector and CQCC to obtain the CQCC-ULBP-GLCM feature vector. The GLCM features describe the gray-level information of the spectrogram with respect to direction, magnitude of change, and local neighborhood distribution, and so add global texture information to the texture feature. Experimental results show that GLCM improves the detection method based on the CQCC-ULBP joint feature and further improves the performance of the detection system. The method based on CQCC-ULBP-GLCM is also more robust to noise when dealing with spoofing attacks.
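
The feature pipeline described above can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names, the 3x3 (radius 1, 8 neighbour) ULBP variant, the single horizontal GLCM offset, and the three GLCM statistics chosen here are all assumptions made for illustration. The spectrogram is assumed to be already quantized to integer gray levels, and the CQCC vector is assumed to be computed elsewhere and passed in as a plain list.

```python
from typing import List

Image = List[List[int]]  # quantized grayscale spectrogram (assumed input)

# The 8 neighbours of a pixel in circular (clockwise) order, radius 1.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]


def transitions(code: int) -> int:
    """Count 0/1 transitions in the circular 8-bit pattern."""
    bits = [(code >> i) & 1 for i in range(8)]
    return sum(bits[i] != bits[(i + 1) % 8] for i in range(8))


# Uniform patterns have at most two transitions: 58 of the 256 codes.
UNIFORM_CODES = [c for c in range(256) if transitions(c) <= 2]
BIN_OF = {c: i for i, c in enumerate(UNIFORM_CODES)}  # bins 0..57
NON_UNIFORM_BIN = len(UNIFORM_CODES)  # all other codes share bin 58


def ulbp_histogram(img: Image) -> List[float]:
    """Normalized 59-bin uniform LBP histogram over interior pixels."""
    hist = [0.0] * (NON_UNIFORM_BIN + 1)
    rows, cols = len(img), len(img[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            code = 0
            for bit, (dr, dc) in enumerate(NEIGHBOURS):
                if img[r + dr][c + dc] >= img[r][c]:
                    code |= 1 << bit
            hist[BIN_OF.get(code, NON_UNIFORM_BIN)] += 1
    total = sum(hist) or 1.0  # avoid division by zero on tiny images
    return [h / total for h in hist]


def glcm_features(img: Image, levels: int,
                  dr: int = 0, dc: int = 1) -> List[float]:
    """Contrast, angular second moment and homogeneity of a symmetric,
    normalized GLCM for one pixel offset (dr, dc)."""
    glcm = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(img), len(img[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = img[r][c], img[r2][c2]
                glcm[i][j] += 1  # count the pair in both directions
                glcm[j][i] += 1  # so the matrix is symmetric
    total = sum(map(sum, glcm)) or 1.0
    contrast = asm = homogeneity = 0.0
    for i in range(levels):
        for j in range(levels):
            p = glcm[i][j] / total
            contrast += p * (i - j) ** 2
            asm += p * p
            homogeneity += p / (1 + (i - j) ** 2)
    return [contrast, asm, homogeneity]


def joint_feature(cqcc: List[float], img: Image, levels: int) -> List[float]:
    """Concatenate acoustic (CQCC), local (ULBP) and global (GLCM) parts."""
    return list(cqcc) + ulbp_histogram(img) + glcm_features(img, levels)
```

In this sketch the spectrogram must be at least 3x3 and every gray value must be an integer below `levels`. One joint vector per utterance would then be passed, together with a genuine/spoofed label, to a classifier such as a random forest; the classifier is left out here to keep the example dependency-free.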
Keywords/Search Tags: spoofing speech detection, acoustic feature, texture features, uniform local binary pattern, gray-level co-occurrence matrix, random forest, support vector machine