Research On Speaker Recognition In Distracting Environments

Posted on: 2021-08-25

Degree: Master

Type: Thesis

Country: China

Candidate: M L Yang

Full Text: PDF

GTID: 2518306200953099

Subject: Electronics and Communications Engineering

Abstract/Summary:

Speaker recognition is a technology that identifies a speaker from his or her speech. In recent years, speaker recognition methods based on i-vector and x-vector have been developed, but neither takes background noise into account, and the influence of interference environments on recognition performance has not been fully considered, which leads to poor performance in a variety of practical application scenarios. Although classical speech enhancement methods such as spectral subtraction have been applied effectively in speech recognition and language recognition, they do not carry over well to speaker recognition: while suppressing background noise, they also severely damage the acoustic structure of the speaker's speech, so speaker recognition on the noise-suppressed speech performs unsatisfactorily. This thesis therefore focuses on the robustness of speaker recognition in interference environments. First, the speaker recognition performance of three speech features, filter bank coefficients (Fbank), Mel-frequency cepstral coefficients (MFCC), and perceptual linear prediction coefficients (PLP), is studied on the i-vector and x-vector speaker recognition models, and the speaker features are screened accordingly. Second, because traditional spectral subtraction destroys the acoustic structure of the speaker's speech while suppressing background noise, which limits recognition performance, a deep neural network (DNN) speech enhancement module is proposed as the pre-processing unit of speaker recognition to reduce the influence of the interference environment. Finally, to compensate for the speech distortion introduced by speech enhancement, a generative adversarial network (GAN) is constructed on the basis of the DNN speech enhancement network, which serves as the pre-processing unit of speaker recognition, to expand the data of registered speakers and thereby enhance their identity feature vectors. The result is a speaker recognition model based on DNN denoising and identity vector enhancement, which further improves speaker recognition performance in interference environments.

Speaker recognition tests in multiple types of interference environment show that, under the "noisy" noise condition, the proposed model based on DNN denoising and identity vector enhancement improves the average equal error rate (EER) and minimum detection cost function (MinDCF16) by 61.92% and 20.32%, respectively, compared with the x-vector baseline model. Under factory noise interference, the average EER and MinDCF16 of the proposed method improve by 48.15% and 11.45%, respectively, compared with the baseline model. Under music noise interference, the average EER and MinDCF16 improve by 55.00% and 18.21%, respectively. Under traffic noise, the average EER and MinDCF16 improve by 56.46% and 20.69%, respectively, compared with the baseline model. To sum up, the model proposed in this thesis significantly improves the performance of speaker recognition in interference environments.
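
The results above are reported as improvements in EER over a baseline. The abstract does not say how the EER or the percentages were computed, so the following is only a minimal sketch of one common approach, assuming a trial is accepted when its score is at or above a threshold and assuming "improved by" means a relative reduction. The function names `equal_error_rate` and `relative_improvement` and the toy scores are hypothetical illustrations, not data or code from the thesis.

```python
import numpy as np


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping every observed score as a threshold.

    Assumption: a trial is accepted when score >= threshold.
    Returns (eer, threshold) at the point where the false acceptance rate
    and the false rejection rate are closest to each other.
    """
    target = np.asarray(target_scores, dtype=float)
    nontarget = np.asarray(nontarget_scores, dtype=float)
    thresholds = np.unique(np.concatenate([target, nontarget]))

    # False acceptance: non-target trials that are accepted.
    far = np.array([(nontarget >= t).mean() for t in thresholds])
    # False rejection: target trials that are rejected.
    frr = np.array([(target < t).mean() for t in thresholds])

    idx = int(np.argmin(np.abs(far - frr)))
    return (far[idx] + frr[idx]) / 2.0, thresholds[idx]


def relative_improvement(baseline, proposed):
    """Relative reduction of an error metric, in percent (assumed definition)."""
    return 100.0 * (baseline - proposed) / baseline


# Toy verification scores, invented for illustration only.
toy_target = [0.9, 0.8, 0.7, 0.3]
toy_nontarget = [0.6, 0.4, 0.2, 0.1]
toy_eer, toy_threshold = equal_error_rate(toy_target, toy_nontarget)
print(f"EER = {toy_eer:.2%} at threshold {toy_threshold}")
```

Under this assumed definition, lowering a baseline EER from 10% to 4% would count as a 60% improvement, since `relative_improvement(10.0, 4.0)` evaluates the reduction relative to the baseline value.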

Keywords/Search Tags:

speaker recognition, DNN denoising, generative adversarial network, i-vector, x-vector

Related items

1	The Research Of The Speaker Recognition System Using Low-Dimensional Vector Representations
2	Research On Speaker Recognition Based On Deep Belief Network And Vector Quantization
3	Many-to-Many Voice Conversion Algorithm Based On Dense Net Star Generative Adversarial Network Combining I-vector For Non-parallel Corpora
4	Research On Generative Adversarial Network Of Facial Image Generation Based On Self-encoder Structure
5	Research And Development Of Speaker Recognition System
6	Research Of Speaker Recognition Based On I-vector
7	Research On Speaker Recognition Of The Robustness Based On I-vector
8	Studies On Speaker Recognition Based On SVM And GMM
9	Research On Classification Method Of Imbalanced Data Set Based On Generative Adversarial Network
10	Research And Implementation Of Dark Image Denoising And Enhancement Algorithm Based On Generative Adversarial Networks