
Research On Robustness Of Speaker Recognition In Noisy Environment

Posted on: 2019-02-27
Degree: Master
Type: Thesis
Country: China
Candidate: H R Zhang
Full Text: PDF
GTID: 2428330566999273
Subject: Electronic and communication engineering
Abstract/Summary:
Speaker recognition, also known as voiceprint recognition, is the process of automatically identifying or verifying a speaker's identity from the feature information contained in the speaker's voice. The technology lets service systems use voice to control access to restricted services (such as automated banking), information resources (depending on the user's access rights), or restricted areas (such as government or research institutions). It can also be used for speaker detection tasks such as voice-based information retrieval in audio archives, forensic analysis to identify criminals, and personalization of user devices. After years of research, current speaker recognition systems achieve quite satisfactory results. However, noise robustness in practical applications remains an unsolved problem and is a major obstacle to everyday use of the technology. This thesis addresses the problem in three parts.

First, the noise robustness of GFCC features processed with principal component analysis (PCA) is analyzed for speaker recognition. The system performance of MFCC and GFCC features is compared under white, babble, and destroyerops noise at different signal-to-noise ratios (SNRs). PCA is then applied to the GFCC features as a preprocessing step, and the performance of the processed features is evaluated experimentally. The results show that PCA improves GFCC system performance to some extent at low SNR.

Second, the thesis introduces the i-vector/PLDA framework, which has performed strongly in recent evaluations. It discusses the basic principle and extraction process of the i-vector, the factor analysis theory behind the G-PLDA model, and i-vector channel or noise compensation methods. Speaker recognition systems based on GMM-UBM and on i-vector/PLDA are built, and their performance in noisy environments is analyzed and compared. i-vector channel compensation methods such as the LDA transform, length normalization, and data whitening are then added to the system; the experimental results show that channel compensation greatly improves system performance.

Finally, the thesis proposes applying a deep neural network (DNN) regression model for feature mapping to the i-vector/PLDA speaker recognition system. The DNN fits the nonlinear function between the noisy-speech i-vector and the clean-speech i-vector, yielding an approximation of the clean-speech i-vector and thereby reducing the influence of noise on system performance. Experiments on the TIMIT dataset verify the feasibility and effectiveness of the method.
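The experiments summarized above evaluate systems under noise at different SNRs. As an illustration only, the following minimal sketch shows one common way to mix a noise recording into clean speech at a target SNR. It is not taken from the thesis: the function names `mean_power`, `mix_at_snr`, and `measured_snr_db` are hypothetical, the sine wave and Gaussian samples are stand-ins for real speech and noise recordings, and the thesis may have prepared its noisy data differently.

```python
import math
import random


def mean_power(signal):
    """Average power (mean squared amplitude) of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(clean, noise, snr_db):
    """Return clean + scaled noise so that the mixture has the requested SNR (dB).

    Assumes `noise` is at least as long as `clean` and that neither is silent.
    """
    if len(noise) < len(clean):
        raise ValueError("noise must be at least as long as the clean signal")
    noise = noise[:len(clean)]
    p_clean = mean_power(clean)
    p_noise = mean_power(noise)
    if p_clean == 0 or p_noise == 0:
        raise ValueError("signals must have non-zero power")
    # SNR_dB = 10 * log10(P_clean / (g^2 * P_noise)); solve for the gain g.
    gain = math.sqrt(p_clean / (p_noise * 10 ** (snr_db / 10)))
    return [c + gain * n for c, n in zip(clean, noise)]


def measured_snr_db(clean, noisy):
    """SNR of `noisy` relative to `clean`, treating the difference as noise."""
    residual = [y - c for c, y in zip(clean, noisy)]
    return 10 * math.log10(mean_power(clean) / mean_power(residual))


# Hypothetical stand-ins: a 440 Hz tone for speech, Gaussian samples for white noise.
rng = random.Random(0)
clean = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]

for snr in (0, 5, 10, 20):
    noisy = mix_at_snr(clean, noise, snr)
    print(snr, round(measured_snr_db(clean, noisy), 6))
```

Scaling the noise rather than the speech keeps the speech level fixed across conditions, so differences in recognition performance can be attributed to the noise level alone.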
Keywords/Search Tags:speaker recognition, deep neural network, i-vector, PLDA, noise robust