The Key Technology Research Of Speaker Recognition In Practical Environment

Posted on: 2018-07-01
Degree: Master
Type: Thesis
Country: China
Candidate: X Li
Full Text: PDF
GTID: 2428330596989192
Subject: Electronics and Communications Engineering
Abstract/Summary:
Speaker recognition is an important branch of speech signal processing. Its main task is to verify the identity of a specific person from the characteristics of his or her speech. It belongs to the family of biometric recognition technologies, alongside fingerprint, face and iris recognition. Because it is convenient and contact-free, it has broad application prospects in network security, judicial identification, military encryption and similar fields. The technology has been under development since the 1940s, and with the progress of machine learning its performance has improved greatly, moving from academic research to practical application. However, background noise is unavoidable in practical environments. Under mismatched conditions in particular, recognition accuracy deteriorates dramatically, which limits the conditions under which speaker recognition can be applied.

This thesis focuses on speaker recognition under noise-mismatched conditions. Building on a state-of-the-art speaker recognition system, we use front-end speech enhancement and back-end model compensation to improve system performance so that it meets the demands of practical application. The main research contents are as follows.

First, we discuss the text-independent speaker recognition baseline system, which consists of two main parts: feature extraction and modeling. We use Mel-frequency cepstral coefficients (MFCC) as the speech feature and GMM-UBM as the basic model, and then extract i-vectors from the basic model by factor analysis to build the speaker recognition system. Our experiments show that the recognition accuracy of this baseline system deteriorates significantly in a noise-mismatched environment.

Second, we add a speech enhancement module to improve system performance. We mainly use speech enhancement based on noise estimation and introduce three noise estimation methods: IMCRA, MMSE-BC and MMSE-SPP. We then analyze how well each method estimates various kinds of non-stationary noise, and how much each improves recognition accuracy when used as a front-end process.

Third, we introduce a speaker model compensation algorithm, Probabilistic Linear Discriminant Analysis (PLDA), and describe its use in a noise-mismatched speaker recognition system. We then adjust the composition of the development corpus and analyze the noise robustness of the system in different situations.

Finally, we design two PLDA systems based on different classification schemes to obtain the best performance under practical conditions.
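To make the idea of noise-estimation-based enhancement as a front end more concrete, the following is a minimal sketch only. It does not implement IMCRA, MMSE-BC or MMSE-SPP; as a stand-in it assumes the first few frames contain no speech and averages their power to estimate the noise, then applies a Wiener-type gain per time-frequency bin. The function names (`frame_spectra`, `enhance_spectra`), the frame sizes, the gain floor and the synthetic test signal are all illustrative assumptions, not taken from the thesis.

```python
import numpy as np


def frame_spectra(x, frame_len=256, hop=128):
    """Return one-sided spectra of Hann-windowed frames, shape (frames, bins)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)


def enhance_spectra(x, n_noise_frames=10, floor=0.1):
    """Attenuate noisy bins using a naive noise estimate (assumption: the
    first n_noise_frames frames are speech-free). Returns (enhanced, gain)."""
    spec = frame_spectra(x)
    power = np.abs(spec) ** 2

    # Naive noise power estimate per frequency bin; a real system would
    # track non-stationary noise with an estimator such as IMCRA.
    noise_psd = power[:n_noise_frames].mean(axis=0)

    # A posteriori SNR, then a rough a priori SNR by power subtraction.
    snr_post = power / np.maximum(noise_psd, 1e-12)
    snr_prio = np.maximum(snr_post - 1.0, 0.0)

    # Wiener-type gain, floored to limit musical-noise artifacts.
    gain = np.maximum(snr_prio / (snr_prio + 1.0), floor)
    return spec * gain, gain


# Synthetic example: a 1 kHz tone (sample rate 8 kHz) that starts after a
# noise-only lead-in, embedded in white noise.
rng = np.random.default_rng(0)
t = np.arange(8000)
tone = np.sin(2 * np.pi * 1000 * t / 8000)
tone[:2000] = 0.0
noisy = tone + 0.1 * rng.standard_normal(8000)

enhanced, gain = enhance_spectra(noisy)
print(enhanced.shape, gain[-1, 32], gain[:10].mean())
```

In a pipeline like the one summarized above, features such as MFCC would then be computed from the enhanced spectra rather than the noisy ones. The leading-frames assumption breaks down for non-stationary noise, which is the case that motivates comparing dedicated noise estimators.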
Keywords/Search Tags: speaker recognition, speech enhancement, Gaussian mixture model, i-vector, probabilistic linear discriminant analysis