
Speaker Recognition Based On EMD

Posted on: 2011-09-24
Degree: Master
Type: Thesis
Country: China
Candidate: Y L Liu
Full Text: PDF
GTID: 2218330338977148
Subject: Circuits and Systems
Abstract/Summary:
Among biometric techniques, speaker recognition has become a prominent means of identification because it is convenient, economical and accurate. It is now widely applied in electronic business, helpdesks, forensics, telephone banking and similar services. Speaker features are the foundation of any speaker recognition system. At present, most studies extract speaker features by short-time analysis or the Fourier transform. Speech, however, is a typical nonlinear signal, so extracting speaker features with linear signal analysis inevitably discards some important information. To address this, the thesis carries out a series of studies. The main work is as follows.

First, the thesis improves an existing feature. It applies psychoacoustic weighting to mel-cepstrum analysis, using the Signal-to-Mask Ratios (SMRs) obtained from a psychoacoustic model as the weighting function to obtain weighted mel-cepstrum coefficients (WMCEP).

Second, the thesis introduces a nonlinear signal analysis method, the Hilbert-Huang Transform (HHT), which consists of Empirical Mode Decomposition (EMD) and Hilbert Spectral Analysis (HSA). EMD is combined with short-time analysis to analyze speech signals and extract three kinds of speaker features. In the experiments, an SVM is used for speaker recognition: in the training stage it builds the speaker models, and in the prediction stage it compares the extracted features against the models built during training. To assess the SVM's classification ability, a GMM is also used as a comparison model.

Third, the thesis analyzes, from a theoretical perspective, the feasibility and effectiveness of extracting speaker features by combining EMD with short-time analysis. The analysis rests on two tools: the Hilbert spectrum and marginal spectrum obtained from HSA, and the residual phase.

As a new attempt, the thesis uses EMD to propose new ways of extracting speaker features; the approach has a theoretical basis and shows practical effect. It may also benefit future work on automatic speech recognition and speaker recognition.
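
To make the EMD step mentioned in the abstract more concrete, the following is a minimal sketch of the sifting procedure, not the thesis's own implementation. It makes two simplifying assumptions: the envelopes are piecewise-linear (EMD is normally described with cubic-spline envelopes), and each intrinsic mode function (IMF) is sifted a fixed number of times rather than until a stopping criterion is met. The function names `local_extrema`, `envelope`, `sift_imf` and `emd`, and the parameters `n_sift` and `max_imfs`, are hypothetical and chosen only for illustration.

```python
import math


def local_extrema(x):
    """Return index lists (maxima, minima) of the interior local extrema of x."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i - 1] < x[i] >= x[i + 1]:
            maxima.append(i)
        elif x[i - 1] > x[i] <= x[i + 1]:
            minima.append(i)
    return maxima, minima


def envelope(x, idx):
    """Piecewise-linear envelope through the points (i, x[i]) for i in idx.

    Assumption for this sketch: linear interpolation stands in for the
    cubic splines usually used, and the envelope is held constant
    outside the first and last extremum. idx must hold at least 2 indices.
    """
    n = len(x)
    env = [0.0] * n
    for i in range(idx[0]):
        env[i] = x[idx[0]]
    for a, b in zip(idx, idx[1:]):
        for i in range(a, b):
            t = (i - a) / (b - a)
            env[i] = (1.0 - t) * x[a] + t * x[b]
    for i in range(idx[-1], n):
        env[i] = x[idx[-1]]
    return env


def sift_imf(x, n_sift=10):
    """Extract one IMF candidate by repeatedly subtracting the envelope mean.

    Assumption: a fixed number of sifting passes replaces a proper
    stopping criterion.
    """
    h = list(x)
    for _ in range(n_sift):
        maxima, minima = local_extrema(h)
        if len(maxima) < 2 or len(minima) < 2:
            break
        upper = envelope(h, maxima)
        lower = envelope(h, minima)
        h = [v - 0.5 * (u + l) for v, u, l in zip(h, upper, lower)]
    return h


def emd(x, max_imfs=5):
    """Decompose x into a list of IMFs plus a residue, so that
    x[i] equals the sum of all IMFs at i plus residue[i] (up to rounding)."""
    residue = list(x)
    imfs = []
    for _ in range(max_imfs):
        maxima, minima = local_extrema(residue)
        if len(maxima) < 2 or len(minima) < 2:
            break  # too few oscillations left to extract another IMF
        imf = sift_imf(residue)
        imfs.append(imf)
        residue = [r - c for r, c in zip(residue, imf)]
    return imfs, residue


# Toy two-tone signal standing in for one short-time speech frame.
n = 400
frame = [math.sin(2 * math.pi * 5 * i / n) + 0.5 * math.sin(2 * math.pi * 40 * i / n)
         for i in range(n)]
imfs, residue = emd(frame)
```

In a pipeline like the one the abstract outlines, such a decomposition would be applied frame by frame, and features would then be computed from the IMFs. The specific features used in the thesis are not given in the abstract, so none are reproduced here.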
Keywords/Search Tags: speaker recognition, Hilbert-Huang Transform, Empirical Mode Decomposition, Hilbert Spectral Analysis, marginal spectrum, WMCEP, SVM, GMM