
Research On Personality Characteristic Based Speaker Recognition In Noisy Environment

Posted on: 2013-11-28
Degree: Master
Type: Thesis
Country: China
Candidate: M M Li
GTID: 2248330392450829
Subject: Circuits and Systems
Abstract/Summary:
Speaker recognition is a biometric technique that recognizes a speaker's identity from speech parameters that carry the individual's physiological and behavioral characteristics. It has attracted much attention in identity recognition because it is economical, convenient, accurate, and hard to lose or forge, and research on it has developed rapidly. At present, speaker recognition is mostly implemented with short-time methods in quiet environments, lacks individuality characteristics and robust phonetic features, and relies on identification algorithms that are not classification-based. To address these problems, this paper carries out several research and design tasks. The innovations are as follows:

1. The paper introduces the Hilbert-Huang Transform (HHT), which consists of two parts: Empirical Mode Decomposition (EMD) and Hilbert Spectral Analysis (HSA). EMD is used to decompose the speech in the endpoint detection and feature extraction stages to obtain intrinsic mode function (IMF) components that meet the requirements, which enables speaker recognition for noisy speech.

2. In endpoint detection, EMD, which handles non-stationary and nonlinear signals well, is applied to noisy speech. First, the speaker's speech is decomposed by EMD and the first two IMF components, which are the most affected by noise, are removed. Then the IMF components that represent the original speech well are selected by mean and variance analysis to reconstruct the signal. Finally, endpoint detection is performed on the reconstructed signal with the traditional double-threshold method. The paper also studies the results of the proposed endpoint detection method at different SNRs.

3. In feature extraction, the IMF components that contain formant and pitch information are selected after EMD (IMF1, IMF2, and IMF4 contain formant information; IMF3 contains pitch information) and are reconstructed separately for speaker recognition. MFCC coefficients and the instantaneous frequency obtained via the Hilbert transform are used as their feature parameters.
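
The endpoint detection steps described in item 2 can be illustrated with a minimal Python sketch. It is not the thesis's implementation: it assumes the IMF components have already been computed by some EMD routine, it omits the mean and variance selection step because the abstract does not state the criterion, and it uses a simplified energy-only double-threshold rule. The function names, frame length, and threshold values are hypothetical choices made for illustration.

```python
import math


def reconstruct_from_imfs(imfs, skip=2):
    """Sum the IMF components after dropping the first `skip` ones.

    Assumption: `imfs` is a list of equal-length sample lists produced
    by an EMD routine that is not shown here.
    """
    kept = imfs[skip:]
    return [sum(samples) for samples in zip(*kept)]


def short_time_energy(signal, frame_len, hop):
    """Energy of each frame of `frame_len` samples, advanced by `hop`."""
    energies = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        energies.append(sum(x * x for x in frame))
    return energies


def double_threshold_endpoints(energies, high, low):
    """Return (start_frame, end_frame) pairs, with end_frame inclusive.

    A segment is seeded where the energy exceeds `high`, then extended
    in both directions while the energy stays above `low` (low < high).
    """
    segments = []
    n = len(energies)
    i = 0
    while i < n:
        if energies[i] > high:
            start = i
            while start > 0 and energies[start - 1] > low:
                start -= 1
            end = i
            while end + 1 < n and energies[end + 1] > low:
                end += 1
            segments.append((start, end))
            i = end + 1
        else:
            i += 1
    return segments


# Synthetic stand-ins for IMFs: two noise-like components and one tone
# that is active only between samples 400 and 1199.
N = 1600
noise_a = [0.3 if n % 2 == 0 else -0.3 for n in range(N)]
noise_b = [0.2 if n % 4 < 2 else -0.2 for n in range(N)]
tone = [math.sin(2 * math.pi * n / 20) if 400 <= n < 1200 else 0.0
        for n in range(N)]

clean = reconstruct_from_imfs([noise_a, noise_b, tone], skip=2)
energies = short_time_energy(clean, frame_len=100, hop=100)
segments = double_threshold_endpoints(energies, high=10.0, low=1.0)
print(segments)  # [(4, 11)]
```

The traditional double-threshold method usually combines short-time energy with the zero-crossing rate; the sketch keeps only the energy part so that the two-threshold logic stays easy to follow.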
Keywords/Search Tags: Speaker Recognition, TIMIT Corpus, Empirical Mode Decomposition, Support Vector Machine, Personal Characters