
Speaker Recognition Based On Nonlinear Dynamics And Information Fusion

Posted on: 2006-06-21
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L M Hou
GTID: 1118360185488027
Subject: Communication and Information System
Abstract/Summary:
Speaker recognition, one of the biometric identification technologies, aims to identify a speaker from his or her utterances. It is a highly promising technology for secure access to information services, forensics, speaker tracking, intelligent human-machine interfaces, and similar applications. Speaker recognition is implemented mainly in three phases: feature extraction, model building, and decision. Feature extraction is the first and an important phase of the whole recognition process: without effective features, optimizing the two later phases yields little benefit. The widely used feature extraction methods usually assume that the speech signal is stationary over short time intervals. These features perform well in speaker recognition; however, they show their limitations when it comes to further improving the correct recognition rate and the robustness of a speaker recognition system. In light of this difficulty, the nonlinear features of speech are investigated here, and ways to optimize the performance of the speaker recognition system are discussed. The main work on this subject is as follows: The vocal organs are described and the mechanism of phonation is explained in order to investigate the essence of human speech from its origin. Nonlinear phenomena are found to exist in the course of phonation. The phase space of speech is reconstructed by estimating the delay time from the autocorrelation function and the embedding dimension with the false nearest neighbors method, and chaos in speech is confirmed experimentally by showing that the maximum Lyapunov exponents calculated for 38 Mandarin phonemes are positive. Algorithms for computing the fractal dimensions of speech are studied.
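The phase-space reconstruction step mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: the function names `autocorr_delay` and `delay_embed` are hypothetical, and the rule of taking the first lag at which the normalized autocorrelation drops below 1/e is an assumed, commonly used criterion (the abstract does not say which one was applied). The embedding dimension is passed in by hand here rather than estimated with false nearest neighbors.

```python
import numpy as np

def autocorr_delay(x, threshold=1.0 / np.e):
    """Return the first lag where the normalized autocorrelation
    falls below `threshold` (assumed criterion, not from the source)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    # Keep only non-negative lags of the full autocorrelation.
    acf = np.correlate(x, x, mode="full")[len(x) - 1:]
    acf = acf / acf[0]
    below = np.nonzero(acf < threshold)[0]
    return int(below[0]) if below.size else len(x) - 1

def delay_embed(x, dim, tau):
    """Build delay vectors [x(t), x(t + tau), ..., x(t + (dim - 1) * tau)]."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (dim - 1) * tau
    if n <= 0:
        raise ValueError("signal too short for this dim and tau")
    return np.stack([x[i * tau : i * tau + n] for i in range(dim)], axis=1)

# Usage on a synthetic tone standing in for one speech frame.
signal = np.sin(2 * np.pi * np.arange(1000) / 100.0)
tau = autocorr_delay(signal)
vectors = delay_embed(signal, dim=3, tau=tau)
```

Each row of `vectors` is one point of the reconstructed trajectory; quantities such as the maximum Lyapunov exponent or the fractal dimensions would then be computed on these points.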
The correlation integral (GP) algorithm proposed by Grassberger and Procaccia is used to calculate the correlation dimension and the Kolmogorov entropy of speech signals. Moreover, using the dimension D2 derived from the correlation integral, the generalized dimension Dq of arbitrary order q is calculated. Because they provide more information than any individual dimension, the generalized dimensions of speech signals are used for speaker identification. The generalized dimensions are decorrelated, and the Mahalanobis distance is used for discrimination in text-independent experiments with 48 speakers. When the differential generalized dimensions are added to the generalized dimensions as identification features, the accuracy rate increases. Experimental results indicate the usefulness of fractal dimensions in characterizing a speaker's identity. The generalized dimension features can achieve higher accuracy with frame-length...
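As an illustration of the correlation integral idea described above, the following Python sketch estimates a correlation dimension as the slope of log C(r) against log r, where C(r) is the fraction of point pairs closer than r. It is a sketch under stated assumptions, not the dissertation's code: the helper names are hypothetical, the Euclidean norm and the choice of radii are assumptions, and the slope is fitted over all supplied radii rather than over a carefully selected scaling region.

```python
import numpy as np

def pairwise_distances(points):
    """Euclidean distances between all distinct pairs of points."""
    points = np.asarray(points, dtype=float)
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return dist[np.triu_indices(len(points), k=1)]

def correlation_sum(points, r):
    """C(r): fraction of distinct point pairs closer than r."""
    return float(np.mean(pairwise_distances(points) < r))

def correlation_dimension(points, radii):
    """Least-squares slope of log C(r) versus log r over `radii`."""
    d = pairwise_distances(points)
    radii = np.asarray(radii, dtype=float)
    c = np.array([np.mean(d < r) for r in radii])
    keep = c > 0  # log is undefined where no pair falls within r
    slope, _ = np.polyfit(np.log(radii[keep]), np.log(c[keep]), 1)
    return float(slope)

# Usage: points on a straight line should give a dimension close to 1.
t = np.linspace(0.0, 1.0, 400)
line = np.stack([t, 2.0 * t], axis=1)
d2 = correlation_dimension(line, np.logspace(np.log10(0.05), np.log10(0.3), 5))
```

For speech, `points` would be the delay vectors of a reconstructed phase space, and the all-pairs distance matrix limits this sketch to frames of modest length.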
Keywords/Search Tags:speaker recognition, nonlinear feature, Lyapunov exponent, entropy, fractal dimension, information fusion