Speaker Recognition Based On Nonlinear Dynamics And Information Fusion

Posted on:2006-06-21

Degree:Doctor

Type:Dissertation

Country:China

Candidate:L M Hou

Full Text:PDF

GTID:1118360185488027

Subject:Communication and Information System

Abstract/Summary:

Speaker recognition, a biometric identification technology, aims to determine a speaker's identity from his or her utterances. It is a promising technology for secure access to information services, forensics, speaker tracking, and intelligent human-machine interfaces. Speaker recognition is implemented in three main phases: feature extraction, model building, and decision. Feature extraction is the first and a critical phase of the whole recognition process: without effective features, optimizing the two later phases yields little benefit. The widely used feature extraction methods usually assume that the short-time speech signal is stationary. Such features perform well in speaker recognition, but they limit further improvement of the recognition accuracy and robustness of a speaker recognition system. To address this difficulty, this dissertation investigates the nonlinear features of speech and discusses how to optimize the performance of a speaker recognition system. The main work is as follows:

The vocal organs are described and the mechanism of phonation is explained in order to investigate the nature of human speech at its source. Nonlinear phenomena are found to exist in the process of phonation. The phase space of speech is reconstructed by estimating the delay time with the autocorrelation function and the embedding dimension with the false nearest neighbors method. The presence of chaos in speech is confirmed experimentally: the maximum Lyapunov exponents calculated for 38 Mandarin phonemes are positive.

Fractal dimension algorithms for speech are studied.
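
The phase-space reconstruction step mentioned in this abstract can be sketched roughly as follows. This is an illustration only, not the dissertation's implementation: the function names, the 1/e autocorrelation threshold used to pick the delay, the fixed embedding dimension of 3, and the synthetic sine "frame" are all assumptions. The abstract does not state which autocorrelation criterion was used, and the false nearest neighbors step for choosing the embedding dimension is omitted here.

```python
import math

def autocorrelation(x, lag):
    """Normalized autocorrelation of the sequence x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[i] - mean) * (x[i + lag] - mean) for i in range(n - lag))
    return cov / var

def estimate_delay(x, max_lag=100, threshold=1 / math.e):
    """First lag where the autocorrelation drops below the threshold.

    The 1/e threshold is an assumed criterion; the first zero crossing
    is another common choice.
    """
    for lag in range(1, max_lag + 1):
        if autocorrelation(x, lag) < threshold:
            return lag
    return max_lag

def delay_embed(x, dim, tau):
    """Delay vectors [x[i], x[i + tau], ..., x[i + (dim - 1) * tau]]."""
    n_vectors = len(x) - (dim - 1) * tau
    return [[x[i + k * tau] for k in range(dim)] for i in range(n_vectors)]

# Synthetic stand-in for a speech frame: a sine with a 40-sample period.
signal = [math.sin(2 * math.pi * i / 40) for i in range(400)]
tau = estimate_delay(signal)
vectors = delay_embed(signal, dim=3, tau=tau)
```

Each element of `vectors` is one point of the reconstructed trajectory. For real speech, the embedding dimension would be chosen with a method such as false nearest neighbors rather than fixed in advance.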
The correlation integral algorithm proposed by Grassberger and Procaccia (the GP algorithm) is used to calculate the correlation dimension and the Kolmogorov entropy of speech signals. Moreover, using the dimension D2 derived from the correlation integral, the generalized dimension Dq of arbitrary order q is calculated. Because they provide more information than any individual dimension, the generalized dimensions of speech signals are used for speaker identification. The generalized dimensions are decorrelated, and the Mahalanobis distance is used as the discriminant in text-independent experiments with 48 speakers. When differential generalized dimensions are added to the generalized dimensions as identification features, the accuracy rate increases. The experimental results indicate that fractal dimensions are useful for characterizing a speaker's identity. The generalized dimension features can achieve higher accuracy with frame-length...
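
As a rough illustration of the Grassberger-Procaccia idea referred to above, the sketch below computes the correlation integral C(r), the fraction of point pairs closer than r, and estimates the correlation dimension D2 as the least-squares slope of log C(r) against log r. The function names, the choice of radii, and the test data (points evenly spaced on a unit circle, a set of dimension 1) are assumptions made for this example; the dissertation's actual procedure, including the Kolmogorov entropy and the generalized dimensions Dq, is not reproduced here.

```python
import math

def correlation_integral(points, r):
    """Fraction of distinct point pairs whose Euclidean distance is below r."""
    n = len(points)
    close = 0
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) < r:
                close += 1
    return 2.0 * close / (n * (n - 1))

def correlation_dimension(points, radii):
    """Least-squares slope of log C(r) versus log r over the given radii.

    Every radius must be large enough that C(r) > 0, otherwise log fails.
    """
    xs = [math.log(r) for r in radii]
    ys = [math.log(correlation_integral(points, r)) for r in radii]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

# Illustrative data: 400 points evenly spaced on a unit circle (a 1-D curve).
n_points = 400
circle = [
    (math.cos(2 * math.pi * k / n_points), math.sin(2 * math.pi * k / n_points))
    for k in range(n_points)
]
d2 = correlation_dimension(circle, radii=[0.05, 0.1, 0.2, 0.4])
```

For this circle the estimate `d2` comes out close to 1, as expected for a curve. On real speech, the points would be the delay vectors of a reconstructed phase space, and the slope would be read from the scaling region of the log-log plot.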

Keywords/Search Tags:

speaker recognition, nonlinear feature, Lyapunov exponent, entropy, fractal dimension, information fusion

Related items

1	Research On Speaker Recognition In Noisy Environment
2	Research On Feature Extraction Algorithm In Speaker Recognition
3	Research On Underwater Target Recognition
4	Continuous Natural Voice Text-independent Speaker Recognition And Dsp-based
5	Multi-speaker Recognition Based On Audio-video Feature Fusion In Smart Environment
6	Research On Leaves Recognition Based On Fractal Dimension
7	Research Of Least Square Acoustic Impedance Inversion Arithmetic
8	Experimental Study Of The Correlation Dimension And Lyapunov Exponent Of System Vibration Transducer. Bonding Process
9	Handprint Recognition Based On Cpd And Feature-level Fusion
10	The Research And Application Of Text-Independent Speaker Recognition Technology