The Research Of Speaker Recognition Based On Mutual Information Theory

Posted on: 2005-11-02
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y B Yu
Full Text: PDF
GTID: 1118360155960309
Subject: Communication and Information System
Abstract/Summary:
Speaker recognition, a branch of biometric identification research, aims to identify living persons from their voices. It is useful in person authentication, forensics, speaker tracking, and similar applications. Many scientists and engineers have contributed to this challenging research, but many problems remain open, such as speaker model optimization and adaptation, feature selection and detection, and pattern measurement and matching. This thesis proposes a new approach to the speaker recognition problem based on mutual information theory. It focuses on mutual information estimation of speech signals, the speaker model and pattern matching scheme, and performance evaluation and analysis in comparison with Gaussian-based methods. The main research work and achievements are as follows.

Previous work and results in speaker recognition research, together with its fundamental principles, are introduced, discussed, and analyzed. Based on mutual information theory and an analysis of the statistical distribution and stochastic properties of speech signals, a mutual information estimation method is derived by defining a random interference signal that describes the distortion between speech signals. Two practical calculation algorithms are proposed: the Linear Projection Matching (PLM) algorithm and the Non-Linear search Matching (NLM) algorithm. These algorithms handle both time-varying features and statistical distribution features well, which makes the proposed method more fine-grained and robust than the traditional VQ and GMM methods, which handle neither of the two kinds of feature.

Two speaker models are proposed: the multi-template model (MTM) for text-dependent speaker recognition and the complete feature corpus model (CFC) for text-independent speaker recognition. MTM represents the central templates of a speaker's text-dependent voice in the pattern space. CFC is designed as an adequate description of a speaker's phonetic and pronunciation properties and, in practice, is trained by a clustering algorithm in the feature vector space with sufficient samples.

The text-independent speaker recognition scheme integrates CFC with a matching algorithm, the Multi-step Mini-max Search (MMS) algorithm. The MMS algorithm matches the input speech against the CFC speaker model sequentially in distance space and in information space, using minimum distance and maximum mutual information...
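For readers unfamiliar with the quantity the abstract builds on, the following is a minimal sketch of a generic histogram (plug-in) estimate of the mutual information I(X;Y) = sum over (x, y) of p(x,y) log2( p(x,y) / (p(x) p(y)) ) between two paired one-dimensional feature sequences. It is not the estimator derived in the thesis, which is based on a random interference signal and is not specified in this abstract. The function names `quantize` and `mutual_information`, the `bins` parameter, and the equal-width binning are all assumptions made for illustration.

```python
import math
from collections import Counter


def quantize(values, bins, lo, hi):
    """Map each real value to an equal-width bin index in [0, bins - 1]."""
    width = (hi - lo) / bins
    indices = []
    for v in values:
        idx = int((v - lo) / width)
        # Clamp so that the maximum value falls into the last bin.
        indices.append(min(max(idx, 0), bins - 1))
    return indices


def mutual_information(xs, ys, bins=8):
    """Plug-in estimate of I(X;Y) in bits from paired samples.

    Hypothetical helper for illustration only; the binning scheme is an
    assumption and not taken from the thesis.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and equally long")
    lo = min(min(xs), min(ys))
    hi = max(max(xs), max(ys))
    if hi == lo:
        # Constant signals carry no information about each other.
        return 0.0
    qx = quantize(xs, bins, lo, hi)
    qy = quantize(ys, bins, lo, hi)
    n = len(xs)
    joint = Counter(zip(qx, qy))  # counts of (x-bin, y-bin) pairs
    px = Counter(qx)              # marginal counts for x
    py = Counter(qy)              # marginal counts for y
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        # p(a,b) / (p(a) p(b)) expressed with raw counts.
        ratio = count * n / (px[a] * py[b])
        mi += p_ab * math.log2(ratio)
    return mi
```

With this sketch, a sequence compared against itself yields its own (binned) entropy, while two sequences whose bins co-occur independently yield zero. A matching scheme that seeks maximum mutual information would prefer the candidate with the larger estimate.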
Keywords/Search Tags: Speaker recognition, Mutual information, Matching, Linguistic property, Individual property