
Digital Waveguide Model And Its Application In Speaker Identification

Posted on: 2012-09-29
Degree: Master
Type: Thesis
Country: China
Candidate: Z Y Yan
GTID: 2218330368492565
Subject: Communication and Information System
Abstract/Summary:
The digital waveguide is an acoustic model that describes the human speech organs precisely. It was first used in speech synthesis, where it makes more natural-sounding speech possible. Estimating the waveguide parameters from speech signals is both a focus and a difficulty of current research. The waveguide parameters carry information about the content of the speech and the individual characteristics of the speaker, so estimating them precisely from speech signals is important for speech recognition and speaker identification. However, current research on these parameters still has many problems. For instance, a camera may be needed to assist the experiment, and some unrealistic conditions may have to be assumed. With the application of MRI in recent years, researchers can observe the human throat and vocal tract more clearly during articulation, but this is difficult to carry out in practice. To address these problems, this thesis does the following work.

First, the thesis gives a detailed introduction to the 1-D digital waveguide network based on the acoustic model. It improves on the traditional fixed-length vocal tract and obtains speech from a flexible-length vocal tract, which enriches speech synthesis theory.

Second, the thesis introduces the concept of GVTF and compares it with the traditional VTF. It presents an interpretation algorithm to extract the GVTF, the VTF and the glottal waveform from vowels, and compares the results with the LPC parameters. For the first time, the thesis proposes a method for extracting the GVTF parameters automatically by computer, and it analyzes the feasibility of using these parameters to identify the speaker. The experiment suggests that the GVTF parameters can better reflect the individual characteristics of a speaker.

Third, the thesis verifies the validity of the extracted parameters. For the first time, the extracted GVTF parameters are used for speaker identification. Under the same GMM identification model, they are compared with the traditional MFCC parameters. The experiment shows that, when the communication channel changes, the GVTF parameters yield a higher identification rate than the MFCC parameters.
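
The GMM identification step mentioned above can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes diagonal-covariance mixtures, one already-trained model per speaker, and a decision rule that picks the speaker whose model gives the highest average log-likelihood over the feature frames. The function names, the toy models and the toy frames are hypothetical. The feature vectors could be GVTF or MFCC parameters, since the scoring does not depend on which is used.

```python
import math


def gmm_log_likelihood(frame, weights, means, variances):
    """Log-likelihood of one feature vector under a diagonal-covariance GMM."""
    component_logs = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for x, m, v in zip(frame, mu, var):
            # Log of a 1-D Gaussian density, summed over feature dimensions.
            log_p += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
        component_logs.append(log_p)
    # Log-sum-exp over components for numerical stability.
    peak = max(component_logs)
    return peak + math.log(sum(math.exp(c - peak) for c in component_logs))


def identify_speaker(frames, speaker_models):
    """Return the best-scoring speaker and the average log-likelihood per speaker."""
    scores = {}
    for name, (weights, means, variances) in speaker_models.items():
        total = sum(gmm_log_likelihood(f, weights, means, variances) for f in frames)
        scores[name] = total / len(frames)
    return max(scores, key=scores.get), scores


# Hypothetical toy models: (weights, means, variances) for two speakers.
toy_models = {
    "speaker_a": ([0.5, 0.5], [[0.0, 0.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]),
    "speaker_b": ([0.5, 0.5], [[5.0, 5.0], [6.0, 6.0]], [[1.0, 1.0], [1.0, 1.0]]),
}
# Hypothetical 2-D feature frames standing in for real GVTF or MFCC vectors.
toy_frames = [[0.2, -0.1], [0.9, 1.1], [0.5, 0.4]]

best, scores = identify_speaker(toy_frames, toy_models)
print(best, scores)
```

Using the same scoring routine for both feature types keeps the classifier fixed, so any difference in identification rate can be attributed to the features, which is the kind of comparison the abstract describes.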
Keywords/Search Tags: digital waveguide models, speaker identification, glottal waveform, vocal tract filters, Gaussian mixture models