
Research on Speaker Adaptation Based on Keyword Spotting

Posted on: 2005-02-25
Degree: Master
Type: Thesis
Country: China
Candidate: J F Liu
Full Text: PDF
GTID: 2208360152965027
Subject: Signal and Information Processing
Abstract/Summary:
Speaker-independent speech recognition systems have made great progress in recent years. However, recognition performance degrades rapidly when the testing conditions do not match the training conditions because of differences in speakers and environments. Adaptive techniques are therefore critically important for overcoming these mismatches and bringing speech recognition into practical use. This thesis took speaker adaptation as its research subject and examined it from three angles: speaker normalization, model parameter adaptation, and speaker clustering, that is, from the perspectives of feature extraction, model adjustment, and the theory of aggregates. Speaker normalization consists of cepstral mean normalization (CMN) and vocal tract length normalization (VTLN). Experiments showed that CMN is easy to implement and that it not only reduces differences between speakers but also removes the effects of different channels. The VTLN method estimates the average third formant to compute the frequency warping factor and performs the frequency warping with a linear, non-linear, or bilinear transform. When female speakers' features were normalized to male speakers' features, the keyword spotting rate increased by more than 12.59% when female speech was recognized with male templates. For model parameter adaptation, this thesis developed a structural adaptation algorithm that combines Maximum a Posteriori (MAP) estimation with Maximum Likelihood Linear Regression (MLLR) and is based on binary-tree regression classes. This method exploits the strengths of both MAP and MLLR. In addition, this thesis introduced a GMM-based speaker clustering method that requires less speech data and clusters faster.
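As an illustration of why CMN can remove channel effects, the following is a minimal sketch of per-utterance cepstral mean normalization. It is not taken from the thesis: the function name `cepstral_mean_normalize` and the toy feature values are assumptions, and the channel is modeled simply as a constant offset on every cepstral frame.

```python
from typing import List


def cepstral_mean_normalize(frames: List[List[float]]) -> List[List[float]]:
    """Subtract the per-utterance mean of each cepstral coefficient.

    `frames` is a list of feature vectors, one per frame (hypothetical layout).
    """
    if not frames:
        return []
    n_frames = len(frames)
    n_coeffs = len(frames[0])
    # Mean of each coefficient over all frames of the utterance.
    means = [sum(f[k] for f in frames) / n_frames for k in range(n_coeffs)]
    return [[f[k] - means[k] for k in range(n_coeffs)] for f in frames]


# Toy data (made up): the same utterance with and without a constant
# cepstral offset, standing in for a fixed channel effect.
clean = [[1.0, 2.0], [3.0, 0.0], [2.0, 4.0]]
shifted = [[c + 5.0 for c in frame] for frame in clean]

norm_clean = cepstral_mean_normalize(clean)
norm_shifted = cepstral_mean_normalize(shifted)
```

Under this assumption the two normalized feature sequences coincide, because a constant offset changes only the mean that CMN subtracts.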
The thesis also studied how to compute the distance between Gaussian mixture models and proposed two new model distance measures: mixture weight distance and probability distance. Both are simple to compute, yet the experimental results were very good. In implementing the keyword spotting system, this thesis developed a robust speaker adaptation module by combining the three speaker adaptation methods above. The system also incorporated noise reduction, voice activity detection, and a rejection method based on support vector machines, which further improved the robustness of the keyword spotting system. Finally, the thesis presented its conclusions and directions for future research.
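The MAP component of the model parameter adaptation mentioned in the abstract can be illustrated with the standard MAP update of a single Gaussian mean. This is a minimal sketch only, not the structural MAP/MLLR algorithm of the thesis: the function name `map_adapt_mean`, the prior weight `tau`, and the toy data are assumptions made for illustration.

```python
from typing import List


def map_adapt_mean(
    prior_mean: List[float],
    frames: List[List[float]],
    posteriors: List[float],
    tau: float = 10.0,
) -> List[float]:
    """Standard MAP re-estimate of one Gaussian mean.

    prior_mean : speaker-independent mean (the prior)
    frames     : adaptation feature vectors
    posteriors : occupation probability of this Gaussian for each frame
    tau        : prior weight (hypothetical value; tuned in practice)
    """
    dim = len(prior_mean)
    occupancy = sum(posteriors)
    # Posterior-weighted sum of the adaptation frames, per dimension.
    weighted_sum = [
        sum(g * x[d] for g, x in zip(posteriors, frames)) for d in range(dim)
    ]
    return [
        (tau * prior_mean[d] + weighted_sum[d]) / (tau + occupancy)
        for d in range(dim)
    ]


# Toy data (made up): two frames fully assigned to this Gaussian.
adapted = map_adapt_mean([0.0], [[2.0], [2.0]], [1.0, 1.0], tau=2.0)
unchanged = map_adapt_mean([0.5, -1.0], [], [], tau=2.0)
```

With little adaptation data the estimate stays close to the prior mean, and with more data it moves toward the data mean. MLLR, by contrast, ties many Gaussians to a shared linear transform, which is one reason the two methods are often combined.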
Keywords/Search Tags:Speaker Adaptation, Speaker Normalization, Speaker Clustering, MAP, MLLR