
Research On Speaker Recognition Based On Vector Quantization (VQ)

Posted on: 2016-09-16
Degree: Master
Type: Thesis
Country: China
Candidate: Y J Zhang
GTID: 2208330461978136
Subject: Computer technology
Abstract/Summary:
Speaker recognition is an important research branch of the speech recognition field. Identification relies on feature parameters extracted to reflect the individual characteristics of each speaker. The speaker recognition process comprises pre-processing of the speech signal, feature extraction, modeling, and model matching. This thesis carried out the following research.

Speech enhancement: because noise degrades the performance of a speaker recognition system, the thesis focuses on FastICA based on negentropy. FastICA was combined with STSA-MMSE and used for front-end speech enhancement. Experimental results show that the combination noticeably enhances the speech.

Endpoint detection: the thesis studied traditional endpoint detection based on dual thresholds and cepstral distance. Building on these methods, it proposed an improved cepstral-distance endpoint detection algorithm. Comparative experiments show that the improved method achieves better results.

Feature extraction: the cepstral features and the pitch of the speech signal were combined as the feature parameters of the speaker recognition system. However, if these feature parameters are concatenated directly, the amount of computation increases, and both training time and recognition time increase with it. The Fisher criterion was therefore used to select feature dimensions. First, the Fisher criterion ratio of each dimension of the feature parameters was calculated. Then, for each feature, the dimensions with the largest Fisher criterion ratios were selected. Among the resulting combined feature sets, the one that achieved the best recognition performance was adopted. Experimental results show that combined features selected by the Fisher criterion remove redundancy and further improve the performance of the speaker recognition system.

The vector quantization (VQ) model is introduced in detail.
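The Fisher-criterion selection described for the feature extraction stage can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the abstract does not give the exact formula, so the sketch assumes one common form of the ratio (variance of the per-speaker means divided by the average within-speaker variance). The function names fisher_ratios and select_dimensions and the parameter k are hypothetical.

```python
import numpy as np


def fisher_ratios(features, labels):
    """Per-dimension Fisher ratio (assumed form):
    variance of speaker means / mean within-speaker variance."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    speakers = np.unique(labels)
    overall_mean = features.mean(axis=0)

    between = np.zeros(features.shape[1])
    within = np.zeros(features.shape[1])
    for s in speakers:
        x = features[labels == s]          # frames belonging to speaker s
        mu = x.mean(axis=0)
        between += (mu - overall_mean) ** 2
        within += x.var(axis=0)
    between /= len(speakers)
    within /= len(speakers)
    # Small constant avoids division by zero for constant dimensions.
    return between / (within + 1e-12)


def select_dimensions(features, labels, k):
    """Return the (sorted) indices of the k dimensions with the largest ratios."""
    ratios = fisher_ratios(features, labels)
    top = np.argsort(ratios)[::-1][:k]
    return np.sort(top)
```

In use, the selected indices would be applied to each feature matrix (for example, features[:, select_dimensions(features, labels, k)]) before the reduced cepstral and pitch features are combined.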
The traditional LBG algorithm is sensitive to outliers, impulse noise, and salt-and-pepper noise during VQ codebook generation. In addition, it simply represents each entire cell by its mean point, which blurs the boundaries between cells. This thesis therefore represents each cell by a real point instead: the training vector nearest to the center of the cell. Experimental results show that the improved LBG method effectively addresses the problems mentioned above.
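The codebook update described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: it uses squared Euclidean distance, starts from a caller-supplied initial codebook, and omits the codeword-splitting step of the full LBG algorithm. The function name refine_codebook_with_real_points and the parameter n_iter are hypothetical.

```python
import numpy as np


def refine_codebook_with_real_points(vectors, codebook, n_iter=20):
    """Iteratively reassign vectors to cells, then represent each cell by the
    training vector closest to the cell mean (instead of the mean itself)."""
    vectors = np.asarray(vectors, dtype=float)
    codebook = np.asarray(codebook, dtype=float).copy()

    for _ in range(n_iter):
        # Squared Euclidean distance from every vector to every codeword.
        dists = ((vectors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        assign = dists.argmin(axis=1)

        new_codebook = codebook.copy()
        for j in range(len(codebook)):
            cell = vectors[assign == j]
            if len(cell) == 0:
                continue  # keep the old codeword for an empty cell
            center = cell.mean(axis=0)
            # Pick the real training vector nearest to the cell center.
            nearest = ((cell - center) ** 2).sum(axis=1).argmin()
            new_codebook[j] = cell[nearest]

        if np.allclose(new_codebook, codebook):
            break  # codebook stopped changing
        codebook = new_codebook

    return codebook
```

Because every codeword is drawn from the training set, the returned codebook contains only vectors that were actually observed.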
Keywords/Search Tags: Speaker Recognition, Speech Enhancement, Endpoint Detection, MFCC, Vector Quantization Model