Research On Speaker Identification Method And System Application Development

Posted on: 2005-12-04
Degree: Master
Type: Thesis
Country: China
Candidate: H H Xu
Full Text: PDF
GTID: 2168360152967569
Subject: Engineering Mechanics
Abstract/Summary:
This thesis discusses the application of speaker identification in depth and examines three methods: Vector Quantization (VQ), Hidden Markov Models (HMM) and Gaussian Mixture Models (GMM). In a laboratory environment, under text-independent, closed-set conditions, the three methods were tested on recordings of 26 speakers, and all three achieved a 100% correct identification rate. For feature extraction, three feature sets were tested: the 16th-order LPC cepstrum, the 12th-order Mel-frequency cepstrum, and a mixed feature set combining the 12th-order LPC cepstrum, pitch period and short-time normalized frame energy. The results met expectations.

In the VQ method, the LBG clustering algorithm is implemented with two different schemes for choosing the initial codewords, and both perform much better than random codeword selection. An optimized method for handling empty Voronoi cells is also proposed and used. The HMM implementation is based on a 5-state transition structure, an observation length of 32, and continuous probability density functions. The states, which are ergodic, are assigned by segmenting the frames according to normalized frame energy. The GMM implementation uses a mixture of 32 orthogonal Gaussian density functions, and the LBG algorithm is used in the initialization step.

To improve the response speed and performance of the system, different short-time frame lengths were tested. The experiments show that frame length strongly affects both response speed and recognition performance. A frame length of 10 to 30 ms is generally regarded as appropriate. At a sampling rate of 11.025 kHz, several frame lengths around this range were tried, and the system achieved its best response speed and correct recognition rate at a frame length of 46 ms.
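The VQ approach described above can be illustrated with a minimal sketch. It assumes binary-splitting LBG, squared Euclidean distortion, and one simple way of re-seeding an empty Voronoi cell (from the outlier of the most populated cell). The thesis's own initialization schemes and its optimized empty-cell method are not given in the abstract, so every function name and parameter below is hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(vectors):
    """Component-wise mean (centroid) of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def nearest(codebook, v):
    """Index of the codeword closest to v."""
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i], v))

def lbg(vectors, size, eps=0.01, iters=20):
    """Train a codebook by binary splitting; `size` should be a power of two."""
    codebook = [mean(vectors)]
    while len(codebook) < size:
        # Split every codeword into two slightly perturbed copies.
        codebook = [[c + s * eps for c in cw] for cw in codebook for s in (1, -1)]
        for _ in range(iters):
            # Assign each training vector to its nearest codeword (Voronoi cell).
            cells = [[] for _ in codebook]
            for v in vectors:
                cells[nearest(codebook, v)].append(v)
            for i, cell in enumerate(cells):
                if cell:
                    codebook[i] = mean(cell)
                else:
                    # Assumed empty-cell rule: re-seed from the vector farthest
                    # from the centroid of the most populated cell.
                    big = max(cells, key=len)
                    centre = mean(big)
                    codebook[i] = list(max(big, key=lambda v: sq_dist(v, centre)))
    return codebook

def avg_distortion(codebook, vectors):
    """Average distortion of test vectors quantized with a speaker's codebook."""
    return sum(sq_dist(codebook[nearest(codebook, v)], v) for v in vectors) / len(vectors)

def identify(models, vectors):
    """Closed-set decision: the enrolled speaker with the lowest distortion."""
    return min(models, key=lambda name: avg_distortion(models[name], vectors))

# Toy 2-D "features" standing in for cepstral vectors of two enrolled speakers.
speaker_a = [[0.0, 0.0], [0.2, 0.1], [0.1, 0.3], [1.0, 1.0], [1.1, 0.9], [0.9, 1.2]]
speaker_b = [[5.0, 5.0], [5.2, 5.1], [5.1, 4.8], [6.0, 6.0], [6.1, 5.9], [5.9, 6.2]]
models = {"A": lbg(speaker_a, 2), "B": lbg(speaker_b, 2)}
result = identify(models, [[0.1, 0.1], [1.0, 1.1]])
```

In a real system the toy vectors would be replaced by per-frame features such as the LPC or Mel-frequency cepstra mentioned above, with one codebook trained per enrolled speaker.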
The system was developed in the VC++ 6.0 environment. Voice sampling is implemented with the low-level wave processing functions, which makes it convenient to set the sampling rate and allows the voice to be processed in real time. The system also supports database queries by voice. Although the system is built for closed-set speaker identification, it is readily extensible: with the relevant experiments and some additional program code, open-set speaker identification is feasible.
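One common way to move from a closed-set to an open-set decision is to add a rejection threshold on the best match score. The sketch below is only an illustration of that idea, not the extension the thesis has in mind. The function name, the score dictionary and the threshold value are all assumptions, and a usable threshold would have to be set experimentally.

```python
def decide(distortions, threshold=None):
    """Pick the enrolled speaker with the lowest distortion score.

    distortions: mapping from speaker name to a match score (lower is better).
    threshold:   if given, reject the best match when its score exceeds it,
                 returning None to mean "unknown speaker" (open-set mode).
    """
    best = min(distortions, key=distortions.get)
    if threshold is not None and distortions[best] > threshold:
        return None
    return best

# Closed-set: some enrolled speaker is always returned.
closed = decide({"A": 2.5, "B": 3.1})
# Open-set: the same scores are rejected under a hypothetical threshold of 1.0.
opened = decide({"A": 2.5, "B": 3.1}, threshold=1.0)
```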
Keywords/Search Tags: Speaker Identification, VQ, HMM, GMM