Speaker Recognition Based On SOINN And Incremental Gaussian Mixture Model

Posted on: 2014-02-20
Degree: Master
Type: Thesis
Country: China
Candidate: Z L Tang
Full Text: PDF
GTID: 2248330395995514
Subject: Computer application technology
Abstract/Summary:
Gaussian mixture models (GMMs) have been widely and successfully used in speaker recognition over the last decades. To deal with dynamically growing data sets, the initialization problem of GMMs, and the need to obtain high recognition rates from a small amount of training data, this thesis proposes an incremental, adaptive method called the incremental learning Gaussian mixture model (IGMM). It is applied to a speaker recognition system based on SOINN (self-organizing incremental learning neural network) and an improved EM algorithm. SOINN is a neural network that can provide a suitable number of mixture components and appropriate initial clusters for each model. First, initial training is carried out with SOINN and the EM algorithm to form an initial GMM. The model then adapts to the data available in each session, enriching itself incrementally and recursively. Experiments are conducted on the first Speech Separation Challenge database. The results show that IGMM outperforms GMM in most cases.

At the system level, we implement a complete speaker recognition system, including voice pre-processing, feature extraction, training, incremental learning, and recognition modules. Voice activity detection (VAD) is used in the pre-processing stage to remove silent segments. Mel-frequency cepstral coefficients (MFCCs) are used as the feature coefficients in feature extraction, and the self-organizing incremental learning neural network and the Gaussian mixture model are used as the modeling methods in the training phase.

This thesis makes the following contributions to the self-adaptivity, incremental learning, and robustness of the speaker model:

(1) A voice activity detection method is used in the pre-processing stage to remove silent segments from the speech, leaving only voiced segments for training and identification. This effectively improves the accuracy and robustness of the voice features and helps the system achieve a high recognition rate. We also use dynamic coefficients of the Mel cepstral coefficients, because they provide noise robustness at the coefficient level.

(2) In the training stage, using the self-organizing incremental learning neural network instead of the K-means method gives the model better adaptability and accuracy. It also overcomes the disadvantage of K-means, which requires the number of mixture components to be defined in advance. This makes the system more adaptive and easier to generalize.

(3) Once training is complete, the incremental learning module provides an adaptive incremental learning method that better meets the system's incremental learning requirements, giving the system the ability to handle dynamically growing data sets.
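
The abstract does not state the update rule used by the incremental learning module in contribution (3). As a rough illustration only, one common way to let a trained GMM absorb a new session without revisiting old data is to treat the stored model as a summary of the frames seen so far and merge its sufficient statistics with those of the new frames. The sketch below assumes diagonal covariances and NumPy. The names `incremental_update`, `log_gauss_diag`, `n_seen`, and `var_floor` are hypothetical and not taken from the thesis, and this is not claimed to be the author's improved EM algorithm.

```python
import numpy as np


def log_gauss_diag(X, means, variances):
    """Log density of each row of X under each diagonal-covariance Gaussian."""
    # X: (n, d); means, variances: (k, d); result: (n, k)
    diff = X[:, None, :] - means[None, :, :]
    return -0.5 * (
        np.sum(np.log(2.0 * np.pi * variances), axis=1)[None, :]
        + np.sum(diff ** 2 / variances[None, :, :], axis=2)
    )


def incremental_update(weights, means, variances, n_seen, X_new, var_floor=1e-6):
    """Run one E-step on X_new, then merge its statistics into the stored model.

    n_seen is the number of frames the current model already summarises
    (an assumption of this sketch; the thesis may track this differently).
    Returns the updated (weights, means, variances, n_seen).
    """
    # E-step: responsibility of each component for each new frame.
    log_p = np.log(weights)[None, :] + log_gauss_diag(X_new, means, variances)
    log_p -= log_p.max(axis=1, keepdims=True)  # avoid underflow in exp
    resp = np.exp(log_p)
    resp /= resp.sum(axis=1, keepdims=True)

    # Sufficient statistics implied by the old model:
    # soft counts, weighted sums, and weighted sums of squares.
    n_old = n_seen * weights
    s1_old = n_old[:, None] * means
    s2_old = n_old[:, None] * (variances + means ** 2)

    # Sufficient statistics of the new session.
    n_new = resp.sum(axis=0)
    s1_new = resp.T @ X_new
    s2_new = resp.T @ (X_new ** 2)

    # M-step on the merged statistics.
    n_tot = np.maximum(n_old + n_new, 1e-12)
    new_means = (s1_old + s1_new) / n_tot[:, None]
    new_vars = (s2_old + s2_new) / n_tot[:, None] - new_means ** 2
    new_vars = np.maximum(new_vars, var_floor)  # keep variances positive
    total = n_seen + X_new.shape[0]
    new_weights = (n_old + n_new) / total
    return new_weights, new_means, new_vars, total
```

Calling `incremental_update` once per session keeps memory use independent of how much data has been seen, because only the model parameters and the frame count are stored. In this sketch the number of components is fixed after initialization, so it does not cover adding new components as data grows.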
Keywords/Search Tags:Incremental Learning, Speaker Recognition, GMM, SOINN