
Research On Text-independent Speaker Verification Based On Deep Learning

Posted on: 2017-04-21  Degree: Master  Type: Thesis
Country: China  Candidate: M H Wu  Full Text: PDF
GTID: 2308330485954831  Subject: Circuits and Systems
Abstract/Summary:
With the continuous development of Internet technology, the telephone has become an indispensable part of daily life, and speaker recognition has attracted increasing attention. Speaker recognition requires only simple, inexpensive equipment, and its authentication is contactless. Compared with other identity authentication technologies, it is therefore likely to be applied more widely. Speaker recognition is an important research direction in speech signal processing, so research on it has both theoretical significance and practical value.

Probabilistic models are the most popular speaker models in speaker verification research, and the Gaussian mixture model (GMM) is the most representative of them. A GMM describes the distribution of speech features well and therefore performs well. However, speaker verification is a classical binary classification problem, and a statistical probability model cannot accurately describe the decision boundary. We therefore need a speaker model with both strong representational power and strong discriminative power. In recent years, deep learning has become a hot research topic. It simulates human thinking by building a network known as a deep neural network (DNN). By stacking many traditional neural networks, a DNN gains powerful representational ability, and it has been applied successfully in many areas of pattern recognition. The main content of this thesis is to build speaker verification systems based on DNNs. We investigate the choice of input features and network structure, and compare their performance in speaker verification.

First, we introduce the Gaussian mixture model and explore how it is applied to speaker verification. Speaker verification based on GMM-UBM (a GMM with a universal background model) is currently the mainstream method. We also introduce the MAP (maximum a posteriori) algorithm used to train the speaker model, and analyze experimentally the performance of GMM-UBM with different numbers of mixture components. The baseline system in this thesis is based on GMM-UBM.

Next, we discuss neural networks and introduce the history of deep neural networks. We also focus on the problems that arise in training deep neural networks and propose solutions. Because the representational power of the speaker verification system based on GMM-UBM is very poor, we construct a speaker verification system based on DNN-SPK. To reduce the semantic information in the speech signal, we use a Gaussian mixture model to extract statistical parameters from the original features. We then construct a speaker verification system based on GMM-DNN, which significantly improves performance.

Finally, because the DNN model cannot process long signal histories, we introduce the LSTM model, which can retain information over a very long history, and construct a speaker verification system based on LSTM-SPK. We investigate the choice of input features and output labels. Because the computational complexity of LSTM is very high, we construct a speaker verification system based on LSTMP, which adds a projection layer to the LSTM model. The speaker verification system based on LSTMP-SPK achieves a significant improvement in performance.
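
The GMM-UBM baseline with MAP-trained speaker models, as mentioned in the abstract, can be illustrated with a minimal NumPy sketch. This is not the thesis's implementation: the abstract gives no configuration, so everything here is an assumption, including diagonal covariances, adapting only the component means, the relevance factor of 16, and all function names (`map_adapt_means`, `llr_score`, and so on), which are hypothetical. A trial is scored by the average log-likelihood ratio between the adapted speaker model and the UBM.

```python
import numpy as np


def log_gauss_diag(X, means, variances):
    """Log density of every frame under every diagonal Gaussian; shape (T, C)."""
    diff = X[:, None, :] - means[None, :, :]                  # (T, C, D)
    maha = np.sum(diff ** 2 / variances[None, :, :], axis=2)  # (T, C)
    log_det = np.sum(np.log(variances), axis=1)               # (C,)
    dim = X.shape[1]
    return -0.5 * (dim * np.log(2.0 * np.pi) + log_det[None, :] + maha)


def logsumexp(a, axis):
    """Numerically stable log(sum(exp(a))) along one axis."""
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))


def avg_log_likelihood(X, weights, means, variances):
    """Average per-frame log-likelihood of X under a GMM."""
    log_p = log_gauss_diag(X, means, variances) + np.log(weights)[None, :]
    return float(np.mean(logsumexp(log_p, axis=1)))


def map_adapt_means(X, weights, means, variances, relevance=16.0):
    """MAP-adapt the UBM means to enrollment frames X (means-only, an assumption)."""
    log_p = log_gauss_diag(X, means, variances) + np.log(weights)[None, :]
    post = np.exp(log_p - logsumexp(log_p, axis=1)[:, None])  # responsibilities (T, C)
    n = post.sum(axis=0)                                      # soft counts (C,)
    first_order = post.T @ X                                  # (C, D)
    data_means = first_order / np.maximum(n, 1e-10)[:, None]
    alpha = (n / (n + relevance))[:, None]                    # data-dependent mixing
    return alpha * data_means + (1.0 - alpha) * means


def llr_score(X, ubm, speaker_means):
    """Average log-likelihood ratio: speaker model versus UBM."""
    weights, means, variances = ubm
    return (avg_log_likelihood(X, weights, speaker_means, variances)
            - avg_log_likelihood(X, weights, means, variances))
```

In use, `ubm` would be a tuple `(weights, means, variances)` trained on background speech, `map_adapt_means` would be run once per enrolled speaker, and `llr_score` would be compared against a threshold to accept or reject a trial. Components that see little enrollment data get a small `alpha` and stay close to the UBM, which is what makes the adapted model usable with short enrollment utterances.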
Keywords/Search Tags: Speaker Verification, Gaussian Mixture Model, Deep Learning, Deep Neural Network, Long Short Term Memory