Research And Implementation Of Speaker Recognition Based On Deep Learning

Posted on:2020-10-30

Degree:Master

Type:Thesis

Country:China

Candidate:N Yang

Full Text:PDF

GTID:2428330575453111

Subject:Master of Engineering

Abstract/Summary:

With the rapid development of deep learning, applications of artificial intelligence are increasingly reaching everyday users. As one of the ways humans perceive the world, sound has been extensively studied in the era of intelligent systems. In recent years, the popularity of smart mobile devices has led to the collection of more and more voice data, motivating efforts to extract value from it. Even with the support of big data, speaker recognition still commonly relies on traditional statistical methods, which have limitations: for example, achieving better results requires more precise feature extraction from complex data. A new, more effective method is therefore needed. Deep learning is naturally suited to large amounts of data and is already mature in computer vision and natural language processing. This thesis therefore implements speaker recognition algorithms based on deep learning to identify the identity, age, and gender of the speaker. The main work of this thesis is as follows:

1) A closed-set, text-independent speaker identification algorithm based on the speech spectrogram is proposed. Because the number of speakers to be recognized is fixed in the closed-set setting, the task is formulated as a classification problem: the speech spectrogram is used as the input feature, and a Convolutional Neural Network (CNN) is trained as a multi-class discriminative model to identify the speaker. Compared with the traditional method based on Mel-frequency cepstral coefficients (MFCC) and the Gaussian Mixture Model-Universal Background Model (GMM-UBM), the proposed algorithm achieves higher recognition accuracy and lower computational latency on large public datasets.

2) An open-set, text-independent speaker identification algorithm based on identity coding is proposed. The difference between open-set and closed-set speaker identification is studied. To address the fact that the number of speakers is not fixed in the open-set setting, the method builds on the closed-set, spectrogram-based algorithm: the trained multi-class neural network is used as a feature extractor whose outputs distinguish different speakers for identity recognition. Compared with the traditional method, when the number of registered (enrolled) utterances per person is small, the method is more stable and achieves higher recognition accuracy.

3) To meet the need for speaker age and gender recognition, the same approach of image-like (graph) features and neural networks is retained. The features tried are the spectrogram, Log-Mel Energies, MFCC, Constant-Q Transform (CQT), and Harmonic-Percussive Source Separation (HPSS), and a Recurrent Neural Network (RNN) is added to the model. Comparative experiments were performed on a non-public dataset. Taking the running time of the algorithm into account as well, Log-Mel Energies, which performed better, were selected as the input feature, and an HTTP service was built to provide age and gender recognition. This capability has been embedded in the Kings of Glory intelligent robot products sold by Tencent.
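The open-set identification idea in item 2) can be illustrated with a minimal sketch. The abstract does not say how the identity codes are compared, so the cosine scoring, the averaging of enrollment codes, the rejection threshold of 0.7, and every name below (`enroll`, `identify`, the toy 3-dimensional vectors) are assumptions made for illustration, not the thesis's implementation. In practice the vectors would come from the trained network used as a feature extractor.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def enroll(utterance_embeddings):
    """Average the normalized embeddings of one speaker's registered utterances."""
    normalized = [l2_normalize(e) for e in utterance_embeddings]
    dim = len(normalized[0])
    mean = [sum(e[i] for e in normalized) / len(normalized) for i in range(dim)]
    return l2_normalize(mean)

def cosine(a, b):
    """For unit-length vectors, cosine similarity is the dot product."""
    return sum(x * y for x, y in zip(a, b))

def identify(test_embedding, enrolled, threshold=0.7):
    """Return (speaker_id, score); speaker_id is None if no template is close enough.

    The threshold is a hypothetical value; it would normally be tuned on held-out data.
    """
    query = l2_normalize(test_embedding)
    best_id, best_score = None, -1.0
    for speaker_id, template in enrolled.items():
        score = cosine(query, template)
        if score > best_score:
            best_id, best_score = speaker_id, score
    if best_score < threshold:
        return None, best_score  # open-set rejection: speaker is not registered
    return best_id, best_score

# Toy embeddings standing in for the feature extractor's output (assumed values).
enrolled = {
    "alice": enroll([[0.9, 0.1, 0.0], [1.0, 0.0, 0.1]]),
    "bob": enroll([[0.0, 1.0, 0.1], [0.1, 0.9, 0.0]]),
}
known = identify([0.95, 0.05, 0.05], enrolled)   # close to alice's template
unknown = identify([0.0, 0.1, 1.0], enrolled)    # far from every template
```

Because new speakers are added by storing a template and not by retraining the classifier, the number of registered speakers can change freely, which is the property that separates the open-set setting from the closed-set one.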

Keywords/Search Tags:

Artificial Intelligence, Deep Learning, Speaker Recognition, Neural Network, Harmonic/Percussive Source Separation

Related items

1	Research On Acoustic Scene Classification Using Deep Learning
2	Research On Feature Extraction And Recognition Of Sound Event
3	Short Speech Speaker Recognition Method Based On Deep Learning And Its Application In Speech Separation
4	Design And Research In Speaker Recognition System
5	Research On The Separation Algorithm Of Music Instruments And Singing Voice
6	Research On Sound Source Separation Algorithm Based On Deep Neural Network
7	Research On Separation Of Audio Signal Based On Deep Neural Network
8	Speaker Recognition Based On Swarm Intelligence And Blind Source Separation
9	Speaker-Independent Single-Channel Speech Separation Based On Deep Learning
10	Research On Speaker Recognition Based On Deep Learning