Research On Speaker Recognition Based On Deep Neural Network

Posted on:2021-04-14

Degree:Master

Type:Thesis

Country:China

Candidate:C Y Sun

GTID:2428330614456382

Subject:Bionic Equipment and Control Engineering

Abstract/Summary:

In recent years, speech recognition technology has developed continuously and found more and more applications. Speaker recognition has also received a lot of attention as an important method of identity authentication, and researchers who applied deep learning to it have achieved remarkable results. The main purpose of this thesis is to improve the recognition rate of text-independent, closed-set speaker recognition using deep neural networks.

A large number of experiments have shown that, for speaker recognition based on deep learning, the quality of the speaker feature parameters and of the acoustic model strongly affects the performance of the recognition system. The main work of this thesis is therefore to improve the window function used in the preprocessing stage of feature parameter extraction, and to optimize an existing acoustic model for training and testing. Experiments show that the improved system achieves higher speaker recognition accuracy, which indicates that the method is useful and can serve as a reference for future research.

The thesis first introduces the overall framework of speaker recognition, describes the extraction process of three feature parameters commonly used in speaker recognition, and compares their advantages and disadvantages. Based on an analysis of the MFCC extraction process, and in order to make the feature parameters carry more information about the speaker's voice, an improvement to the Hamming window used in the key step of speech windowing is proposed. Mathematical analysis shows that, compared with the original Hamming window, the newly designed window function adds feature information such as the slope and phase of the speech power spectrum to the extracted MFCC feature parameters. Experiments show that the improved feature parameters make later training more efficient and thereby improve the accuracy of speaker recognition.

Next, the shortcomings of the gated recurrent unit (GRU) neural network are analyzed, and a deep bidirectional gated recurrent unit (BiGRUs) neural network is proposed as the acoustic model for speaker recognition. To address vanishing gradients and overfitting in BiGRUs, the thesis combines the Maxout network with the Dropout regularization algorithm to improve the BiGRUs acoustic model, and proposes the BiGRUs-DM acoustic model. Experimental results show that the improved BiGRUs-DM speaker recognition model outperforms BiGRUs, BiLSTMs and other models and effectively improves speaker recognition performance.

Finally, the improved speaker recognition system is tested and analyzed on the THCHS-30 Chinese corpus and on a self-built corpus. Experimental results show that the speaker recognition system established in this thesis has stronger generalization ability and a higher recognition rate than the traditional RNN-based speaker recognition system.
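As a rough illustration of the building blocks named in the abstract, the NumPy sketch below shows (a) the standard Hamming windowing step that precedes MFCC extraction and (b) a Maxout unit followed by Dropout. It is a minimal sketch under stated assumptions, not the thesis's implementation: the abstract does not give the formula of the improved window, so the ordinary Hamming window is used; the GRU layers themselves are omitted; and all function names, the 400-sample frame, 160-sample hop, 512-point FFT and the number of Maxout pieces are illustrative choices that do not come from the source.

```python
import numpy as np

def frame_signal(signal, frame_len, hop):
    """Split a 1-D signal into overlapping frames (no padding of the tail)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return signal[idx]

def hamming_power_spectrum(signal, frame_len=400, hop=160, n_fft=512):
    """Window each frame with a standard Hamming window, then take |FFT|^2.

    Assumption: 400/160 samples correspond to 25 ms / 10 ms at 16 kHz.
    The thesis's modified window is NOT reproduced here.
    """
    frames = frame_signal(signal, frame_len, hop)
    window = np.hamming(frame_len)  # 0.54 - 0.46 * cos(2*pi*n / (N - 1))
    spectrum = np.fft.rfft(frames * window, n=n_fft, axis=1)
    return (np.abs(spectrum) ** 2) / n_fft

def maxout(x, weights, biases):
    """Maxout: element-wise maximum over k affine maps of the input.

    x: (batch, d_in), weights: (k, d_in, d_out), biases: (k, d_out).
    """
    z = np.einsum("bi,kio->bko", x, weights) + biases[None, :, :]
    return z.max(axis=1)

def dropout(x, rate, rng, training=True):
    """Inverted dropout: zero units with probability `rate`, rescale the rest."""
    if not training or rate == 0.0:
        return x
    keep = 1.0 - rate
    mask = rng.random(x.shape) < keep
    return x * mask / keep

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # One second of a synthetic 440 Hz tone sampled at 16 kHz (placeholder input).
    t = np.arange(16000) / 16000.0
    power = hamming_power_spectrum(np.sin(2 * np.pi * 440.0 * t))
    # Hypothetical layer sizes: 3 Maxout pieces mapping 257 bins to 8 units.
    w = rng.standard_normal((3, power.shape[1], 8)) * 0.01
    b = np.zeros((3, 8))
    hidden = dropout(maxout(power, w, b), rate=0.5, rng=rng)
    print(power.shape, hidden.shape)
```

In a full MFCC pipeline the power spectrum would next pass through a mel filterbank, a logarithm and a discrete cosine transform; in a model along the lines of BiGRUs-DM, the Maxout and Dropout steps would act on the outputs of the bidirectional GRU layers rather than directly on the spectrum as in this toy usage.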

Keywords/Search Tags:

Speaker Recognition, Deep Neural Network, Recurrent Neural Network, MFCC, BiGRUs, Maxout, Dropout

Related items

1	Research On Speaker Recognition Method Based On Fuzzy Neural Network
2	Research On Speaker Recognition Based On Deep Neural Network
3	Research On Identity Recognition Algorithm Based On Speech Features
4	Research On Deep Learning Methods For Use With Speaker Recognition
5	The Study Of Speaker Recognition System Based On MFCC
6	Study On Human Action Recognition In Videos Based On Convolutional Neural Network
7	Application Of Deep Recurrent Neural Networks In Speaker Recognition On Mobile Phones
8	Comparative Research On The Recognition Effect Of Wake-Up Words Based On Deep Learning
9	Speaker Recognition Research Based On Clustering Analysis And Neural Network Ensemble
10	Research On Sensor Activity Recognition Based On Improved Deep Recurrent Neural Network