
Research On Speaker Recognition Based On Deep Neural Network

Posted on: 2021-04-09
Degree: Master
Type: Thesis
Country: China
Candidate: F Wu
GTID: 2428330602976849
Subject: Computer technology
Abstract/Summary:
Speaker recognition, also known as voiceprint recognition, is a technology for determining a speaker's identity from speech. In recent years, with the rapid development of the Internet and the spread of smart mobile devices, authentication technologies such as face recognition, fingerprint recognition, and speaker recognition have found a broad application market. The core of speaker recognition is extracting from speech the information that represents the speaker's identity. Because of their powerful capacity for information extraction and modeling, deep neural networks have been widely used in computer vision, natural language processing, and other fields, and introducing them into speaker recognition has become a research hotspot. The main work of this thesis is as follows. First, a speaker recognition system based on a deep neural network is studied: Mel-frequency cepstral coefficients (MFCC) are used as the speech features, and a speaker recognition system based on a long short-term memory (LSTM) network is built as the baseline system. Second, the influence of network complexity on system performance is studied: because the number of hidden layers and the number of nodes per layer both affect recognition performance, the recognition rates of network structures with different numbers of layers and nodes are compared, and the best-performing structure is selected to optimize the system. Finally, the system is compared with the traditional GMM-UBM speaker recognition system in two respects, varying the speech length and increasing the number of speakers, and the results verify that the LSTM-based system outperforms the traditional model in both cases.
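The abstract names Mel-frequency cepstral coefficients as the input feature but gives no extraction settings. The following is a minimal NumPy sketch of a common MFCC pipeline (pre-emphasis, framing, Hamming window, power spectrum, mel filterbank, log, DCT-II). Every parameter value here (16 kHz sampling, 25 ms frames, 10 ms hop, 512-point FFT, 26 filters, 13 coefficients, 0.97 pre-emphasis) and every function name is an assumption chosen for illustration, not a setting taken from the thesis.

```python
import numpy as np

def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale (illustrative helper)."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for m in range(1, n_filters + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):      # rising edge of the triangle
            fbank[m - 1, k] = (k - left) / (center - left)
        for k in range(center, right):     # falling edge of the triangle
            fbank[m - 1, k] = (right - k) / (right - center)
    return fbank

def mfcc(signal, sample_rate=16000, frame_ms=25, hop_ms=10,
         n_fft=512, n_filters=26, n_ceps=13):
    """Return an array of shape (n_frames, n_ceps). All defaults are assumptions."""
    frame_len = int(sample_rate * frame_ms / 1000)
    hop_len = int(sample_rate * hop_ms / 1000)
    # Pre-emphasis boosts high frequencies before spectral analysis.
    emphasized = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])
    # Slice the signal into overlapping frames and apply a Hamming window.
    n_frames = 1 + (len(emphasized) - frame_len) // hop_len
    idx = np.arange(frame_len)[None, :] + hop_len * np.arange(n_frames)[:, None]
    frames = emphasized[idx] * np.hamming(frame_len)
    # Power spectrum of each frame (zero-padded to n_fft points).
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    # Log mel-filterbank energies; the small constant avoids log(0).
    fbank = mel_filterbank(n_filters, n_fft, sample_rate)
    log_energies = np.log(power @ fbank.T + 1e-10)
    # Unnormalized DCT-II decorrelates the log energies; keep the first n_ceps.
    n = np.arange(n_filters)
    basis = np.cos(np.pi / n_filters * (n[None, :] + 0.5) * np.arange(n_ceps)[:, None])
    return log_energies @ basis.T
```

With these assumed defaults, a one-second signal sampled at 16 kHz yields a feature matrix of 98 frames by 13 coefficients. A sequence model such as the LSTM baseline described above would consume one such row per time step.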
Keywords/Search Tags:speaker recognition, deep neural network, MFCC, LSTM