Research On Voiceprint Recognition Of Multi Speaker Based On Neural Network

Posted on:2021-03-30

Degree:Master

Type:Thesis

Country:China

Candidate:M Gao

GTID:2568306104463324

Subject:Engineering

Abstract/Summary:

With the development of deep learning, voiceprint recognition has become well suited to remote authentication, because voice recordings are simple and fast to obtain. These advantages have made voiceprint authentication increasingly popular with system builders and developers. In real life, however, speech is often mixed with noise or with the speech of other people, which interferes with communication between people. Drawing on the relevant literature, this thesis addresses the problem that existing voiceprint recognition performs poorly on data containing interfering speech: a bidirectional long short-term memory (BLSTM) neural network is proposed to separate the mixed speech, and a convolutional neural network (CNN) is then used to verify the feasibility of this separation method. The work covers the following aspects.

First, the thesis briefly describes the background and significance of the topic, introduces three major milestones in voiceprint recognition research in China and abroad, and gives a brief introduction to speaker separation and clustering techniques.

Second, the speech data are preprocessed, and speech separation is divided into two cases: segmentation of conversational speech using the Bayesian information criterion (BIC) algorithm, and separation of mixed speech. Building on time-frequency masking, the thesis describes a speech separation algorithm in which a BLSTM network maps the speech signal into a feature space.

Third, the points mapped into the feature space are divided into two classes by k-means clustering, and the speech signal segments are synthesized. Error calculation and evaluation criteria are defined, and several groups of comparative experiments are designed to analyze the influence of the number of iterations, the learning rate, and speaker gender on speech separation. The experiments verify the effect of these parameters on the training error, and the optimal parameters are selected to train the separation model so as to minimize the error.

Finally, a CNN is trained on the original speech data to build a model, and the separated and clustered speech data are then used for recognition verification. The recognition rate is compared with that of similar speech separation methods to verify the feasibility of the proposed method for voiceprint recognition of multi-speaker speech.
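
The clustering step described above can be illustrated with a minimal sketch. It assumes that the clustering is k-means with two clusters and that each time-frequency bin of the mixture already has an embedding vector. The abstract does not give the network architecture, embedding dimension, or features, so the embeddings here are synthetic stand-ins rather than BLSTM outputs, and all names (`kmeans_two_clusters`, `binary_masks`, `true_owner`, `anchors`) are hypothetical.

```python
import numpy as np

def kmeans_two_clusters(points, n_iter=20):
    """Plain k-means with k=2 on an (N, D) array; returns labels in {0, 1}."""
    # Deterministic initialisation: the first point and the point farthest from it.
    c0 = points[0]
    c1 = points[np.argmax(np.linalg.norm(points - c0, axis=1))]
    centers = np.stack([c0, c1]).astype(float)
    labels = np.zeros(len(points), dtype=int)
    for _ in range(n_iter):
        # Assignment step: nearest centre for every point.
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        # Update step: move each centre to the mean of its assigned points.
        for k in range(2):
            if np.any(labels == k):
                centers[k] = points[labels == k].mean(axis=0)
    return labels

def binary_masks(embeddings, n_frames, n_bins):
    """Cluster per-bin embeddings and return two complementary binary T-F masks."""
    labels = kmeans_two_clusters(embeddings).reshape(n_frames, n_bins)
    return labels == 0, labels == 1

# Synthetic stand-in for network output (assumption: not the thesis's data).
rng = np.random.default_rng(0)
n_frames, n_bins, dim = 4, 6, 3
# Checkerboard pattern of which speaker dominates each time-frequency bin.
true_owner = np.add.outer(np.arange(n_frames), np.arange(n_bins)) % 2
anchors = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
noise = 0.05 * rng.standard_normal((n_frames * n_bins, dim))
embeddings = anchors[true_owner.reshape(-1)] + noise

# Apply the masks to a (synthetic) mixture magnitude spectrogram.
mixture_mag = rng.random((n_frames, n_bins))
mask_a, mask_b = binary_masks(embeddings, n_frames, n_bins)
est_a = mixture_mag * mask_a
est_b = mixture_mag * mask_b
```

Because the two masks are complementary, every time-frequency bin is assigned to exactly one speaker, and the two masked spectrograms sum back to the mixture magnitude. Turning each masked spectrogram into a waveform would additionally require the mixture phase and an inverse short-time Fourier transform, which this sketch omits.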

Keywords/Search Tags:

Voiceprint recognition, speech separation and clustering, bidirectional long short-term memory neural network, convolutional neural network, time-frequency masking

Related items

1	Speech Enhancement Based On Optimized Full Convolution And Long-short Term Memory Network
2	Research On Mandarin Speech Recognition Technology Based On Deep Neural Network
3	Speaker Emotional State Recognition Based On Speech And Text Fusion
4	The Research Of Dimensional Speech Emotion Recognition Based On Neural Network And Fusion Features
5	Chinese Sign Language Recognition Based On Convolutional Network And Long Short Term Memory Network
6	Amdo Tibetan Speech Recognition Based On Deep Neural Network
7	Application Of Short Term And Long Term Memory Neural Network In Stock Trend Prediction
8	Identification Of Dynamic Parameters Of Manipulator Based On Bidirectional Long-Short-Time Neural Network
9	Long Short Term Memory Recurrent Neural Network Application To Handwritten Recognition
10	Research On Shared Bicycle Stock Prediction Based On Long-term And Short-term Memory Neural Network