Speaker Recognition System Based On Deep Learning

Posted on:2019-12-16

Degree:Master

Type:Thesis

Country:China

Candidate:J D Zhang

Full Text:PDF

GTID:2428330545964167

Subject:Engineering

Abstract/Summary:

With the continuous development of speech recognition technology, speaker recognition has received increasing attention as an important method of identity authentication. Speaker recognition, also known as voiceprint recognition, identifies a speaker by extracting features that characterize the speaker's identity from the speech signal. As a biometric authentication technology, it has important research value and broad research prospects. Deep learning has driven great progress in speech recognition, and speaker recognition has been strongly affected as well: more and more researchers have shifted from traditional methods based on probability and statistics to deep learning methods. Inspired by end-to-end models, this thesis uses a deep neural network to extract deep speaker features and improves the network for the case of limited training data. A speaker recognition system is built using a time-delay neural network (TDNN) and a PLDA back-end model. The improved network consists of eight hidden layers and one pooling layer. The pooling layer aggregates the output of the preceding hidden layer over time, computes its mean and standard deviation, and passes these statistics as input to the next hidden layer. During speaker enrollment, the output of the last hidden layer of the trained network is used as the speaker feature. In the test phase, the same representation vectors are extracted and averaged, and the PLDA model is then used for scoring. Instead of single-frame MFCC features, features spliced at a fixed step size are used as the network input to capture long-term speech characteristics. Finally, the improved network model is compared with the traditional i-vector method: the EER is reduced by 2.4% on the noisy dataset, by 0.8% on the female test set in the gender-dependent test, and by 13.8% on the test set that includes Chinese.
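The two signal-level steps named in the abstract, frame splicing and statistics pooling, can be illustrated with a minimal sketch. It is not the thesis implementation. The function names, the context width, the edge padding by frame repetition, and the use of the population standard deviation are all assumptions made for illustration, since the abstract does not specify them. In the described network the pooling would operate on hidden-layer activations, whereas here it is applied to any sequence of per-frame vectors.

```python
import math
from typing import List

Frame = List[float]  # one feature vector per time step (e.g. MFCCs)


def splice_frames(frames: List[Frame], context: int = 2, step: int = 1) -> List[Frame]:
    """Concatenate each frame with neighbours at offsets -context..+context.

    Offsets are multiplied by `step` (assumed meaning of "step size").
    Edges are padded by repeating the first or last frame (an assumption).
    """
    n = len(frames)
    spliced: List[Frame] = []
    for t in range(n):
        window: Frame = []
        for k in range(-context, context + 1):
            idx = min(max(t + k * step, 0), n - 1)  # clamp index to [0, n-1]
            window.extend(frames[idx])
        spliced.append(window)
    return spliced


def statistics_pooling(hidden: List[Frame]) -> Frame:
    """Aggregate a variable-length sequence into one fixed-length vector.

    Returns the per-dimension mean followed by the per-dimension
    (population) standard deviation, both computed over time.
    """
    n = len(hidden)
    if n == 0:
        raise ValueError("cannot pool an empty sequence")
    dim = len(hidden[0])
    means = [sum(f[d] for f in hidden) / n for d in range(dim)]
    stds = [
        math.sqrt(sum((f[d] - means[d]) ** 2 for f in hidden) / n)
        for d in range(dim)
    ]
    return means + stds


if __name__ == "__main__":
    # Toy input: 3 frames of 2-dimensional features (made-up numbers).
    frames = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    print(splice_frames(frames, context=1))
    print(statistics_pooling(frames))
```

Because the pooled vector has a fixed length of twice the feature dimension regardless of utterance duration, utterances of different lengths map to vectors of the same size, which is what allows the later layers and the PLDA back end to compare them.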

Keywords/Search Tags:

Speaker recognition, TDNN, i-vector, MFCC, PLDA

Related items

1	Speaker Recognition With Emotional Speech
2	Research On Speaker Recognition Over Short Utterance And Varying Channels
3	Research On Algorithms For Speaker Recognition
4	Research On Robustness Of Speaker Recognition In Noisy Environment
5	Research On Speaker Recognition In Noisy Environment
6	Study On Speaker Recognition Technology
7	Research On Speaker Recognition Based On Vector Quantization (VQ)
8	Research And Implementation Of Speaker Recognition System Based On VQ And HMM
9	The Research Of Speaker Recognition Algorithms Based On MFC And Vector Quantization
10	I-vector Normalized Method Based Probabilistic Linear Discrimination Analysis For Speaker Verification Research