Research And Application Of Speaker Recognition Technology Based On Deep Learning

Posted on: 2022-04-28

Degree: Master

Type: Thesis

Country: China

Candidate: Z Chen

Full Text: PDF

GTID: 2558306914462284

Subject: Electronic and communication engineering

Abstract/Summary:

Speaker recognition is a biometric technology that determines a speaker's identity from the speaker-specific characteristics in the voice signal. Because voice is cheap to collect and well accepted by users, speaker recognition has been widely used in fields such as mobile payment and community security. Speaker recognition based on deep learning has achieved impressive performance. However, factors such as environmental noise, channel mismatch and the emotional state of the speaker still limit the performance of speaker recognition systems to some extent. The goal of this thesis is to apply deep learning to design a robust speaker recognition system and to improve recognition performance in complex environments. The speaker recognition systems proposed in this thesis are tested in noisy, cross-channel and cross-language scenarios, and the results show that they are robust across a variety of working environments. The main contents of this thesis are as follows:

1. A speaker recognition method based on a two-dimensional convolutional neural network is proposed, in which a channel or spatial attention module is integrated into the frame-level feature extraction module. In addition, the GhostVLAD method is applied in the utterance-level feature aggregation module. GhostVLAD also operates on the channel dimension of the features, so it can be combined with the attention mechanism in the frame-level feature extraction module to further improve the performance of the speaker recognition system.

2. Building on ECAPA-TDNN, an existing time-delay neural network model based on one-dimensional convolutions, a variety of attention modules are integrated into the model, and the prototypical network loss function is applied in the optimization stage of the neural network to improve the performance of the speaker recognition system.

3. Considering the decisive effect of the training process on the performance of a speaker recognition system, this thesis systematically studies the two kinds of loss functions commonly used in speaker recognition, those based on classification and those based on metric learning. It compares their influence on system performance through experiments and analyzes the reasons for the differences.
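The second contribution above trains the network with a prototypical network loss. As a rough illustration only, and not the thesis's implementation, the NumPy sketch below computes such a loss for a single training episode: each speaker's prototype is the mean of that speaker's support embeddings, and each query embedding is scored by a softmax over negative squared Euclidean distances to the prototypes. The function name, the array shapes and the choice of squared Euclidean distance are assumptions made for this example.

```python
import numpy as np


def prototypical_loss(support, query):
    """Prototypical loss for one episode (illustrative sketch).

    support: array of shape (n_speakers, n_support, dim), embeddings used
             to build one prototype per speaker.
    query:   array of shape (n_speakers, dim); query[i] is assumed to
             belong to speaker i.
    Returns the mean cross-entropy of the queries as a Python float.
    """
    # One prototype per speaker: the mean of its support embeddings.
    prototypes = support.mean(axis=1)                      # (n_speakers, dim)

    # Squared Euclidean distance between every query and every prototype.
    diff = query[:, None, :] - prototypes[None, :, :]
    dist = (diff ** 2).sum(axis=-1)                        # (n_query, n_speakers)

    # Softmax over negative distances, computed in a numerically stable way.
    logits = -dist
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # The correct class of query i is speaker i.
    idx = np.arange(query.shape[0])
    return float(-log_prob[idx, idx].mean())
```

When the speakers' embeddings are well separated the loss approaches zero, and when all embeddings coincide it equals the logarithm of the number of speakers, which is the loss of a uniform guess.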

Keywords/Search Tags:

speaker recognition, deep learning, attention mechanism, loss function

Related items

1	Research And Implementation Of Speaker Recognition Technology In Complex Scenes
2	Research On Deep Learning Based Speaker Recognition Algorithm
3	Research And Application Of Speaker Recognition Based On Deep Learning
4	Study On Speaker Recognition Based On Deep Learning
5	Design And Research Of Speaker Recognition System Based On Speech Enhancement
6	Research On Loss Functions In Neural Networks For Speaker Recognition
7	Research On Key Technologies Of Speaker Recognition Based On Deep Learning
8	The Application Of Speaker Recognition Technology Based On Deep Learning
9	Speaker Recognition Method Based On Deep Learning
10	Text Independent Speaker Recognition Based On Deep Learning Framework