Speaker Recognition Algorithm Based On Deep Learning

Posted on:2021-05-31

Degree:Master

Type:Thesis

Country:China

Candidate:L Zheng

Full Text:PDF

GTID:2428330602478820

Subject:Electronic and communication engineering

Abstract/Summary:

Speaker recognition, also known as voiceprint recognition, is a technology that determines a speaker's identity from voiceprint characteristics. It is widely used in many fields and has practical research value. With improvements in computer hardware performance, voiceprint recognition based on deep learning has become one of the mainstream methods. However, deep learning approaches often train only a single speaker classifier to predict labels, or use a simple similarity decision method for model matching, which leaves the trained voiceprint features with insufficient discriminative ability. To extract voiceprint features with strong discriminative ability, this thesis improves the traditional loss function; the network model trained under the supervision of the improved loss function effectively improves speaker recognition accuracy. The content of this thesis is as follows:

1. First, low-dimensional speaker features are extracted from the last hidden layer of a dense network (DenseNet), and the proposed ICTL loss function is used as the objective function of that layer. ICTL is composed of triplet loss and an improved triplet loss (ICL), and it computes the similarity loss between the triplet features extracted at the last hidden layer. Softmax Loss is then used to compute the error between the predicted and true identities of the triplet samples at the last classification layer of DenseNet, with ICTL serving as an auxiliary loss function to Softmax Loss. Under the supervision of ICTL, the voiceprint features output by the last hidden layer have a highly correlated distribution: samples from the same speaker are close to each other, and samples from different speakers are far apart. When the triplet sample features pass through the last classification layer of DenseNet, the speaker recognition effect is greatly improved.

2. DenseNet is again used as the voiceprint feature extractor, and the voiceprint features of the last hidden layer are extracted. The idea of Triplet Center Loss (TCL) is introduced and improved upon: two TCL variants with added intra-class constraints are proposed as the supervision function of the last hidden layer of DenseNet. During training, these further strengthen the constraint on the similarity between the extracted voiceprint features and the feature center of the samples belonging to the same speaker. This improves the discriminative ability of the voiceprint features and the recognition performance of the DenseNet classification layer.
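
The abstract does not give formulas for ICTL, ICL, or the two improved TCL variants, so the following is only a minimal sketch of the standard building blocks it names: triplet loss, softmax cross-entropy, and triplet-center loss, combined in the way the abstract describes (a similarity loss on the last hidden layer acting as an auxiliary term to Softmax Loss). It is written in plain Python for readability. The function names, the squared Euclidean distance, the margin, `aux_weight`, and `intra_weight` are all assumptions made for illustration and are not taken from the thesis.

```python
import math


def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss: pull anchor/positive together, push negative away."""
    return max(0.0, sq_dist(anchor, positive) - sq_dist(anchor, negative) + margin)


def softmax_cross_entropy(logits, label):
    """Cross-entropy of softmax(logits) against an integer class label."""
    peak = max(logits)  # subtract the max for numerical stability
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[label]


def triplet_center_loss(feature, label, centers, margin=1.0, intra_weight=0.0):
    """Standard triplet-center loss, plus an optional center-loss-style term.

    intra_weight is a hypothetical knob that illustrates one possible
    intra-class constraint; it is NOT the formulation from the thesis.
    """
    own = sq_dist(feature, centers[label])
    nearest_other = min(
        sq_dist(feature, c) for j, c in enumerate(centers) if j != label
    )
    return max(0.0, own + margin - nearest_other) + intra_weight * own


def joint_loss(triplet_feats, triplet_logits, labels, aux_weight=0.5, margin=1.0):
    """Mean softmax loss over the three samples plus a weighted triplet loss.

    triplet_feats: (anchor, positive, negative) hidden-layer features.
    triplet_logits: classification-layer outputs for the same three samples.
    aux_weight is an assumed weighting of the auxiliary term.
    """
    anchor, positive, negative = triplet_feats
    classification = sum(
        softmax_cross_entropy(z, y) for z, y in zip(triplet_logits, labels)
    ) / len(labels)
    return classification + aux_weight * triplet_loss(anchor, positive, negative, margin)
```

For example, `triplet_loss([0.0, 0.0], [1.0, 0.0], [1.0, 0.0], margin=0.5)` evaluates to 0.5, because the positive and negative samples are equally far from the anchor and only the margin remains. In practice these losses would be computed on batched DenseNet outputs in a deep learning framework rather than on Python lists.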

Keywords/Search Tags:

Speaker recognition, Dense network, Triplet loss, Triplet center loss

Related items

1	Triplet Loss And Manifold Dimensionality Reduction Based Method For Text-independent Speaker Recognition
2	Research On Face Recognition Based On Machine Learning Method
3	Speaker Recognition Based On UBM And Deep Learning
4	Research On Voiceprint Recognition Model Based On End-to-end Neural Network
5	The Application Of Speaker Recognition Technology Based On Deep Learning
6	Research And Implementation On Flower Image Recognition Based On Deep Learning
7	Research On Speaker Recognition Algorithm Based On Deep Convolutional Neural Network
8	Research And Implementation Of Face Recognition Based On Triplet-awared Center Loss
9	Content-independent Speaker Verification Model and Its Application
10	Research On Deep Learning Based Speaker Recognition Algorithm