Speaker Recognition Based On Additive Margin Loss

Posted on:2020-11-20

Degree:Master

Type:Thesis

Country:China

Candidate:L Fan

Full Text:PDF

GTID:2428330575958252

Subject:Computer technology

Abstract/Summary:

With the development of deep learning, speaker recognition research has undergone a paradigm shift from i-vectors to deep neural networks. However, existing speaker recognition methods based on deep neural networks are still not discriminative enough. Deep neural networks provide strong modeling capacity, and the training criterion is important for exploiting that capacity. This thesis analyses and compares recent work on additive-margin-based training criteria for deep learning and proposes three new additive-margin methods that increase accuracy or efficiency for three speaker recognition tasks: speaker verification, speaker identification, and speaker retrieval.

1. Existing classification training criteria cannot provide enough discrimination for speaker verification. This thesis proposes Ensemble Additive Margin Embedding (EAME) to tackle this problem. The work introduces ensemble embedding layers for speaker verification. EAME employs an additive margin to improve the discrimination of the speaker verification model, and ensemble embedding layers to enhance the robustness of the deep speaker representations. Experiments show that EAME outperforms existing speaker verification methods and achieves state-of-the-art performance.

2. This thesis proposes Ensemble Additive Margin Classifier (EAMC) to improve the discrimination of speaker identification models. The work introduces ensemble classifiers and hard example mining for speaker identification. EAMC employs an additive margin to improve the discrimination of the model, and hard example mining to make the model focus on hard training samples. The ensemble classifiers in EAMC improve the capacity of the model and smooth the change in difficulty during training. Experiments show that EAMC outperforms existing speaker identification methods and achieves state-of-the-art performance.

3. Existing hash-based speaker retrieval methods mostly adopt a two-stage learning procedure that learns hash codes after extracting representations with i-vectors. On the one hand, this two-stage procedure is harmful to hash learning. On the other hand, the performance of the learnt hash codes is limited by the i-vector. This thesis proposes Deep Additive Margin Hashing (DAMH) to tackle these problems. More specifically, DAMH employs deep hashing to enhance the feedback between feature learning and binary code learning. Furthermore, DAMH adopts an additive margin to improve the discrimination of the learnt binary codes. As far as we know, DAMH is the first deep hashing method for speaker retrieval. Experiments show that DAMH outperforms existing speaker retrieval methods and achieves state-of-the-art performance.
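The abstract does not state the loss formula, so the following is only a minimal sketch of the commonly used additive-margin softmax formulation, which may differ from the exact criterion used in the thesis. It assumes L2-normalised embeddings and class vectors, so that each logit is a cosine similarity, and subtracts a margin m from the target-class cosine before scaling by s. The function name `am_softmax_loss`, the values s=30 and m=0.35, and the random toy data are illustrative assumptions, not taken from the thesis.

```python
import numpy as np

def am_softmax_loss(embeddings, weights, labels, s=30.0, m=0.35):
    """Mean additive-margin softmax loss over a batch (illustrative sketch).

    embeddings: (batch, dim) speaker embeddings
    weights:    (num_speakers, dim) one class vector per speaker
    labels:     (batch,) integer speaker ids
    s, m:       scale and margin (assumed values, not from the thesis)
    """
    # L2-normalise both sides so the dot product equals cos(theta).
    x = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    w = weights / np.linalg.norm(weights, axis=1, keepdims=True)
    cos = x @ w.T  # shape (batch, num_speakers)

    # Subtract the margin from the target-class cosine only, then scale.
    rows = np.arange(len(labels))
    logits = s * cos
    logits[rows, labels] = s * (cos[rows, labels] - m)

    # Numerically stable log-softmax, then pick the target-class term.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[rows, labels].mean()

# Toy data: 4 utterances, 3 speakers, 8-dimensional embeddings.
rng = np.random.default_rng(0)
emb = rng.normal(size=(4, 8))
W = rng.normal(size=(3, 8))
y = np.array([0, 2, 1, 0])

loss_margin = am_softmax_loss(emb, W, y, m=0.35)
loss_plain = am_softmax_loss(emb, W, y, m=0.0)  # ordinary cosine softmax
print(loss_margin, loss_plain)
```

With the same inputs, the margin version always yields a larger loss than the m=0 version, because the target logit is lowered while the others stay fixed. The model can only reduce this loss by pushing the target-class cosine further above the competing classes, which is the sense in which an additive margin encourages more discriminative embeddings.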

Keywords/Search Tags:

speaker recognition, additive margin, ensemble strategy, hashing

Related items

1	End-to-End Speaker Embedding For Speaker Recognition In The Wild
2	Research On Feature Extraction And Model Algorithm For Speaker Recognition
3	Speaker Recognition Research Based On Clustering Analysis And Neural Network Ensemble
4	Research On Speaker Separation And Recognition Of Conference Voice Based On Voiceprint Recognition
5	Research And Application On Simultaneous Recognition Of Both Speech And Speaker
6	Research On Some Key Issues Of Speaker Recognition In Noisy Environment
7	Studies On Speaker Recognition Based On SVM And GMM
8	Speaker Recognition Based On Affinity Propagation Clustering And Ensemble Learning
9	Research On Speaker Recognition Based On Discriminative Feature Learning
10	Landmark Images Recognition With Deep Features