The Research On Cross-lingual Speaker Recognition Based On Language-adversarial Training

Posted on:2019-03-24

Degree:Master

Type:Thesis

Country:China

Candidate:J Gao

GTID:2428330563491553

Subject:Information and Communication Engineering

Abstract/Summary:

In recent years, as science and technology have advanced, people have paid increasing attention to information security and identity authentication in daily life. The leakage of personal and confidential information not only poses a serious threat to the personal and property safety of individuals but also adversely affects the development of society. Traditional means of identity authentication, such as identity documents and passwords, can no longer fully meet people's needs. Biometric-based identity authentication has attracted growing attention because it is convenient and reliable. Because speech is the most direct and convenient way for people to communicate in everyday life, speaker recognition, which derives from speech signal processing, has become a focus of research.

With the advent of globalization, in an era of multiple ethnicities, nationalities, and cultures, a single language can no longer meet people's needs in daily life, work, and study. China has a vast territory and is a multi-ethnic country with many ethnic minority languages and regional dialects. Dialects and ethnic languages spoken in southern and western China, such as Cantonese, Tibetan, and Uyghur, differ greatly from Mandarin. In these regions especially, cross-lingual speaker recognition problems are particularly prominent in identity authentication, public security criminal investigation, and national defense security.

In this thesis, a cross-lingual speaker recognition algorithm based on language-adversarial training is proposed. Applying the idea of adversarial training from transfer learning improves the ability to extract speaker information from speech, thereby improving the accuracy of speaker recognition in cross-lingual tasks. The main work and contributions of this thesis are as follows:

1. A combination of a convolutional neural network and a time-delay neural network is applied to speaker recognition. The expressive power of deep neural networks is used to construct an end-to-end model for the speaker recognition task, and its validity is verified on a cross-lingual speaker recognition database. Experiments show that the convolutional time-delay neural network effectively extracts speaker information from speech and can be used effectively for both same-language and cross-language speaker recognition.

2. A speaker recognition algorithm based on language-adversarial training is proposed. Drawing on the idea of adversarial training in transfer learning, language information is added to the training of the end-to-end speaker recognition network, and the entire network is trained with a language-adversarial method. The algorithm inherits the convolutional time-delay neural network's ability to extract speaker information from speech while reducing the interference of language information in the features extracted by the hidden layers, which effectively improves the accuracy of cross-lingual speaker recognition.

3. The triplet loss function is used to train the deep neural networks. It replaces the cross-entropy loss so that both speaker information and language information are taken into account during training. This method further improves the accuracy of cross-lingual speaker recognition.
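
The abstract does not give the loss functions themselves, so the following is only a minimal sketch of the two ideas it names, under stated assumptions. It assumes the language-adversarial objective takes the common domain-adversarial form (speaker loss minus a weighted language loss for the shared feature extractor) and that the triplet loss is the standard margin-based form on squared Euclidean distances. All function names, the weight `lam`, and the `margin` value are hypothetical and do not come from the thesis.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy of one example, computed from raw class scores."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def adversarial_objectives(spk_logits, spk_label, lang_logits, lang_label, lam=0.5):
    """Return (feature_extractor_objective, language_classifier_objective).

    Assumed form: the language classifier minimises its own loss, while the
    shared feature extractor minimises speaker loss MINUS lam * language loss,
    which pushes the hidden features to carry less language information.
    """
    spk_loss = softmax_cross_entropy(spk_logits, spk_label)
    lang_loss = softmax_cross_entropy(lang_logits, lang_label)
    return spk_loss - lam * lang_loss, lang_loss


def squared_distance(a, b):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard margin-based triplet loss (assumed form).

    anchor and positive are embeddings of the same speaker; negative is an
    embedding of a different speaker. The loss is zero once the negative is
    farther from the anchor than the positive by at least `margin`.
    """
    d_pos = squared_distance(anchor, positive)
    d_neg = squared_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)
```

One way to bring language into the triplet, in the spirit of contribution 3, would be to pick the positive from the same speaker in a different language, so that the embedding is pulled together across languages. That sampling strategy is an illustration only; the abstract does not say how triplets are selected.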

Keywords/Search Tags:

Speaker Recognition, Cross-linguistic, Deep Neural Networks, Adversarial Training, Triplet Loss

Related items

1	Speaker Recognition Algorithm Based On Deep Learning
2	Triplet Loss And Manifold Dimensionality Reduction Based Method For Text-independent Speaker Recognition
3	The Application Of Speaker Recognition Technology Based On Deep Learning
4	Research On Speaker Recognition Algorithm Based On Deep Convolutional Neural Network
5	Research On Loss Functions In Neural Networks For Speaker Recognition
6	Speaker Recognition Based On UBM And Deep Learning
7	Research On Deep Learning Based Speaker Recognition Algorithm
8	Research On Voiceprint Recognition Model Based On End-to-end Neural Network
9	Research On Adversarial Training Methods Of Deep Neural Networks
10	Research On Cross-Modal Hashing Based On Deep Learning