
Application Of Deep Learning And Supervector In Speaker Recognition

Posted on: 2019-05-20  Degree: Master  Type: Thesis
Country: China  Candidate: W Z Li  Full Text: PDF
GTID: 2428330545971517  Subject: Computer Science and Technology
Abstract/Summary:
Speaker recognition (also known as voiceprint recognition), like fingerprint recognition and face recognition, is a branch of biometrics. Compared with other biometric techniques, it has its own advantages and disadvantages. To improve the accuracy of speaker recognition, the Gaussian supervector is introduced. Although the Gaussian supervector contains almost all of a speaker's information, it also contains a great deal of irrelevant information. How to reduce the dimension of the Gaussian supervector is the focus of this thesis. Traditional linear dimensionality reduction algorithms, such as principal component analysis (PCA) and factor analysis (FA), are widely used because they are simple and efficient. However, when these algorithms reduce the dimension of supervectors, they discard the nonlinear characteristics of the data and retain only the linear features. Deep learning algorithms can retain the nonlinear characteristics of the data, so they can be applied to the dimensionality reduction of speaker supervectors.

The main work and innovations of this thesis are as follows:

(1) Traditional linear dimensionality reduction algorithms are studied, used to reduce the dimension of the Gaussian supervector, and applied in a complete speaker recognition system. Speaker recognition systems generally use a distance between voice samples to judge their similarity: the smaller the distance, the more similar the samples. Different distance measures give different similarity results. This thesis uses probabilistic linear discriminant analysis (PLDA) to score sample similarity.

(2) The restricted Boltzmann machine, a deep learning technique, is introduced into speaker recognition. Its strong capacity for deep information extraction and nonlinear modeling is used to extract a better I-vector, which is then applied in the speaker recognition system.
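The linear baseline described in item (1) can be illustrated with a minimal sketch. It is not the thesis's implementation. It assumes that synthetic random vectors stand in for Gaussian supervectors, and it uses cosine similarity as a simple stand-in for the PLDA scoring that the thesis actually uses. The function names (`fit_pca`, `project`, `cosine_score`) and all sizes are hypothetical, chosen only for illustration.

```python
import numpy as np

def fit_pca(X, n_components):
    """Fit linear PCA on the rows of X; return (mean, components)."""
    mean = X.mean(axis=0)
    # SVD of the centered data: rows of Vt are the principal directions,
    # ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

def project(X, mean, components):
    """Project rows of X onto the retained principal directions."""
    return (X - mean) @ components.T

def cosine_score(a, b):
    """Cosine similarity (a stand-in here for PLDA scoring)."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Synthetic stand-in data (assumption): each "supervector" is a
# speaker-specific mean plus per-utterance noise.
rng = np.random.default_rng(0)
n_speakers, utts_per_speaker, dim = 5, 8, 512
speaker_means = rng.normal(0.0, 1.0, size=(n_speakers, dim))
labels = np.repeat(np.arange(n_speakers), utts_per_speaker)
supervectors = speaker_means[labels] + rng.normal(0.0, 0.5, size=(labels.size, dim))

# Reduce 512-dimensional supervectors to 4 dimensions.
mean, components = fit_pca(supervectors, n_components=4)
reduced = project(supervectors, mean, components)

# Compare a same-speaker pair with a different-speaker pair.
same_speaker = cosine_score(reduced[0], reduced[1])
diff_speaker = cosine_score(reduced[0], reduced[utts_per_speaker])
print("same speaker:", same_speaker)
print("different speakers:", diff_speaker)
```

Because the projection is linear, any nonlinear structure in the supervectors is lost at this step, which is the limitation that motivates the restricted Boltzmann machine approach in item (2).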
Keywords/Search Tags:Speaker recognition, SuperVector, dimensionality reduction, Deep Learning, Restricted Boltzmann Machines