Research On Speaker Recognition Algorithm Based On Dictionary Learning

Posted on: 2020-06-21
Degree: Master
Type: Thesis
Country: China
Candidate: Z Y Li
GTID: 2428330590495816
Subject: Electronic and communication engineering
Abstract/Summary:
Speaker recognition identifies who is speaking by analyzing the speaker-specific characteristics of the voice. It is an interdisciplinary research area that draws on psychology, physiology, digital signal processing, pattern recognition, and artificial intelligence, and it is used wherever identity authentication is required, including security, Internet applications, communications, and call centers. The technology has become increasingly mature, but in real-world environments the system's robustness to noise remains a major problem and a significant obstacle to wider deployment. This thesis studies that problem. It analyzes the advantages and disadvantages of the classical GMM-UBM method and the i-vector method against the background of speaker recognition and sparse decomposition techniques, and it focuses on applying sparse decomposition to speaker recognition. The main goals are to improve the recognition rate of speaker recognition systems in noisy environments, to reduce memory use, and to increase computation speed.

First, the thesis reviews the basic principles of speech signals and speaker recognition. It introduces in detail the acoustic mechanism of speech production, signal preprocessing (endpoint detection, framing, windowing), the extraction of common features such as MFCC, and the metrics used to evaluate the system's recognition rate. The classical GMM-UBM system model and its algorithms are also analyzed.

Then, the thesis analyzes the i-vector framework, currently the most popular in the field, and describes and validates the underlying principle and extraction method of i-vectors. Probabilistic linear discriminant analysis (PLDA) is also described, together with several channel compensation methods for i-vectors: linear discriminant analysis (LDA) transformation, length normalization, and data whitening. Experiments on the TIMIT speech corpus show that the recognition rate is high on clean speech but that robustness is poor in noisy environments.

Next, the thesis proposes a speaker recognition system based on dictionary learning and low-rank matrix decomposition (LRSDL). Inspired by applications of dictionary learning and low-rank matrix decomposition in image processing and speech enhancement, the speakers' i-vectors are used as dictionary atoms, and low-rank matrix decomposition is added to the dictionary learning. This yields a low-rank dictionary that captures the features common to the speakers' i-vectors together with the noise, so that the projection of a test utterance's i-vector onto the speaker sub-dictionaries is less affected by the common and noise components, which improves recognition accuracy.

Finally, because both methods achieve low recognition rates at low signal-to-noise ratios (SNR), the thesis proposes two solutions. First, for experiments in low-SNR environments, training no longer uses only speech corrupted by the corresponding noise; instead, i-vectors are extracted from a mixture of noisy and clean speech in a certain proportion. Second, a speaker recognition system that fuses LRSDL and i-vector is proposed, in which the scores of the i-vector/PLDA and LRSDL methods are combined by a weighted average at the scoring stage. The results show that both solutions effectively improve system performance in low-SNR environments.
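The two low-SNR ingredients mentioned above, noisy speech at a controlled SNR and weighted score fusion, can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not say how noise was added, what mixing proportion was used, or what fusion weight was chosen. The function names `mix_at_snr`, `fuse_scores`, and `identify`, the default `weight=0.5`, and the assumption that both scoring methods return per-speaker scores on a comparable scale where higher means a better match are all hypothetical.

```python
import math


def mix_at_snr(clean, noise, snr_db):
    """Add `noise` to `clean` after scaling the noise to the requested SNR (dB)."""
    if len(clean) != len(noise) or not clean:
        raise ValueError("clean and noise must be non-empty and equally long")
    p_clean = sum(x * x for x in clean) / len(clean)  # mean signal power
    p_noise = sum(x * x for x in noise) / len(noise)  # mean noise power
    if p_noise == 0.0:
        raise ValueError("noise has zero power")
    # SNR_dB = 10 * log10(p_clean / (gain**2 * p_noise)), solved for gain.
    gain = math.sqrt(p_clean / (p_noise * 10.0 ** (snr_db / 10.0)))
    return [c + gain * n for c, n in zip(clean, noise)]


def fuse_scores(plda_scores, lrsdl_scores, weight=0.5):
    """Weighted average of two per-speaker score lists (assumed comparable)."""
    if len(plda_scores) != len(lrsdl_scores):
        raise ValueError("both systems must score the same set of speakers")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return [weight * p + (1.0 - weight) * l
            for p, l in zip(plda_scores, lrsdl_scores)]


def identify(fused_scores):
    """Return the index of the speaker with the highest fused score."""
    return max(range(len(fused_scores)), key=fused_scores.__getitem__)
```

In practice the two score types would first need to be brought onto a common scale (for example by normalizing each system's scores), and the weight would be tuned on held-out data; neither step is described in the abstract.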
Keywords/Search Tags: speaker recognition, discriminant dictionary, i-vector, low-rank matrix