Research on Robust Speaker Recognition in a Deep Learning Framework

Posted on: 2021-03-23
Degree: Master
Type: Thesis
Country: China
Candidate: C F Ma
GTID: 2428330647961455
Subject: Control engineering
Abstract/Summary:
Speaker recognition, an important branch of biometric recognition, can be widely applied in military security, public security and justice, and biomedical engineering systems. In a quiet laboratory environment with sufficient voice data, speaker recognition technology already achieves satisfactory results. However, real application environments cannot be predicted, which leads to poor robustness of speaker recognition systems. To improve robustness, this thesis addresses three problems: insufficient feature expression in speaker recognition, inadequate model discrimination, and the independent training of each module in traditional methods. The following solutions are proposed: 1) a robust speaker recognition method based on deep and shallow feature fusion; 2) a robust back-end classification decision method for speaker recognition based on deep models; 3) a robust speaker recognition method based on end-to-end joint optimization and decision-making.

1) To address insufficient feature expression in speaker recognition, a robust speaker recognition method based on deep and shallow feature fusion is proposed. This method takes block MFCC features processed by a deep neural network as the deep features and uses the Gaussian supervector as the shallow feature. It then fuses the two feature vectors to obtain a more robust fused feature. Because deep and shallow features reflect speaker information at different levels, effective fusion can make them complementary and characterize the speaker more fully. To make better use of the correlation between feature blocks, a voting decision mechanism is introduced in the decision process to further improve the robustness of the system.

2) To address inadequate model discrimination in speaker recognition, a robust back-end classification decision method based on deep models is proposed. Starting from the Gaussian supervector, the method explores the effect of using different deep models as the back end of speaker recognition to classify traditional speaker features, and seeks the best classification model. The strong classification ability of the deep model allows it to extract the deeper and more valuable information contained in speech segments, yielding a more robust speaker recognition system.

3) To address the independent training of each module in traditional methods, a robust speaker recognition method based on end-to-end joint optimization and decision-making is proposed. First, a custom filter replaces the convolution kernel in the convolutional network. Second, a deep residual network based on a self-attention mechanism is built. Finally, by unifying the feature extraction and model matching of traditional speaker recognition in a single deep model structure, the parameters are jointly optimized and system performance in noisy environments is improved.
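The fusion and voting steps described in item 1) can be illustrated with a small sketch. It is not the thesis implementation: the abstract does not say how the two vectors are fused or how blocks are scored, so the concatenation of L2-normalized vectors, the dot-product scoring against per-speaker templates, and all function names and toy values below are assumptions made for illustration only.

```python
import math
from collections import Counter


def l2_normalize(vec):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return list(vec) if norm == 0 else [x / norm for x in vec]


def fuse(deep_vec, shallow_vec):
    """Assumed fusion rule: normalize each view, then concatenate them."""
    return l2_normalize(deep_vec) + l2_normalize(shallow_vec)


def classify_block(fused_vec, speaker_templates):
    """Pick the speaker whose template has the largest dot product with the block."""
    def score(template):
        return sum(a * b for a, b in zip(fused_vec, template))
    return max(speaker_templates, key=lambda spk: score(speaker_templates[spk]))


def vote(block_labels):
    """Majority vote over per-block decisions; ties go to the label seen first."""
    return Counter(block_labels).most_common(1)[0][0]


def recognize(blocks, speaker_templates):
    """Classify every (deep, shallow) block pair, then vote across blocks."""
    labels = [classify_block(fuse(d, s), speaker_templates) for d, s in blocks]
    return vote(labels), labels


# Toy data (hypothetical): 2-dimensional deep and shallow vectors per block.
templates = {
    "alice": fuse([1.0, 0.0], [1.0, 0.0]),
    "bob": fuse([0.0, 1.0], [0.0, 1.0]),
}
blocks = [
    ([0.9, 0.1], [0.8, 0.2]),  # looks like alice
    ([0.2, 0.8], [0.6, 0.4]),  # noisy block, looks like bob
    ([0.7, 0.3], [0.9, 0.1]),  # looks like alice
]
print(recognize(blocks, templates))  # ('alice', ['alice', 'bob', 'alice'])
```

In this toy run one block is misclassified in isolation, but the vote across blocks still returns the majority label, which is the kind of robustness gain the voting mechanism is meant to provide.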
Keywords/Search Tags: speaker recognition, deep learning, robustness, feature expression, model matching, end-to-end