Speaker Recognition Based On Fusion Of RBPF And DNN

Posted on: 2022-02-21
Degree: Master
Type: Thesis
Country: China
Candidate: X X Zhang
Full Text: PDF
GTID: 2518306785475104
Subject: Automation Technology
Abstract/Summary:
In the era of intelligence, speaker recognition technology can bring users more personalized services. Research on speaker recognition with deep learning as the theoretical framework has made important breakthroughs, but the performance of speaker recognition systems still faces challenges in practical applications. In particular, noise has become an important factor hindering the future commercial development of speaker recognition technology. This thesis therefore focuses on how to improve speaker recognition performance in noisy environments. The main contents are as follows:

(1) A speaker recognition method based on the fusion of a Rao-Blackwellised particle filter (RBPF) and a Deep Belief Network (DBN) is proposed. First, the method uses the Discrete Fourier Transform to convert the noisy speech signal into the frequency domain, and an RBPF based on a low-order time-varying auto-regressive (TVAR) model then enhances the spectrum data of each frame. Next, principal component analysis (PCA) whitening is applied to the enhanced spectrum data, and the principal component features are extracted as the input of the DBN. Finally, the speaker's identity is determined through the DBN's adaptive feature learning. Compared with speaker recognition methods based on other speech enhancement methods, the proposed method achieves better recognition performance under different noise conditions. Compared with the time-domain RBPF enhancement method, the frequency-domain RBPF enhancement method improves running speed by about 80%.

(2) A feature extraction method based on a Stacked Extended Denoising Autoencoder (SEDAE) network is proposed. This method introduces auxiliary information about the speaker through label constraints. On the basis of unsupervised representation learning, the parameters of the coding layer are fine-tuned with supervised learning. Features extracted in this way are not only representative but also better suited to classification. Because parameter adjustment is difficult in traditional semi-supervised learning methods, the thesis proposes defining the learning process of the auto-encoder network with label constraints as a meta-optimization problem, and uses a meta-learning method to realize self-learning of the network. Finally, the features extracted by the SEDAE are fed into a Convolutional Neural Network (CNN) for speaker recognition. Compared with the baseline methods, the proposed method shows better robustness. Test results on a corpus under 10 dB white noise show that the average recognition rate of the proposed method is about 95%, which is better than that of the speaker recognition method based on RBPF and DBN.

(3) A speaker recognition system platform is built to realize voiceprint control of an access control system. First, the speaker recognition algorithm and the graphical user interface are implemented in MATLAB, so that users can complete speaker recognition through voice recording. Then a relay control module is designed based on the STC89C52 MCU. The MCU receives the command signal from the PC terminal, then checks and decodes the command data to switch the relay on and off. Finally, the score-level fusion model of the speaker recognition algorithms is tested on the system, and the optimal fusion factor is determined.
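
The score-level fusion and fusion-factor search mentioned in part (3) can be illustrated with a minimal Python sketch. The abstract does not give the fusion rule, so the weighted sum `alpha * a + (1 - alpha) * b`, the grid search over `alpha`, and all function names below are assumptions made for illustration, not the thesis's implementation (which is built in MATLAB). Both systems are assumed to output one score per enrolled speaker on a comparable scale.

```python
from typing import List, Sequence, Tuple


def fuse_scores(scores_a: Sequence[float], scores_b: Sequence[float],
                alpha: float) -> List[float]:
    """Assumed fusion rule: weighted sum of two per-speaker score vectors."""
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(scores_a, scores_b)]


def accuracy(trials_a: Sequence[Sequence[float]],
             trials_b: Sequence[Sequence[float]],
             labels: Sequence[int], alpha: float) -> float:
    """Fraction of trials whose highest fused score is the true speaker."""
    correct = 0
    for scores_a, scores_b, label in zip(trials_a, trials_b, labels):
        fused = fuse_scores(scores_a, scores_b, alpha)
        # Index of the speaker with the highest fused score.
        predicted = max(range(len(fused)), key=fused.__getitem__)
        correct += int(predicted == label)
    return correct / len(labels)


def best_fusion_factor(trials_a: Sequence[Sequence[float]],
                       trials_b: Sequence[Sequence[float]],
                       labels: Sequence[int],
                       steps: int = 10) -> Tuple[float, float]:
    """Grid-search alpha over [0, 1]; return (alpha, accuracy).

    On ties, the smallest alpha that reaches the best accuracy is kept.
    """
    best_alpha, best_acc = 0.0, -1.0
    for i in range(steps + 1):
        alpha = i / steps
        acc = accuracy(trials_a, trials_b, labels, alpha)
        if acc > best_acc:
            best_alpha, best_acc = alpha, acc
    return best_alpha, best_acc
```

In practice the fusion factor would be chosen on held-out validation trials rather than on the test set, and the two systems' scores would be normalised to a common range before fusing.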
Keywords/Search Tags: Speaker recognition, deep neural network, Rao-Blackwellised particle filter, Stacked Extended Denoising Autoencoder, voice access control system