Speech Enhancement Algorithm Based On Deep Neural Network

Posted on: 2020-07-24
Degree: Master
Type: Thesis
Country: China
Candidate: Y Wang
Full Text: PDF
GTID: 2428330596985781
Subject: Information and Communication Engineering
Abstract/Summary:
Speech is essential for communication between people and is the natural interface between people and machines. In practice, background noise generally reduces the perceptual quality and intelligibility of speech, which in turn significantly degrades the overall performance of voice communication devices. Speech enhancement aims to reduce the noise level in noisy speech and to improve its quality and intelligibility. With the successful application of deep learning in speech processing and artificial intelligence, supervised speech enhancement based on Deep Neural Networks (DNN) can address a weakness of traditional unsupervised algorithms, which leave excessive residual noise because they rely on unreasonable assumptions about the statistical characteristics of speech and noise. Starting from an analysis of the advantages and shortcomings of existing supervised speech enhancement research, this thesis proposes optimization methods for DNN-based speech enhancement from two aspects: feature extraction and model construction. The specific research contents are as follows:

(1) The research background and significance, the state of research, and the evaluation methods of speech enhancement were introduced. The principles of supervised speech enhancement algorithms based on Non-negative Matrix Factorization (NMF) and on DNN were analyzed in detail, and their specific procedures and training algorithms were described.

(2) To improve the generalization ability of a DNN model that predicts enhanced speech without any assumptions about speech and noise, a speech enhancement algorithm based on a DNN with jointly optimized features was proposed. Considering the auditory characteristics of the human ear and the complementarity between dynamic and static features, the logarithmic power spectrum was first passed through a Mel filter bank to obtain a new feature, the logarithmic power spectrum in the Mel domain. A second feature type, the Mel cepstral coefficients, and the differential (delta) features were then combined with it to form joint features. Finally, the effects of single features and joint features on system performance were studied and compared. The results showed that the joint features effectively improve the enhancement performance of the network, which verifies the benefit of feature combination.

(3) DNN-based speech enhancement that uses a time-frequency mask as the training target ignores the latent structure of speech and therefore cannot estimate the time-frequency mask accurately. To address this problem, an enhancement algorithm that optimizes the DNN model with sparse NMF (SNMF) was proposed, combining the advantages of NMF with the sparsity of speech signals. First, the SNMF-optimized DNN model was designed, and the specific procedure of the resulting speech enhancement system was analyzed. Then, under different noise conditions, three groups of speech enhancement experiments were carried out with SNMF, DNN, and SNMF-DNN as the models. The experiments verified that the SNMF-DNN model improves the output layer's estimate of the time-frequency mask and thereby improves the quality and intelligibility of the enhanced speech, which demonstrates the effectiveness and applicability of the optimized model.
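The joint feature described in item (2) can be illustrated with a minimal NumPy sketch. It is not the thesis's implementation: the frame length, hop size, number of Mel filters, number of cepstral coefficients, the order of the log and Mel steps, and the simple central-difference delta are all assumptions chosen for illustration, and every function name is hypothetical.

```python
import numpy as np

def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular Mel filters, shape (n_mels, n_fft // 2 + 1)."""
    n_bins = n_fft // 2 + 1
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_mels + 2)
    hz_points = mel_to_hz(mel_points)
    bin_freqs = np.linspace(0.0, sample_rate / 2.0, n_bins)
    fb = np.zeros((n_mels, n_bins))
    for m in range(n_mels):
        left, center, right = hz_points[m], hz_points[m + 1], hz_points[m + 2]
        rising = (bin_freqs - left) / (center - left)
        falling = (right - bin_freqs) / (right - center)
        fb[m] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def power_spectrogram(signal, n_fft=512, hop=256):
    """Hann-windowed power spectrum per frame, shape (n_frames, n_fft // 2 + 1)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop:i * hop + n_fft] * window for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2

def dct_ii(x, n_coeffs):
    """Unnormalized DCT-II along the last axis, keeping the first n_coeffs terms."""
    n = x.shape[-1]
    k = np.arange(n_coeffs)[:, None]
    basis = np.cos(np.pi * k * (np.arange(n) + 0.5) / n)
    return x @ basis.T

def delta(features):
    """Central difference across frames (assumed form of the dynamic feature)."""
    padded = np.pad(features, ((1, 1), (0, 0)), mode="edge")
    return (padded[2:] - padded[:-2]) / 2.0

def joint_features(signal, sample_rate=16000, n_fft=512, hop=256, n_mels=40, n_mfcc=13):
    """Concatenate log-Mel power spectrum, Mel cepstral coefficients, and their deltas."""
    power = power_spectrogram(signal, n_fft, hop)
    mel_power = power @ mel_filterbank(n_mels, n_fft, sample_rate).T
    log_mel = np.log(mel_power + 1e-10)          # static feature 1
    mfcc = dct_ii(log_mel, n_mfcc)               # static feature 2
    static = np.concatenate([log_mel, mfcc], axis=1)
    return np.concatenate([static, delta(static)], axis=1)
```

With these assumed defaults, each frame yields 40 log-Mel values and 13 cepstral coefficients plus one delta per static value, so a frame-level DNN would receive a 106-dimensional input vector per frame (before any context-window stacking, which the abstract does not specify).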
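For item (3), the abstract does not say how SNMF is coupled to the DNN, so the sketch below shows only the two generic building blocks it names: a sparse NMF of a non-negative spectrogram and a time-frequency ratio mask. The Euclidean cost with an L1 penalty on the activations, the multiplicative updates, the column normalization of the dictionary, and all function and parameter names are assumptions for illustration and may differ from the thesis.

```python
import numpy as np

def sparse_nmf(V, rank, sparsity=0.1, n_iter=200, seed=0, eps=1e-10):
    """Approximate V (n_bins x n_frames, non-negative) as W @ H.

    Heuristic multiplicative updates for 0.5 * ||V - W H||^2 + sparsity * sum(H).
    Dictionary columns are rescaled to unit L2 norm after each iteration so the
    penalty on H cannot be sidestepped by inflating W.
    """
    rng = np.random.default_rng(seed)
    n_bins, n_frames = V.shape
    W = rng.random((n_bins, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    for _ in range(n_iter):
        # The sparsity term in the denominator shrinks the activations.
        H *= (W.T @ V) / (W.T @ W @ H + sparsity + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        # Move the scale from W into H; the product W @ H is unchanged.
        norms = np.linalg.norm(W, axis=0) + eps
        W /= norms
        H *= norms[:, None]
    return W, H

def ratio_mask(speech_power, noise_power, eps=1e-10):
    """Time-frequency ratio mask in [0, 1] from speech and noise power estimates."""
    return speech_power / (speech_power + noise_power + eps)
```

In a conventional supervised NMF pipeline, speech and noise dictionaries are learned separately, the noisy spectrogram is decomposed over their concatenation, and the two partial reconstructions are passed to `ratio_mask`. Whether the thesis uses SNMF in this way, as a DNN input, or as a constraint on the DNN output is not stated in the abstract.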
Keywords/Search Tags: speech enhancement, deep neural network, feature extraction, sparse non-negative matrix factorization, optimization