Speech Enhancement Using Nonnegative Matrix Factorization With The Constrained Speech Spectrum

Posted on: 2020-02-29
Degree: Master
Type: Thesis
Country: China
Candidate: Z G Bai
GTID: 2428330623956594
Subject: Electronic Science and Technology
Abstract:
Speech enhancement based on nonnegative matrix factorization (NMF) is currently an effective technique for suppressing nonstationary noise. It represents the spectral subspaces of speech and noise with nonnegative basis matrices. The noisy speech spectrum vector is decomposed over these basis matrices to obtain the corresponding encoding vectors of speech and noise, and the estimated speech and noise spectral components are then used to achieve speech enhancement. However, this method has two drawbacks. First, when the subspaces spanned by the speech and noise basis matrices overlap, as they often do in practice, the method confuses the speech and noise sources. Second, a matched noise basis matrix must be trained, yet the type and characteristics of the background noise cannot always be known in advance. This thesis proposes three improved methods to address these two drawbacks.

First, a speech enhancement method based on codebook-constrained NMF is proposed. In the training phase, a speech codebook is trained to model the magnitude spectrum of clean speech. In the enhancement phase, the magnitude spectrum of the noise is estimated first. The estimated noise magnitude spectrum and each entry in the codebook are then used to construct candidate basis matrices for the noisy speech. The magnitude spectrum of the noisy speech is decomposed over each constructed basis matrix, the optimal basis matrix and the optimal decomposition are selected, and the estimated speech and noise components are obtained. Finally, these components are used to construct a filter that performs the enhancement. In this method, the basis matrix used in the enhancement phase is constructed from a speech codebook entry and the estimated noise spectrum, without training separate speech and noise bases, and the noise spectrum is estimated online. As a result, both the source confusion problem and the mismatch of the noise basis matrix are mitigated.

Second, a method is proposed that uses a deep neural network (DNN) to predict the NMF-based Wiener filter for speech enhancement. As a masking-based training target, the NMF-based Wiener filter is well suited to parameter estimation, and predicting it directly reduces the intermediate error of the enhancement process. In addition, the features of the noisy speech are extracted with the NMF algorithm and normalized to zero mean and unit variance to obtain more discriminative input features. Thanks to its modeling capability, the DNN learns a nonlinear mapping from the noisy speech features to the NMF-based Wiener filter, which addresses the source confusion problem.

Finally, an NMF-based speech enhancement method is proposed in which the noise basis matrix is updated online. First, a non-speech frame decision module identifies the non-speech regions of the noisy signal. Then, a fixed-length sliding window covers the most recent frames classified as non-speech, and the magnitude spectra of these frames are used to update the noise basis matrix online. The updated noise basis matrix and the pre-trained speech basis matrix are then used together to perform the enhancement. This method obtains a matched noise basis matrix online and thereby addresses the mismatch of the noise basis matrix.
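
As a rough illustration of the baseline NMF enhancement pipeline described at the start of the abstract (decompose the noisy magnitude spectrum over speech and noise bases, then build a filter from the estimated components), the following Python/NumPy sketch may help. It is not the implementation from the thesis: the function names `nmf_activations` and `nmf_wiener_enhance`, the Euclidean-cost multiplicative update, the iteration count, and the magnitude-domain Wiener-style gain are all assumptions chosen for brevity.

```python
import numpy as np


def nmf_activations(V, W, n_iter=200, eps=1e-12, seed=0):
    """Estimate nonnegative activations H such that V ~= W @ H, with W held fixed.

    Assumption: multiplicative updates for the Euclidean (Frobenius) cost;
    other costs (e.g. Kullback-Leibler) are also common in NMF enhancement.
    """
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    WtV = W.T @ V
    WtW = W.T @ W
    for _ in range(n_iter):
        # The update keeps H nonnegative because every factor is nonnegative.
        H *= WtV / (WtW @ H + eps)
    return H


def nmf_wiener_enhance(V_noisy, W_speech, W_noise, n_iter=200, eps=1e-12):
    """Enhance a noisy magnitude spectrogram using fixed speech and noise bases.

    V_noisy:  (n_freq, n_frames) nonnegative magnitude spectrogram
    W_speech: (n_freq, k_s) nonnegative speech basis matrix
    W_noise:  (n_freq, k_n) nonnegative noise basis matrix
    Returns the enhanced magnitudes and the gain (mask) applied to V_noisy.
    """
    W = np.hstack([W_speech, W_noise])           # joint basis [speech | noise]
    H = nmf_activations(V_noisy, W, n_iter, eps)
    k_s = W_speech.shape[1]
    S_hat = W_speech @ H[:k_s]                   # estimated speech component
    N_hat = W_noise @ H[k_s:]                    # estimated noise component
    # Assumed Wiener-style gain computed on magnitude estimates.
    mask = S_hat / (S_hat + N_hat + eps)
    return mask * V_noisy, mask


# Toy usage with random nonnegative data standing in for real spectrograms.
rng = np.random.default_rng(1)
W_s, W_n = rng.random((16, 4)), rng.random((16, 3))
V = W_s @ rng.random((4, 10)) + W_n @ rng.random((3, 10))
enhanced, mask = nmf_wiener_enhance(V, W_s, W_n)
```

Because the speech and noise activations are estimated jointly, any overlap between the two bases lets energy leak from one source estimate into the other, which is the source confusion problem the abstract refers to.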
Keywords: Speech enhancement, nonnegative matrix factorization, a priori codebook, deep neural networks, Wiener filter