Speech Enhancement Technique Based On Nonnegative Matrix Factorization And Time-frequency Masking Estimation

Posted on: 2020-10-05
Degree: Master
Type: Thesis
Country: China
Candidate: B F Yan
Full Text: PDF
GTID: 2428330623456387
Subject: Electronic and communication engineering
Abstract/Summary:
With growing demand for mobile phones and IP phones, the performance requirements for voice codecs have increased. However, background noise degrades the quality and intelligibility of speech. Speech enhancement is a core technology of voice codecs and determines their performance. This study therefore attempts to develop speech enhancement methods applicable to the Enhanced Voice Services (EVS) codec under low signal-to-noise ratio and non-stationary noise conditions. The work focuses on nonnegative matrix factorization (NMF) and time-frequency masking estimation methods and covers the following three aspects.

First, NMF-based methods leave considerable residual noise because the speech and noise basis matrices are estimated inaccurately, while time-frequency masking estimation methods perform poorly at high frequencies. To address both problems, a novel speech enhancement method that integrates NMF and time-frequency masking is proposed. The speech and noise basis matrices are obtained via NMF in the offline training stage, and a frequency-domain Wiener filter is constructed and transformed to the Gammatone domain. In the online enhancement stage, an integrated Wiener-like filter is obtained by combining the NMF and time-frequency masking estimates. Experiments show that the proposed method outperforms the reference methods.

Second, inaccurate estimation of the training target in the offline training phase easily leads to poor speech enhancement performance. A modified time-frequency masking method based on a deep neural network (DNN) is therefore proposed. In the offline training phase, the training target is optimized by constructing a new objective function for the DNN; the enhanced speech is then obtained in the online phase. The results demonstrate that the proposed approach effectively improves speech quality.

Finally, the thesis presents an engineering application to the EVS codec. The DNN-based speech enhancement method with improved time-frequency masking estimation is embedded in the EVS codec as a pre-processing module to improve output speech quality. The results show that the proposed approach yields better speech quality and intelligibility than the original EVS codec, which gives the method practical value for real applications.
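
The NMF part of the first contribution (basis matrices learned offline, a Wiener-like gain built online) can be illustrated with a minimal sketch. It is not the thesis's implementation: the Frobenius-cost multiplicative updates, the function names `nmf` and `wiener_like_gain`, the ranks, and the random "spectrograms" are all assumptions made for illustration, and the Gammatone-domain transform and the combination with the time-frequency mask are left out.

```python
import numpy as np

EPS = 1e-10  # guards against division by zero


def nmf(V, rank=None, W=None, n_iter=200, seed=0):
    """Approximate V ~= W @ H with nonnegative factors.

    Uses multiplicative updates for the Frobenius cost (an assumed choice).
    If W is passed in, it is kept fixed and only the activations H are
    estimated, which is the usual setup for the online stage.
    """
    rng = np.random.default_rng(seed)
    update_W = W is None
    if update_W:
        W = rng.random((V.shape[0], rank)) + EPS
    H = rng.random((W.shape[1], V.shape[1])) + EPS
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + EPS)
        if update_W:
            W *= (V @ H.T) / (W @ H @ H.T + EPS)
    return W, H


def wiener_like_gain(V_noisy, W_speech, W_noise, n_iter=200):
    """Estimate a per-bin gain in [0, 1] from pre-trained speech/noise bases."""
    W = np.hstack([W_speech, W_noise])        # concatenated, fixed dictionary
    _, H = nmf(V_noisy, W=W, n_iter=n_iter)   # activations for the noisy input
    k = W_speech.shape[1]
    speech_est = W_speech @ H[:k]             # speech part of the reconstruction
    noise_est = W_noise @ H[k:]               # noise part of the reconstruction
    return speech_est / (speech_est + noise_est + EPS)


# Offline stage (hypothetical data): random matrices stand in for the
# magnitude spectrograms of clean speech and noise, shape (freq_bins, frames).
rng = np.random.default_rng(1)
clean = rng.random((16, 40))
noise = rng.random((16, 40))
W_s, _ = nmf(clean, rank=4)
W_n, _ = nmf(noise, rank=4)

# Online stage: apply the gain to the noisy magnitude spectrogram.
noisy = clean + noise
gain = wiener_like_gain(noisy, W_s, W_n)
enhanced = gain * noisy
```

In a real system the inputs would be STFT magnitudes of recorded speech and noise, and the enhanced magnitude would be recombined with the noisy phase before the inverse transform. Residual noise in such a scheme depends directly on how well the two bases separate speech from noise, which is the weakness the abstract attributes to NMF-only methods.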
Keywords/Search Tags:Speech enhancement, Nonnegative matrix factorization, Time-frequency masking, Wiener filter, Enhanced Voice Services codec