
Research on Speech Enhancement Algorithms Based on Mask Estimation

Posted on: 2021-05-04
Degree: Master
Type: Thesis
Country: China
Candidate: J Q Jiang
Full Text: PDF
GTID: 2428330623968344
Subject: Engineering
Abstract/Summary:
Speech is one of the indispensable signals of daily life, and numerous technologies and applications take the speech signal as their processing object. Speech enhancement is among the most actively studied speech signal processing technologies and is widely used in speech coding, speech recognition, hearing assistance, and military communication. With the development of artificial intelligence, machine learning has gradually been applied to speech enhancement. Compared with traditional speech enhancement algorithms, incorporating machine learning improves performance, but it also raises new problems, including the choice of machine learning model, the choice of speech signal features, and how the model output is used. To address these problems, this thesis studies speech enhancement algorithms built on three different machine learning models. All three build on array signal processing: the model estimates masks from features of the speech signal, and the masks yield more accurate estimates of the beamforming parameters, which in turn improves speech enhancement performance. The specific contents are as follows.

1) A speech enhancement algorithm based on the support vector machine (SVM) is studied, and a non-normalized weighted frequency fusion method is proposed. The thesis examines normalized weighted fusion over specific frequency bands using the estimated frequency masks of the speech signal, and proposes a non-normalized weighted fusion method to improve the accuracy of broadband direction-of-arrival (DOA) estimation and the speech enhancement performance. The two methods are analyzed theoretically and compared in simulation with a traditional broadband DOA estimation algorithm. The simulations verify the effectiveness of the proposed method and show that neither method is robust to array errors.

2) A speech enhancement algorithm based on the complex Gaussian mixture model (CGMM) is investigated. The thesis studies how the time-frequency masks of the speech signal are estimated with the CGMM and how the steering vectors and covariance matrices are then estimated from those masks. The algorithm is simulated and compared with the SVM-based speech enhancement algorithm and with the speech enhancement algorithm based on traditional broadband DOA estimation. The simulations verify that this algorithm is robust to array errors.

3) A speech enhancement algorithm based on the convolutional neural network (CNN) is studied, and a method that estimates beamforming parameters from time-frequency binary masks is proposed. The thesis investigates estimating the steering vectors and covariance matrices from speech presence probability masks, and proposes a method that uses the speech presence probability masks to estimate time-frequency binary masks and, from them, the beamforming parameters. The two methods are analyzed theoretically and compared in simulation with the SVM-based and CGMM-based speech enhancement algorithms. The simulations verify the effectiveness of the proposed method and the robustness of both methods to array errors.
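The mask-to-beamformer chain that the abstract describes can be illustrated with a minimal NumPy sketch. It is not the thesis's implementation: the function names, the 0.5 binarization threshold, the principal-eigenvector steering estimate, the MVDR beamformer, and the diagonal loading are all assumptions chosen for illustration, and the synthetic data in `demo` stands in for a real multichannel STFT and a model-estimated mask.

```python
import numpy as np

def binarize_mask(spp, threshold=0.5):
    """Turn a speech presence probability mask (F, T) into a binary mask.
    The fixed threshold is an assumption, not the thesis's rule."""
    return (spp >= threshold).astype(float)

def masked_covariance(Y, mask):
    """Mask-weighted spatial covariance per frequency bin.
    Y: (F, T, M) complex STFT, mask: (F, T). Returns (F, M, M)."""
    weighted = Y * mask[..., None]
    num = np.einsum("ftm,ftn->fmn", weighted, Y.conj())
    den = np.maximum(mask.sum(axis=1), 1e-10)  # avoid division by zero
    return num / den[:, None, None]

def steering_vector(phi_speech):
    """Assumed estimator: principal eigenvector of the speech covariance."""
    _, vecs = np.linalg.eigh(phi_speech)  # eigenvalues in ascending order
    return vecs[..., :, -1]               # (F, M)

def mvdr_weights(phi_noise, d, diag_load=1e-6):
    """MVDR weights w = Phi_n^{-1} d / (d^H Phi_n^{-1} d), per frequency."""
    M = d.shape[-1]
    phi = phi_noise + diag_load * np.eye(M)  # diagonal loading for stability
    num = np.linalg.solve(phi, d[..., None])[..., 0]
    den = np.einsum("fm,fm->f", d.conj(), num)
    return num / den[:, None]

def demo(seed=0):
    """Synthetic check: one source active in the first half of the frames."""
    rng = np.random.default_rng(seed)
    F, T, M = 4, 200, 4
    d_true = np.exp(-1j * rng.uniform(0, 2 * np.pi, size=(F, M)))
    active = np.zeros(T)
    active[: T // 2] = 1.0
    s = (rng.standard_normal((F, T)) + 1j * rng.standard_normal((F, T))) * active
    noise = 0.1 * (rng.standard_normal((F, T, M))
                   + 1j * rng.standard_normal((F, T, M)))
    Y = s[..., None] * d_true[:, None, :] + noise
    spp = np.tile(0.1 + 0.8 * active, (F, 1))  # stand-in for a model output
    speech_mask = binarize_mask(spp)
    noise_mask = 1.0 - speech_mask
    phi_s = masked_covariance(Y, speech_mask)
    phi_n = masked_covariance(Y, noise_mask)
    d_est = steering_vector(phi_s)
    w = mvdr_weights(phi_n, d_est)
    enhanced = np.einsum("fm,ftm->ft", w.conj(), Y)  # w^H y per bin and frame
    return w, d_est, d_true, enhanced
```

Calling `demo()` returns the beamformer weights, the estimated and true steering vectors, and the enhanced single-channel spectrogram. Because the steering vector and covariance matrices come from the masked data rather than from an assumed array geometry, a sketch like this does not depend on a calibrated array model, which is consistent with the robustness to array errors that the abstract reports for the mask-based CGMM and CNN approaches.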
Keywords/Search Tags: Speech enhancement, Mask estimation, Machine learning, Array signal processing