With the continued development of science and technology, voice is the most convenient and efficient means of communication between people and machines, and between people. However, the complex noise environments of everyday life greatly interfere with voice communication. An effective speech enhancement system therefore lets listeners receive the target speech with high quality and high intelligibility. This paper combines the idea of the traditional MMSE-LSA speech enhancement algorithm with a deep learning-based a priori SNR estimator to propose a temporal convolutional network single-channel speech enhancement algorithm based on deep learning. It further discusses the feasibility of restoring the speech damaged during the enhancement process and proposes a single-channel speech restoration algorithm based on deep learning. The specific work is as follows.

First, this paper improves on the traditional speech enhancement method based on the MMSE-LSA gain function and introduces noise spectrum perception. The algorithm estimates the a priori SNR of the clean speech signal, uses it to modify the magnitude spectrum of the current noisy speech while retaining the phase information, and finally outputs the enhanced speech. However, the traditional algorithm estimates the a priori SNR with the maximum likelihood method and the decision-directed method. At a low SNR such as 0 dB, the speech is submerged in the noise, so the a priori SNR estimate is inaccurate and a certain amount of residual noise remains. To address this issue, this paper proposes a deep learning-based a priori SNR estimation model. Training the ResLSTM temporal convolutional neural network on many speech data sets with different signal-to-noise ratios allows it to accurately estimate the a priori SNR of the current frame. Experimental results show that the temporal convolutional network speech
enhancement algorithm based on deep learning designed in this paper leaves less residual noise than the traditional speech enhancement algorithm at low signal-to-noise ratios, and its denoising effect is good.

Second, in practical use of the speech enhancement algorithm designed in this paper, it was found that although the denoising effect is good at low signal-to-noise ratios, the original speech information is inevitably damaged during the enhancement process precisely because the SNR is low. To address this, this paper proposes a speech restoration algorithm based on a perceptual feature loss. A speech perceptual feature extraction network extracts feature vectors, at different time resolutions, from both the restored speech (the output of the speech restoration network) and the clean speech. The feature loss between them is then computed and used to continuously update the weight parameters of the speech restoration network. Experimental results show that the speech restoration network designed in this paper can restore part of the speech damaged in the speech enhancement process when the SNR is low.

Finally, a complete speech enhancement system is formed by combining the temporal convolutional network speech enhancement algorithm based on deep learning with the speech restoration algorithm designed in this paper. The system can effectively remove noise at a low signal-to-noise ratio of 0 dB while retaining the integrity of the original speech signal.
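
To make the traditional baseline concrete, the following is a minimal sketch of the MMSE-LSA gain driven by a decision-directed a priori SNR estimate, applied to magnitude-spectrum frames. It is an illustration under stated assumptions, not the implementation used in this paper: the noise power spectrum is assumed to be known, the smoothing factor `alpha = 0.98` and the SNR floor of -25 dB are commonly used values rather than values taken from this work, and the function names (`exp1`, `mmse_lsa_gain`, `enhance_frames`) are hypothetical. In the proposed method, the decision-directed line would be replaced by the output of the trained network, which is not reproduced here.

```python
import math

EULER_GAMMA = 0.5772156649015329


def exp1(x: float) -> float:
    """Exponential integral E1(x) for x > 0."""
    if x <= 0.0:
        raise ValueError("exp1 is only defined here for x > 0")
    if x <= 1.0:
        # Power series: E1(x) = -gamma - ln(x) - sum_{k>=1} (-x)^k / (k * k!)
        total = -EULER_GAMMA - math.log(x)
        term = 1.0
        for k in range(1, 100):
            term *= -x / k  # term == (-x)**k / k!
            total -= term / k
            if abs(term) < 1e-17:
                break
        return total
    # Continued fraction evaluated with the modified Lentz method.
    b = x + 1.0
    c = 1e300
    d = 1.0 / b
    h = d
    for i in range(1, 500):
        a = -float(i * i)
        b += 2.0
        d = 1.0 / (a * d + b)
        c = b + a / c
        delta = c * d
        h *= delta
        if abs(delta - 1.0) < 1e-14:
            break
    return h * math.exp(-x)


def mmse_lsa_gain(xi: float, gamma: float) -> float:
    """MMSE-LSA gain for a priori SNR xi and a posteriori SNR gamma (linear scale)."""
    v = max(xi * gamma / (1.0 + xi), 1e-10)  # clamp to keep E1 finite
    return xi / (1.0 + xi) * math.exp(0.5 * exp1(v))


def enhance_frames(noisy_mag, noise_psd, alpha=0.98, xi_min_db=-25.0):
    """Enhance magnitude frames; noise_psd (assumed known) holds one value per bin."""
    xi_min = 10.0 ** (xi_min_db / 10.0)
    prev_clean_power = [0.0] * len(noise_psd)
    enhanced = []
    for frame in noisy_mag:
        out = []
        for k, (mag, lam) in enumerate(zip(frame, noise_psd)):
            gamma = mag * mag / lam  # a posteriori SNR
            # Decision-directed a priori SNR estimate (the part a learned
            # estimator would replace in the proposed approach).
            xi = alpha * prev_clean_power[k] / lam
            xi += (1.0 - alpha) * max(gamma - 1.0, 0.0)
            xi = max(xi, xi_min)
            s_hat = mmse_lsa_gain(xi, gamma) * mag
            prev_clean_power[k] = s_hat * s_hat
            out.append(s_hat)
        enhanced.append(out)
    return enhanced


if __name__ == "__main__":
    # Two frames, two frequency bins: bin 0 is speech-dominated, bin 1 is noise-like.
    frames = [[10.0, 1.0], [10.0, 1.0]]
    print(enhance_frames(frames, noise_psd=[1.0, 1.0]))
```

Only the magnitude is modified in this sketch. To obtain a waveform, the enhanced magnitude would be recombined with the phase of the noisy spectrum before the inverse transform, consistent with the phase-retaining scheme described above.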