Research On Deep Learning Speech Enhancement Algorithms That Effectively Improve Speech Intelligibility

Posted on:2021-09-26

Degree:Master

Type:Thesis

Country:China

Candidate:H B Fang

Full Text:PDF

GTID:2518306110995229

Subject:Computer technology

Abstract/Summary:

Deep neural network (DNN) based mapping and classification architectures for speech enhancement have achieved substantial improvements, but there is still room for further improvement. We therefore first improve the cost function used to optimize the training stage of DNN-based speech enhancement methods, and propose a deep learning speech enhancement algorithm based on a perceptually related cost function, which effectively reduces the mismatch between the training cost function and human auditory perception. Next, by analyzing the architectures of conventional and DNN-based speech enhancement algorithms and combining their complementary advantages, we propose a DNN-based suppression gain estimation method for speech enhancement that further improves intelligibility.

First, supervised learning methods for speech enhancement based on different cost functions are studied. The mean squared error (MSE) cost function between the network output and the training target differs from evaluation criteria based on human auditory perception, so optimizing the network model with the MSE cost function does not guarantee that speech intelligibility will improve. Frequency-weighted segmental SNR (fwSNRseg), by contrast, is an objective measure of speech intelligibility that reflects human auditory perception. By incorporating this evaluation criterion into the training of the network parameters, this thesis proposes a deep learning speech enhancement algorithm based on a perceptually related cost function. Systematic objective evaluations show that, compared with the DNN method based on the MSE cost function, the proposed method further improves the short-time objective intelligibility (STOI) score of the test speech without degrading speech quality, across a wide range of noise types and signal-to-noise ratios.

Next, we examine several DNN-based speech enhancement algorithms. Although the mapping-based end-to-end regression DNN model for speech enhancement can effectively remove the noise component, the speech distortion it introduces is more serious. Estimation of the suppression gain plays an important role in the conventional single-channel speech enhancement architecture. By combining DNN methods with the conventional single-channel speech enhancement framework, we propose a speech enhancement algorithm in which a single DNN estimates the suppression gain. In addition, the input of the DNN is expanded with a causal context to allow real-time signal processing. The suppression gain of each frequency bin is multiplied by the corresponding bin of the noisy magnitude spectrum to obtain the enhanced magnitude spectrum. On this basis, a structured DNN-based suppression gain estimation method is proposed, in which multiple DNNs estimate the clean speech magnitude spectrum, the speech presence probability, and the suppression gain respectively, and explicit noise variance estimation is introduced. All of these DNN methods are trained with the perceptually related cost function described above. Finally, a comparison of the evaluation results shows that incorporating DNN methods into a statistical noise suppression system and replacing certain estimators of the system yields better STOI results than employing a simple regression DNN to estimate the clean speech directly.
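The following NumPy sketch illustrates three ideas mentioned in the abstract: a causal input context, per-bin application of a suppression gain, and a frequency-weighted segmental SNR score. It is not the thesis's implementation. The function names `causal_context`, `apply_gain`, and `fw_snr_seg` are hypothetical, clipping the gain to [0, 1] is an assumption, and the score is a simplified per-bin variant of fwSNRseg (the usual measure works on critical bands) with an exponent of 0.2 and a clipping range of -10 to 35 dB assumed from common practice rather than taken from the thesis.

```python
import numpy as np

def causal_context(frames, n_past):
    """Stack each frame with its n_past predecessors (no future frames).

    frames: array of shape (n_frames, n_bins).
    Returns an array of shape (n_frames, (n_past + 1) * n_bins).
    """
    n_frames = frames.shape[0]
    # Assumption: repeat the first frame so early frames get a full context.
    pad = np.repeat(frames[:1], n_past, axis=0)
    padded = np.concatenate([pad, frames], axis=0)
    return np.stack(
        [padded[t:t + n_past + 1].reshape(-1) for t in range(n_frames)]
    )

def apply_gain(noisy_mag, gain):
    """Multiply each frequency bin of the noisy magnitude by its gain."""
    # Assumption: the suppression gain is restricted to [0, 1].
    return np.clip(gain, 0.0, 1.0) * noisy_mag

def fw_snr_seg(clean_mag, enhanced_mag, gamma=0.2, eps=1e-8,
               lo=-10.0, hi=35.0):
    """Simplified frequency-weighted segmental SNR in dB (per bin)."""
    # Weight each bin by the clean magnitude raised to gamma.
    weights = clean_mag ** gamma
    err = (clean_mag - enhanced_mag) ** 2
    snr_db = 10.0 * np.log10((clean_mag ** 2 + eps) / (err + eps))
    snr_db = np.clip(snr_db, lo, hi)
    # Weighted average over bins, then plain average over frames.
    per_frame = (weights * snr_db).sum(axis=1) / (weights.sum(axis=1) + eps)
    return per_frame.mean()

# Toy usage with synthetic magnitudes (not real speech data).
rng = np.random.default_rng(0)
clean = rng.uniform(0.1, 1.0, size=(5, 8))
noisy = clean + rng.uniform(0.0, 0.5, size=(5, 8))
features = causal_context(noisy, n_past=2)      # input a DNN would see
enhanced = apply_gain(noisy, clean / noisy)     # oracle gain, for illustration
score_noisy = fw_snr_seg(clean, noisy)
score_enhanced = fw_snr_seg(clean, enhanced)
```

Used as a training cost, such a score would be negated so that minimizing the loss maximizes the weighted SNR, and it would be written in an automatic-differentiation framework rather than plain NumPy. The oracle gain `clean / noisy` stands in for the DNN output only to show the data flow.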

Keywords/Search Tags:

Deep Neural Networks, Human Auditory Perception, Cost Function, Suppression Gain, Speech Enhancement

Related items

1	Speech Enhancement Under Condition Of Low SNR
2	Speech Enhancement Based On Auditory Masking And Auditory Wavelet Packet Decomposition
3	Speech Separation Research Based On Human Auditory Characteristics
4	Research On Deep Learning Based Speech Dereverberation Method
5	Speech Enhancement Based On Deep Neural Network And Recurrent Neural Network
6	Speech Enhancement Of Dual Channel DNN Hearing Aid Based On Human Hearing
7	On Stacked And Deep Neural Netword With The Applaction Of Speech Separation
8	Speech Enhancement Based On Masking Properties Of The Human Auditory System
9	The Research Of Speech Enhancement Algorithms Based On Spectral Estimation Statistical Model
10	Research On Speech Enhancement Method Based On Deep Learning Neural Networks