
Research On Structure And Parameter Optimization Of Deep Neural Network For Speech Enhancement

Posted on: 2020-01-08
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Zhan
GTID: 2428330575456537
Subject: Electronic and communication engineering
Abstract/Summary:
Speech enhancement sits at the front end of the speech signal processing chain and plays an important role in speech processing. Its purpose is to remove noise from noisy speech, recover speech that is as clean as possible, and improve both speech quality and intelligibility. Because of its nonlinear mapping capability, a deep learning network for speech enhancement can learn the nonlinear relationship between noisy and clean speech directly, without additional assumptions about the signal model, so it can be applied to more complex situations. In recent years, speech enhancement based on deep learning has become a hot topic in the field. The configuration of a deep neural network is very important to its performance on speech enhancement and can even play a decisive role.

This paper studies the structure and the influential parameters of deep neural networks for speech enhancement. From the perspective of engineering practice, it analyzes the basic characteristics of the speech signal and the main goals of speech enhancement. Through detailed analysis and a large number of engineering experiments, we study and summarize the effects of network structure and parameter configuration on speech enhancement.

Firstly, this paper studies the influence of network parameters. We systematically analyze influencing factors such as the volume of training data, network depth, network width, activation function, loss function, and generalization techniques. We propose a deep neural network parameter configuration scheme suited to solving the speech enhancement problem by regression. A large number of comparative experiments show that the proposed parameter configuration scheme effectively improves speech enhancement.

Secondly, this paper studies the influence of network structure. We analyze the characteristics of three basic network structures and their application to speech enhancement separately. Through many experiments, we study how to select the shape of the CNN, the size and direction of its convolution kernels, and the performance of the RNN and its variant structures. Then, based on these experiments and analysis, and taking the characteristics of the speech signal into account, we integrate the characteristics of the different network structures and propose a C-RNN network structure. Experiments show that this network removes noise more strongly than networks with a single structure, especially at low signal-to-noise ratios.

Finally, we propose a novel single-channel speech enhancement method that joins a DNN and Log-MMSE into a single network, named LMMSE-DNN. The noisy speech is first denoised by the MMSE-log-STSA algorithm, and the pre-denoised speech is then optimized by the deep learning network to further improve speech intelligibility. Experiments show that the auditory quality of speech from the LMMSE-DNN network is higher than that from MMSE-log-STSA alone or a DNN alone.

This paper studies the performance of deep neural networks for speech enhancement from the perspective of engineering practice. Our results on parameter selection and network structure are intended to help other researchers improve their research efficiency.
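The abstract does not state the formulas behind the LMMSE-DNN pipeline. As a rough sketch only, assuming that "MMSE-log-STSA" refers to the standard log-spectral amplitude estimator of Ephraim and Malah, the first stage could be written as below. The symbols $\xi_k$ (a priori SNR), $\gamma_k$ (a posteriori SNR), $Y_k$ (noisy spectral coefficient in frequency bin $k$), and the DNN mapping $f_\theta$ are notation introduced here for illustration and are not taken from the thesis.

```latex
% Stage 1 (assumed): log-MMSE spectral gain applied to the noisy magnitude
v_k = \frac{\xi_k}{1+\xi_k}\,\gamma_k, \qquad
G(\xi_k,\gamma_k) = \frac{\xi_k}{1+\xi_k}
  \exp\!\left(\frac{1}{2}\int_{v_k}^{\infty}\frac{e^{-t}}{t}\,dt\right), \qquad
\hat{A}_k = G(\xi_k,\gamma_k)\,\lvert Y_k\rvert

% Stage 2 (hypothetical notation): the DNN refines the pre-denoised features
\hat{X} = f_\theta\big(\hat{A}\big)
```

Under this reading, the DNN receives features that have already been denoised by the gain $G$, rather than the raw noisy spectrum. How the thesis extracts features from $\hat{A}$ and trains $f_\theta$ is not specified in the abstract.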
Keywords/Search Tags:speech enhancement, deep learning, parameter configuration, DNN