
Research On Speech Separation Based On Deep Neural Network

Posted on: 2020-06-05
Degree: Master
Type: Thesis
Country: China
Candidate: H Zhao
Full Text: PDF
GTID: 2438330626464268
Subject: Computer technology

Abstract/Summary:
Speech is the most important medium of human communication, but in real life speech signals are corrupted by background sounds, which degrades the quality of the speech information and hinders communication. The ability to separate target speech from background sound is therefore essential. Speech separation has long been an important research direction in speech processing, and researchers have proposed a variety of speech separation methods over recent decades to improve the quality of the separated target speech. Early methods were very limited in their ability to mine non-linear structural information, so the performance of monaural speech separation remained unsatisfactory. In recent years, with the development of deep neural networks, using their multi-layer non-linear processing structure to mine structural information in data and automatically extract abstract feature representations has achieved good results in many research fields, so applying deep neural networks to the speech separation task is of great significance.

The main contents of this paper are as follows:

First, this paper uses a Recurrent Neural Network (RNN) to build a speech separation model. Because the recurrent neural network has strong learning ability, it has clear advantages for speech problems. Simulation experiments show that, compared with typical traditional speech separation methods (Robust Low-Rank Non-negative Matrix Factorization and Multiple Low-Rank Representation), this method has stronger separation ability and improves both the Global Signal-to-Distortion Ratio and the Global Signal-to-Interference Ratio.

Second, this paper proposes a speech separation model based on a Convolutional Neural Network (CNN) and an attention mechanism, using the high-dimensional amplitude spectrum of the mixed speech signal as input. Analysis of the characteristics of the two components shows that the convolutional neural network can effectively extract low-dimensional features and mine the spatio-temporal structural information in the speech signal, while the attention mechanism can reduce the loss of sequence information; combining the two effectively improves the accuracy of speech separation. Finally, simulation experiments on the representative MIR-1K data set verify the performance of the proposed method. Compared with the DRNN-2+discrim model, the proposed method achieves a 0.27 dB GNSDR gain and a 0.51 dB GSIR gain, showing that the speech separation method proposed in this paper achieves the expected experimental results.
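To make the second contribution concrete, the sketch below shows one way a CNN front end and a self-attention layer over time frames can be combined for mask-based single-channel separation on magnitude spectrograms. It is a minimal illustration, not the thesis implementation: the use of PyTorch, the layer sizes, the four-head attention, and the sigmoid mask output are all assumptions made for the example.

# Minimal sketch (assumed architecture, not the thesis code): a mask-estimation
# separator combining a small CNN front end with self-attention over time frames,
# operating on magnitude spectrograms of the mixed signal.
import torch
import torch.nn as nn

class CNNAttentionSeparator(nn.Module):
    def __init__(self, n_freq_bins=513, conv_channels=32, attn_dim=128):
        super().__init__()
        # CNN front end: extracts local time-frequency features from the spectrogram.
        self.encoder = nn.Sequential(
            nn.Conv2d(1, conv_channels, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(conv_channels, conv_channels, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        self.proj = nn.Linear(conv_channels * n_freq_bins, attn_dim)
        # Self-attention over time frames: every frame attends to the whole sequence,
        # which is the role the abstract assigns to the attention mechanism.
        self.attention = nn.MultiheadAttention(attn_dim, num_heads=4, batch_first=True)
        # Mask head: one sigmoid mask per source (here: target speech vs. background).
        self.mask_head = nn.Linear(attn_dim, 2 * n_freq_bins)

    def forward(self, mix_mag):
        # mix_mag: (batch, time, freq) magnitude spectrogram of the mixture.
        b, t, f = mix_mag.shape
        feats = self.encoder(mix_mag.unsqueeze(1))            # (b, c, t, f)
        feats = feats.permute(0, 2, 1, 3).reshape(b, t, -1)   # (b, t, c*f)
        h = self.proj(feats)
        h, _ = self.attention(h, h, h)                        # (b, t, attn_dim)
        masks = torch.sigmoid(self.mask_head(h))              # (b, t, 2*f)
        masks = masks.view(b, t, 2, f)
        # Estimated source magnitudes: masks applied to the mixture spectrogram.
        return masks * mix_mag.unsqueeze(2)

# Usage: separate a 100-frame mixture spectrogram with 513 frequency bins.
est = CNNAttentionSeparator()(torch.rand(1, 100, 513))
print(est.shape)  # torch.Size([1, 100, 2, 513])

In a full system the two estimated magnitude spectrograms would be recombined with the mixture phase and inverted with an inverse STFT before computing GNSDR and GSIR.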
Keywords/Search Tags: Single-Channel Speech Separation, Deep Neural Network, Attention Mechanism