Speech Separation Based On Microphone Array And Deep Learning

Posted on:2021-08-27

Degree:Master

Type:Thesis

Country:China

Candidate:L Y Chen

Full Text:PDF

GTID:2518306476950249

Subject:Signal and Information Processing

Abstract/Summary:

As the front end of a speech signal processing system, speech separation directly affects the performance of the whole system. The performance of traditional speech separation algorithms degrades severely in environments with high reverberation and a low signal-to-noise ratio (SNR). Based on computational auditory scene analysis (CASA), this dissertation combines the spatial information of a microphone array with deep neural networks to propose two speech separation algorithms: a DNN microphone array speech separation algorithm based on an improved Steered Response Power Phase Transform (SRP-PHAT) feature, and a microphone array speech separation algorithm based on a Temporal Convolution Residual Neural Network (TC-ResNet).

(1) DNN microphone array speech separation algorithm based on improved SRP-PHAT. The algorithm uses the spatial features of the microphone array to achieve speech separation. Existing separation algorithms mainly address binaural speech separation, which can only separate speech arriving from the front. The proposed algorithm combines the SRP-PHAT spatial feature of the microphone array with the Gammatone filter bank, which models human auditory filtering, to form an improved SRP-PHAT feature for multi-speaker speech separation. A DNN model is trained on these spatial features, with 36 azimuths as the training targets. The test environments include noise and reverberation. Simulation results show that the algorithm achieves omnidirectional speech separation and still performs well in low-SNR, high-reverberation environments.

(2) Microphone array speech separation algorithm based on TC-ResNet. A convolutional neural network (CNN) is used to exploit the temporal structure of the speech signal. Temporal convolution and residual blocks are added to the network. Temporal convolution not only enlarges the receptive field of the lower convolutional layers but also significantly reduces the network's computational cost. Residual blocks can be used to combine features at different resolutions. Simulation results show that the TC-ResNet-based microphone array speech separation algorithm performs better in low-SNR, high-reverberation environments.
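To make the spatial feature in algorithm (1) concrete, the following is a minimal sketch of a conventional SRP-PHAT scan over 36 candidate azimuths. It is not the dissertation's improved feature: the Gammatone filter bank stage and the DNN are omitted. The function names, the far-field plane-wave model, the 10° azimuth spacing, the four-microphone circular array, and the `simulate_far_field` helper are all assumptions made for illustration.

```python
import numpy as np


def srp_phat(frames, mic_xy, fs, n_az=36, c=343.0):
    """Conventional SRP-PHAT power for n_az evenly spaced candidate azimuths.

    frames: (n_mics, n_samples) time-domain signals.
    mic_xy: (n_mics, 2) microphone positions in metres (assumed planar array).
    """
    n_mics, n = frames.shape
    spec = np.fft.rfft(frames, axis=1)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    az = np.arange(n_az) * 2.0 * np.pi / n_az
    # Unit vectors pointing from the array centre towards each candidate source.
    dirs = np.stack([np.cos(az), np.sin(az)], axis=1)
    # Far-field arrival time at each microphone relative to the array centre.
    tau = -(mic_xy @ dirs.T) / c  # shape (n_mics, n_az)
    power = np.zeros(n_az)
    for i in range(n_mics):
        for j in range(i + 1, n_mics):
            cross = spec[i] * np.conj(spec[j])
            cross = cross / (np.abs(cross) + 1e-12)  # PHAT weighting: keep phase only
            tdoa = tau[i] - tau[j]  # expected delay difference per candidate azimuth
            steer = np.exp(2j * np.pi * freqs[None, :] * tdoa[:, None])
            power += np.real(steer @ cross)  # steered response summed over frequency
    return az, power


def simulate_far_field(source, mic_xy, fs, azimuth, c=343.0):
    """Hypothetical helper: delay `source` at each microphone for a plane wave.

    Delays are applied in the frequency domain, so they are circular.
    """
    n = source.size
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    u = np.array([np.cos(azimuth), np.sin(azimuth)])
    tau = -(mic_xy @ u) / c
    shifted = np.fft.rfft(source)[None, :] * np.exp(
        -2j * np.pi * freqs[None, :] * tau[:, None]
    )
    return np.fft.irfft(shifted, n=n, axis=1)


# Assumed setup: four microphones on a circle of radius 5 cm, 16 kHz sampling.
fs = 16000
mic_angles = np.arange(4) * np.pi / 2.0
mic_xy = 0.05 * np.stack([np.cos(mic_angles), np.sin(mic_angles)], axis=1)
rng = np.random.default_rng(0)
source = rng.standard_normal(1024)
frames = simulate_far_field(source, mic_xy, fs, azimuth=np.deg2rad(90.0))
az, power = srp_phat(frames, mic_xy, fs)
print("estimated azimuth (degrees):", np.rad2deg(az[np.argmax(power)]))
```

In a pipeline like the one the abstract describes, a per-azimuth power vector of this kind would be computed per frame (and, in the improved feature, per Gammatone sub-band) and passed to the DNN, whose 36 outputs correspond to the candidate azimuths.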

Keywords/Search Tags:

Deep Neural Network, Speech Separation, Computational Auditory Scene Analysis, Spatial Features, Convolutional Neural Network

Related items

1	Speech Separation Research Based On Human Auditory Characteristics
2	Binaural Speech Separation Research Based On Deep Learning
3	Segregation Of Reverberant Speech Based On Computational Auditory Scene Analysis And Deep Neural Network
4	Binaural Speech Separation Research Based On Deep Learning
5	Research And Verification Of Monaural Speech Segregation Based On Computational Auditory Scene Analysis And Deep Neural Network
6	On Stacked And Deep Neural Network With The Application Of Speech Separation
7	Monophonic Speech Separation Based On Computational Auditory Scene Analysis
8	The Blind Separation Of Monaural Speech Based On Computational Auditory Scene Analysis
9	The Research Of Speech Separation Based On Computational Auditory Scene Analysis
10	Method And Implementation Of Monophonic Double Speech Separation Based On Auditory Scene Analysis