Binaural Speech Separation Research Based On Deep Learning

Posted on: 2021-10-30

Degree: Master

Type: Thesis

Country: China

Candidate: Z Y Lin

GTID: 2518306476450264

Subject: Signal and Information Processing

Abstract/Summary:

Speech separation is widely used in speech signal processing and artificial intelligence systems. In real environments, traditional speech separation algorithms generalize poorly under low signal-to-noise ratio (SNR) and high reverberation. Drawing on the perceptual characteristics of human hearing, this thesis studies binaural speech separation with deep neural networks, using spatial and spectral features. It proposes two algorithms: a convolutional neural network (CNN) binaural speech separation method based on preceding and following frame information, and a deep clustering speech separation algorithm based on the spectrogram and spatial features.

(1) CNN binaural speech separation method based on preceding and following frame features. A Gammatone filter bank simulates the time-frequency analysis of the human ear and decomposes the original speech signal into time-frequency units. Binaural spatial feature parameters are extracted from each time-frequency unit: the cross-correlation function (CCF), the interaural time difference (ITD), and the interaural level difference (ILD). Previous speech separation algorithms used only the information of the current frame, whereas this thesis exploits the temporal continuity of speech. After the spatial features are extracted, the spatial cues of the two frames before and after the current frame are concatenated, and the resulting interaural spatial feature map is the input to the convolutional neural network. Separation is evaluated with the Source-to-Artifacts Ratio (SAR), Source-to-Interference Ratio (SIR), Source-to-Distortion Ratio (SDR), and Perceptual Evaluation of Speech Quality (PESQ). Simulation results show that at low SNR the algorithm performs significantly better than a deep neural network (DNN) based on the ideal binary mask (IBM).

(2) Deep clustering speech separation algorithm based on the spectrogram and spatial features. Because speech is correlated over time, a recurrent neural network (RNN) can model speech signals better. In this thesis, a bidirectional long short-term memory (BiLSTM) network is used as the encoder, with the log-magnitude spectrum of the speech signal and the interaural phase difference (IPD) as the input feature vector, and the time-frequency units are mapped to a high-dimensional space. At test time, the high-dimensional vectors are clustered with K-means to classify the time-frequency units, and the result is combined with the mixed speech to reconstruct the target signal. Experimental results show that the deep clustering algorithm makes full use of spectral and spatial information. Compared with the CNN-based network, it significantly improves SAR, SIR, and SDR and achieves good separation performance...
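
The binaural cues named in part (1) can be illustrated with a minimal sketch. The Python below is not the thesis implementation: it skips the Gammatone filter bank and works on a single frame per ear, and the function names, the lag range, the sign convention, the edge clamping, and the context width of 2 are all assumptions made for illustration.

```python
import math


def cross_correlation(left, right, max_lag):
    """Normalized CCF of two equal-length frames for lags -max_lag..max_lag.

    Assumed convention: a positive lag means the right-ear frame is a
    delayed copy of the left-ear frame.
    """
    n = len(left)
    norm = math.sqrt(sum(x * x for x in left) * sum(y * y for y in right))
    ccf = []
    for lag in range(-max_lag, max_lag + 1):
        acc = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                acc += left[i] * right[j]
        ccf.append(acc / norm if norm > 0 else 0.0)
    return ccf


def itd_samples(ccf, max_lag):
    """ITD estimate in samples: the lag at which the CCF peaks."""
    peak_index = max(range(len(ccf)), key=lambda k: ccf[k])
    return peak_index - max_lag


def ild_db(left, right, eps=1e-12):
    """ILD in dB: ratio of left-ear energy to right-ear energy."""
    energy_left = sum(x * x for x in left)
    energy_right = sum(y * y for y in right)
    return 10.0 * math.log10((energy_left + eps) / (energy_right + eps))


def stack_context(frames, t, context=2):
    """Stack the feature vectors of frames t-context..t+context.

    Frames beyond the utterance edges are replaced by the nearest valid
    frame (an assumed padding rule). The result is a small 2-D map with
    2*context+1 rows that a CNN could take as input.
    """
    last = len(frames) - 1
    return [frames[min(max(t + d, 0), last)] for d in range(-context, context + 1)]


# Toy frame: the right ear hears the left-ear signal 3 samples later
# and at half the amplitude.
left = [0.0] * 64
left[10], left[11], left[12] = 1.0, 0.5, -0.3
right = [0.0] * 3 + [0.5 * x for x in left[:-3]]

max_lag = 8
ccf = cross_correlation(left, right, max_lag)
itd = itd_samples(ccf, max_lag)
ild = ild_db(left, right)

# One feature vector per frame: CCF values followed by ITD and ILD.
features = [ccf + [float(itd), ild]] * 4  # four identical toy frames
feature_map = stack_context(features, t=0, context=2)
```

In this toy case the CCF peaks at a lag of 3 samples and the ILD is about 6 dB, matching the delay and the halved amplitude. A real system would compute these cues per Gammatone channel and per frame before stacking.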

Keywords/Search Tags:

Binaural Speech Separation, Convolutional Neural Network, Deep Cluster, Computational Auditory Scene Analysis

Related items

1	Binaural Speech Separation Research Based On Deep Learning
2	Speech Separation Research Based On Human Auditory Characteristics
3	Research On Speech Enhancement Based On Computational Auditory Scene Analysis
4	Speech Separation Based On Microphone Array And Deep Learning
5	Monophonic Speech Separation Based On Computational Auditory Scene Analysis
6	The Blind Separation Of Monaural Speech Based On Computational Auditory Scene Analysis
7	Segregation Of Reverberant Speech Based On Computational Auditory Scene Analysis And Deep Neural Network
8	The Research Of Speech Separation Based On Computational Auditory Scene Analysis
9	Method And Implementation Of Monophonic Double Speech Separation Based On Auditory Scene Analysis
10	Computational Auditory Model And Deep Neural Network Based Binaural Speech Segregation