Speech Separation Research Based On Human Auditory Characteristics

Posted on:2019-04-17

Degree:Master

Type:Thesis

Country:China

Candidate:H Y Zhu

Full Text:PDF

GTID:2428330596960605

Subject:Signal and information processing

Abstract/Summary:

Speech separation is the front end of a speech signal processing system, and its performance directly affects the performance of the entire system. Most speech separation algorithms aim to separate targets at fixed azimuths, and their performance degrades dramatically in reverberant and noisy environments. Based on the perceptual characteristics of the human auditory system, this thesis studies a robust binaural speech separation algorithm that uses binaural spatial features and spectral features. The proposed algorithms fall into two groups: a binaural speech separation algorithm based on a deep neural network and a binaural speech separation algorithm based on hybrid features.

(1) Binaural speech separation algorithm based on deep neural network. This thesis expands the basic unit of separation from the frequency bin to the sub-band and treats speech separation in the forward direction as a multi-class classification problem. The binaural spatial features extracted are the interaural time difference (ITD), the interaural level difference (ILD), and the cross-correlation function (CCF). A neural network with two hidden layers and a top-level softmax layer is trained. The probabilities are determined from the localization of the target speaker, and each time-frequency unit is attributed to the source with the larger probability, which realizes speech separation between any two azimuths in the forward direction. The Ideal Binary Mask (IBM) is smoothed to improve the perceptual quality of the separated speech. Source-to-Artifacts Ratio (SAR), Source-to-Distortion Ratio (SDR), Source-to-Interference Ratio (SIR), and Perceptual Evaluation of Speech Quality (PESQ) are used as evaluation metrics. Simulation results show that this algorithm outperforms the DUET (Degenerate Unmixing Estimation Technique) algorithm.

(2) Binaural speech separation algorithm based on hybrid features. Computational auditory scene analysis is used to combine spatial and spectral features for binaural speech separation. Speaker separation and speech enhancement are analyzed separately. In the spectral feature extraction module, the binaural signal is pre-processed with a beamformer to generate a mono signal, from which the feature parameters commonly used in monaural separation are extracted. The combined spectral and spatial features are then fed into a deep neural network for training, and simulation experiments are conducted in various environments. STOI (Short-Time Objective Intelligibility) and PESQ are used as evaluation metrics. The simulation results show that, for speech enhancement, combining spatial and spectral features improves the quality of speech separation and generalizes well across different reverberant environments.
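The spatial cues named in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the function names `binaural_cues` and `assign_units`, the frame-level processing, the lag range, and the sign convention (a negative ITD means the right-ear signal lags the left) are all assumptions made for illustration, and the sub-band filterbank and the neural network are omitted.

```python
import numpy as np

def binaural_cues(left, right, fs, max_lag):
    """Estimate ITD, ILD and the normalized CCF for one frame (hypothetical helper).

    left, right: 1-D arrays holding the same frame at each ear.
    fs: sampling rate in Hz. max_lag: largest lag (in samples) to search.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    # Full cross-correlation; zero lag sits at index len(right) - 1.
    full = np.correlate(left, right, mode="full")
    zero = len(right) - 1
    ccf = full[zero - max_lag: zero + max_lag + 1]
    # Normalize by the frame energies so the peak is at most 1.
    e_left = np.sum(left ** 2) + 1e-12
    e_right = np.sum(right ** 2) + 1e-12
    ccf = ccf / np.sqrt(e_left * e_right)
    lags = np.arange(-max_lag, max_lag + 1)
    # ITD: lag of the CCF peak, converted to seconds.
    itd = lags[np.argmax(ccf)] / fs
    # ILD: energy ratio between the ears, in dB.
    ild = 10.0 * np.log10(e_left / e_right)
    return itd, ild, ccf

def assign_units(prob_a, prob_b):
    """Binary mask: True where a time-frequency unit is attributed to source A,
    i.e. where source A has the larger (or equal) probability."""
    return np.asarray(prob_a) >= np.asarray(prob_b)
```

In a full system, cues like these would be computed per sub-band and per frame and passed to the classifier; `assign_units` only mirrors the "larger probability" rule described in the abstract.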

Keywords/Search Tags:

Deep neural network, speech separation, computational auditory scene analysis

Related items

1	Binaural Speech Separation Research Based On Deep Learning
2	Speech Separation Based On Microphone Array And Deep Learning
3	Monophonic Speech Separation Based On Computational Auditory Scene Analysis
4	The Blind Separation Of Monaural Speech Based On Computational Auditory Scene Analysis
5	Binaural Speech Separation Research Based On Deep Learning
6	The Research Of Speech Separation Based On Computational Auditory Scene Analysis
7	Segregation Of Reverberant Speech Based On Computational Auditory Scene Analysis And Deep Neural Network
8	Method And Implementation Of Monophonic Double Speech Separation Based On Auditory Scene Analysis
9	Speech Separation Based On Computational Auditory Scene Analysis
10	Multi-person Speech Separation Method Based On Computational Auditory Scene Analysis