Research On Robust Binaural Localization Based On Deep Learning

Posted on:2020-08-28

Degree:Master

Type:Thesis

Country:China

Candidate:L J Wang


GTID:2428330620456149

Subject:Information and Communication Engineering

Abstract/Summary:

As the front end of speech signal processing systems, sound source localization (SSL) is widely used in video conferencing, hearing aids, intelligent robots, and other applications. As one branch of SSL, binaural sound source localization (BSSL) has the advantage of allowing compact equipment. Most previous BSSL studies use the interaural time difference (ITD) and interaural intensity difference (IID) to simulate the human auditory mechanism, but their localization performance deteriorates rapidly in reverberant and noisy environments. Building on binaural spatial cues, this thesis combines convolutional networks and residual networks from deep learning to study robust BSSL algorithms. Two deep-learning-based BSSL algorithms are proposed: one based on a deep convolutional neural network (DCNN) and one based on a deep convolutional residual network (DCRN).

(1) DCNN-based BSSL algorithm. This algorithm introduces convolution operations, and separate DCNN models are implemented with one-dimensional and two-dimensional convolution. Feature fusion is used to combine features from different sub-bands, which avoids repeated model training. In addition, the parameter sharing of the convolution operations greatly reduces the number of redundant model parameters and accelerates network training. Test results under various reverberation and signal-to-noise ratio (SNR) conditions show that the DCNN model is robust; compared with the sub-band DNN algorithm, it improves localization accuracy in high-SNR, high-reverberation environments by 11 percentage points.

(2) DCRN-based BSSL algorithm. This algorithm improves on the DCNN algorithm. It adds a residual structure to the convolutional network to make the model easier to train, so that the DCRN model can be trained with more layers. A batch normalization (BN) layer is also introduced in the DCRN, which further accelerates convergence. Test results show that the DCRN model generalizes better to unseen reverberation and SNR conditions and also outperforms the DCNN model in localization. Compared with the DCNN model, the DCRN13 model achieves an average performance improvement of 2% across the different environments.
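
The abstract names ITD and IID as the classical binaural cues but does not say how the thesis extracts them. As a rough illustration only, the sketch below estimates both cues from one pair of left and right signal frames: the ITD as the lag that maximizes the cross-correlation, and the IID as the energy ratio in decibels. The function name `estimate_itd_iid`, its parameters, and the sign convention (positive values mean the left ear leads or is louder) are assumptions made for this example, not taken from the thesis.

```python
import math


def estimate_itd_iid(left, right, sample_rate, max_lag):
    """Estimate (itd_seconds, iid_db) from two equal-length frames.

    Hypothetical helper for illustration; not the thesis's feature extractor.
    Positive ITD means the sound reached the left ear first.
    Positive IID means the left ear received more energy.
    """
    if len(left) != len(right):
        raise ValueError("left and right frames must have the same length")
    n = len(left)

    # Search the lag that maximizes the cross-correlation
    # sum_i left[i] * right[i + lag] over lags in [-max_lag, max_lag].
    best_lag, best_score = 0, -math.inf
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += left[i] * right[j]
        if score > best_score:
            best_lag, best_score = lag, score

    itd_seconds = best_lag / sample_rate

    # IID as the left/right energy ratio in dB; eps guards against log(0).
    eps = 1e-12
    energy_left = sum(x * x for x in left)
    energy_right = sum(x * x for x in right)
    iid_db = 10.0 * math.log10((energy_left + eps) / (energy_right + eps))

    return itd_seconds, iid_db
```

For a right-ear frame that is a copy of the left-ear frame delayed by 3 samples and halved in amplitude, at a 16 kHz sample rate, this returns an ITD of 3/16000 s and an IID of about 6.02 dB. In a learned approach such as the one the abstract describes, cues like these (or the correlation function itself) would typically be computed per sub-band and passed to the network as input features; the exact feature layout used in the thesis is not stated here.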

Keywords/Search Tags:

Binaural Sound Source Localization, Deep Learning, Convolutional Network, Residual Network, Feature Fusion

Related items

1	Research On Indoor Sound Source Localization Algorithm Based On Deep Learning
2	Research On Sound Source Recognition And Location Technology Based On Deep Learning
3	Research On Robust Binaural Localization Based On Neural Network
4	Sound Source Localization Based On Binaural Auditory Time Delay Estimation
5	Recognition Method Of The Time Varying Azimuth Of Single Sound Source Based On Binaural Hearing Mechanism
6	Computational models for binaural sound source localization and sound understanding
7	Study On Sound Source Localization Algorithm Based On Binaural Auditory And Naive Bayes Theory
8	Research On Sound Source Localization Algorithm Based On Binaural Signals In Reverberant Environment
9	Research On Multi-sound Event Localization And Detection Method Based On Deep Learning
10	Research On Binaural Sound Source Localization Method For Mobile Robots Oriented To Human-computer Interaction