Research On Robust Binaural Localization Based On Deep Learning

Posted on: 2020-08-28  Degree: Master  Type: Thesis
Country: China  Candidate: L J Wang  Full Text: PDF
GTID: 2428330620456149  Subject: Information and Communication Engineering
Abstract/Summary:
As the front end of speech signal processing systems, sound source localization (SSL) is widely used in video conferencing, hearing aids, intelligent robots, and other applications. As a branch of SSL, binaural sound source localization (BSSL) has the advantage of allowing compact equipment. Most previous BSSL studies use the interaural time difference (ITD) and the interaural intensity difference (IID) to simulate the human auditory mechanism, but their localization performance deteriorates rapidly in reverberant and noisy environments. Building on binaural spatial cues, this thesis combines convolutional networks and residual networks from deep learning to study robust BSSL algorithms. Two deep-learning-based BSSL algorithms are proposed: one based on a deep convolutional neural network (DCNN) and one based on a deep convolutional residual network (DCRN).

(1) DCNN-based BSSL algorithm. This algorithm introduces convolution operations, and separate DCNN models are implemented with one-dimensional and two-dimensional convolution. Feature fusion is used to combine features from different sub-bands, which avoids repeated model training. In addition, the parameter sharing of the convolution operations greatly reduces redundant model parameters and accelerates network training. Test results under various reverberation and signal-to-noise ratio (SNR) conditions show that the DCNN model is robust. Compared with the sub-band DNN algorithm, it improves localization accuracy in high-SNR, high-reverberation environments by 11 percentage points.

(2) DCRN-based BSSL algorithm. This algorithm is an improvement on the DCNN algorithm. It adds a residual structure to the convolutional network to make the model easier to train, so that the DCRN model can be trained with more layers. A batch normalization (BN) layer is also introduced in the DCRN, which further accelerates convergence. Test results show that the DCRN model generalizes better to unseen reverberation and SNR conditions and also outperforms the DCNN model in localization. Compared with the DCNN model, the DCRN13 model achieves an average performance improvement of 2% across different environments.
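The two classical binaural cues named in the abstract, ITD and IID, can be illustrated with a minimal Python sketch. It is not the thesis's feature extraction or network. The function name `estimate_itd_iid`, the sign conventions, and the toy signal are all assumptions made for this example. The ITD is taken as the lag that maximizes the cross-correlation of the two ear signals, and the IID as the left-to-right energy ratio in decibels.

```python
import math


def estimate_itd_iid(left, right, sample_rate, max_lag):
    """Return (itd_seconds, iid_db) for two equal-length channels.

    Hypothetical helper for illustration only.
    ITD: lag that maximizes the cross-correlation, in seconds. A positive
         value means the right channel lags the left one (assumed convention).
    IID: 10 * log10(left energy / right energy) (assumed convention).
    """
    n = len(left)
    best_lag, best_score = 0, -math.inf
    for lag in range(-max_lag, max_lag + 1):
        # Correlate left[i] with right[i + lag], skipping out-of-range samples.
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += left[i] * right[j]
        if score > best_score:
            best_lag, best_score = lag, score

    energy_left = sum(x * x for x in left)
    energy_right = sum(x * x for x in right)
    iid_db = 10.0 * math.log10(energy_left / energy_right)
    return best_lag / sample_rate, iid_db


# Toy input: a short burst in the left ear, and the same burst in the
# right ear delayed by 3 samples and attenuated to half amplitude.
sample_rate = 16000
left = [0.0] * 64
left[10:13] = [1.0, -0.5, 0.25]
right = [0.0] * 64
for i in range(3, 64):
    right[i] = 0.5 * left[i - 3]

itd, iid_db = estimate_itd_iid(left, right, sample_rate, max_lag=8)
print(itd, iid_db)
```

For this toy input the estimated ITD corresponds to the 3-sample delay (about 0.19 ms at 16 kHz), and the IID is about 6 dB, matching the half-amplitude right channel. Cues computed this way from clean signals become unreliable under reverberation and noise, which is the limitation the abstract gives as motivation for the learned models.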
Keywords/Search Tags:Binaural Sound Source Localization, Deep Learning, Convolutional Network, Residual Network, Feature Fusion