
Research On Deep Learning-based Identification Of Multi-speech Sources Using A Small-scale Microphone Array

Posted on: 2022-08-04  Degree: Master  Type: Thesis
Country: China  Candidate: M Zhang  Full Text: PDF
GTID: 2518306536988309  Subject: Master of Engineering
Abstract/Summary:
Speech is one of the most effective means of human communication. In multi-speaker scenarios, noise, reverberation, and overlap between sound sources make it challenging to localize and separate the mixed signals of multiple speech sources. This thesis addresses the localization and separation of far-field multiple speech sources received by a small-scale microphone array.

To improve speech localization performance on small-scale microphone arrays, a Convolutional Neural Network (CNN) based direction of arrival (DOA) estimation algorithm for multiple speech sources is proposed, together with a joint algorithm for speech source counting and DOA estimation. The joint algorithm estimates the number of sound sources and their DOAs simultaneously through a Multi-task Learning (MTL) model. Results on simulated and experimental data verify that the proposed deep learning-based DOA estimation method outperforms traditional high-resolution DOA estimation techniques in multiple speech source localization.

To improve on single-channel multi-speaker speech separation in the far-field environment, a multi-channel speech separation method based on a temporal convolutional network (TCN) is proposed. A multi-channel time-domain audio separation network (TasNet) is then designed based on spatial features, and a multi-channel time-domain speech separation network based on fixed beams is also proposed. Finally, the multi-channel time-domain speech separation algorithm is extended to an unknown number of speakers by jointly detecting the number of sources and estimating their directions of arrival. Simulation and experimental results verify the effectiveness of the proposed multi-channel time-domain speech separation algorithm.

Additionally, an application system for multi-source speech perception is designed by cascading the sound source localization and speech separation algorithms. The real-time processing results further validate the effectiveness of the multi-channel speech separation algorithms proposed in this thesis for a small-scale microphone array.
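
The abstract does not say how the outputs of the joint counting and DOA model are combined. The Python sketch below shows one plausible way to decode them, and should not be read as the thesis's method. It assumes a counting head that emits one logit per candidate source count (index k meaning "k sources") and a DOA head that emits a probability per bin of a circular angle grid. The function name `decode_sources`, both output conventions, and the 10-degree grid are hypothetical.

```python
import numpy as np

def decode_sources(count_logits, doa_probs, grid_deg):
    """Pick the source count, then the strongest DOA peaks (illustrative only)."""
    count_logits = np.asarray(count_logits, dtype=float)
    doa_probs = np.asarray(doa_probs, dtype=float)
    grid_deg = np.asarray(grid_deg, dtype=float)

    # Counting head: class index k is assumed to mean "k sources".
    n_src = int(np.argmax(count_logits))

    # DOA head: keep bins that are local maxima on the circular angle grid.
    left = np.roll(doa_probs, 1)
    right = np.roll(doa_probs, -1)
    peak_idx = np.flatnonzero((doa_probs > left) & (doa_probs >= right))

    # Rank the peaks by probability and keep the n_src strongest.
    ranked = peak_idx[np.argsort(doa_probs[peak_idx])[::-1]][:n_src]
    return n_src, sorted(grid_deg[ranked].tolist())

# Hypothetical model outputs: two talkers near 30 and 200 degrees.
grid = np.arange(0, 360, 10)
probs = np.zeros(grid.size)
probs[[3, 4, 20, 30]] = [0.9, 0.5, 0.8, 0.3]
n_src, doas = decode_sources([0.1, 0.2, 2.0, 0.3], probs, grid)
```

Using the counting head to fix the number of peaks avoids a hand-tuned probability threshold on the DOA output, which is one common motivation for estimating the two quantities jointly.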
Keywords/Search Tags: Deep learning, Direction of arrival estimation, Sound source counting, Speech separation, Small-scale array