
Sound Source Localization Using A Microphone Array For Indoor Far Field Interaction

Posted on: 2020-08-30
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y Z Fang
GTID: 1488306512481744
Subject: Information and Communication Engineering
Abstract/Summary:
Microphone array-based sound source localization is widely used in fields ranging from national defense and security to video conferencing. With the development of far-field voice interaction systems, whose main indoor scenes are the smart home and the intelligent office, accurate and stable sound source localization has become an essential prerequisite for high-quality speech enhancement and speech recognition. In practical applications, multiple indoor far-field sound sources are often active simultaneously, which makes the acoustic environment complicated. Environmental noise, multipath propagation, multi-source interference and other adverse factors make it difficult to estimate the source locations and parameters accurately. In response to these problems, this dissertation studies microphone array-based indoor far-field sound source localization. The main innovations and contributions are as follows:

1) To address multipath propagation in single-source localization in indoor scenes, this dissertation proposes NLMS-CR, a stable and unconstrained blind channel identification algorithm that achieves relatively high localization accuracy under strong reverberation and low signal-to-noise ratio (SNR). The algorithm combines the channel cross-relation (CR) property with a normalized least-mean-squares (NLMS) adaptive filter. Through structural optimization, it avoids the constant-modulus constraint that the existing adaptive eigenvalue decomposition (AED) method places on the weight vector, which in turn increases the convergence speed and robustness of the localization algorithm. The experimental results show that under relatively strong reverberation, the blind channel identification method localizes the source more accurately than cross-correlation methods, at some additional computational cost. NLMS-CR belongs to the same family of blind-channel-identification methods as AED, and under the same parameter settings it reaches higher accuracy in less time.

2) To achieve low-cost, high-resolution localization of multiple sound sources, this dissertation proposes KDEMS, a method for constructing a multi-dimensional cumulative angular spectrum that robustly suppresses spatial ambiguity under reverberation and wide array spacing. Based on the time-frequency sparsity and short-time W-disjoint orthogonality (W-DO) of speech signals, the method combines the normalized cross-power spectrum of the signals observed by the microphone array with an approximate Gaussian kernel function, and uses multi-stage sub-band processing. This effectively resolves the high-frequency spatial ambiguity that arises with wide element spacing, and time-frequency multi-dimensional accumulation then yields a KDEMS angular spectrum function with higher spatial resolution from fewer array elements. The experimental results show that when the element spacing is narrow, the angular spectrum function suffers from low resolution and low estimation accuracy, while the spatial ambiguity caused by widening the element spacing produces multiple pseudo-peaks in the angular spectrum. Compared with the angular spectrum formed by the cross-correlation method, the proposed KDEMS uses low-pass weighting factors and multi-stage sub-band processing to suppress these pseudo-peaks more effectively while retaining good spatial resolution, and it provides more robust multi-source localization performance.

3) To deal with low SNR and a time-varying, unknown number of sound sources in indoor multi-source localization, an accurate and stable joint estimation algorithm for multiple sound sources, KDEMSW-MP, is proposed. Building on the KDEMS angular spectrum function, the algorithm introduces time-frequency filtering modules such as local SNR tracking and coherence detection to extract the time-frequency support regions in which each active source is less affected by environmental noise and mutual interference between sources. This suppresses the waveform distortion of the KDEMS angular spectrum function. A double-width matching pursuit (MP) method is then introduced to replace the traditional peak search, which improves the accuracy of the subsequent joint estimation of the positions and number of the sound sources. The experimental results show that time-frequency filtering of KDEMS effectively alleviates the waveform distortion of the spectral function and reduces the amplitude of the pseudo-peaks. This gain is more pronounced when the SNR is low and the number of sound sources is large. The traditional peak-search method overestimates the number of sources when many are present, which degrades joint estimation performance. The double-width matching pursuit method performs a maximum inner-product search over the angular spectrum function, which effectively alleviates the overestimation of the number of sound sources seen in existing methods. Together, these measures make KDEMSW-MP a relatively robust joint estimation method for the locations and number of multiple sound sources.
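The trade-off described in contribution 2), where wide element spacing gives finer resolution but introduces spatial ambiguity at high frequencies, can be illustrated for a single far-field microphone pair. The following is a minimal sketch under textbook assumptions, not the KDEMS method itself: the function names are hypothetical, the speed of sound is assumed to be 343 m/s, and the geometry is the standard far-field relation tau = d * sin(theta) / c.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def tdoa_to_angle(tau_s, spacing_m, c=SPEED_OF_SOUND):
    """Far-field arrival angle (degrees from broadside) for one mic pair."""
    # Clamp to [-1, 1] so small estimation errors do not break asin.
    s = max(-1.0, min(1.0, c * tau_s / spacing_m))
    return math.degrees(math.asin(s))


def aliasing_frequency(spacing_m, c=SPEED_OF_SOUND):
    """Frequency above which the inter-microphone phase can wrap (d > lambda/2)."""
    return c / (2.0 * spacing_m)


def ambiguous_angles(theta_deg, freq_hz, spacing_m, c=SPEED_OF_SOUND):
    """Angles whose phase difference at freq_hz matches that of theta_deg."""
    true_delay = spacing_m * math.sin(math.radians(theta_deg)) / c
    period = 1.0 / freq_hz          # a delay shift of one period leaves the phase unchanged
    max_delay = spacing_m / c       # largest physically possible delay (endfire)
    k_max = int(2 * max_delay / period) + 1
    angles = []
    for k in range(-k_max, k_max + 1):
        tau = true_delay + k * period
        if abs(tau) <= max_delay:
            angles.append(round(tdoa_to_angle(tau, spacing_m, c), 1))
    return sorted(angles)


if __name__ == "__main__":
    # Narrow pair: unambiguous at 2 kHz, but delays are tiny (coarse resolution).
    print(aliasing_frequency(0.04), ambiguous_angles(30.0, 2000.0, 0.04))
    # Wide pair: several candidate angles share the same phase at 4 kHz.
    print(aliasing_frequency(0.30), ambiguous_angles(0.0, 4000.0, 0.30))
```

In this sketch a single narrowband phase measurement on the wide pair is consistent with several directions, which is the kind of pseudo-peak that accumulation over many time-frequency points and sub-bands is meant to suppress.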
Keywords/Search Tags: Microphone array, Sound source localization, TDOA estimation, Blind channel identification, Angular spectrum