Speech Localization Research Based On Microphone Array In Adverse Acoustic Environments

Posted on: 2010-01-30, Degree: Doctor, Type: Dissertation
Country: China, Candidate: Y Zhang, Full Text: PDF
GTID: 1118360302460478, Subject: Signal and Information Processing
Abstract/Summary:
Microphone arrays have been applied in many fields, such as speech enhancement, speaker recognition, and video conferencing. Speaker localization, as the foundation of spatial filtering and acoustic processing, is a key component of array signal processing. Microphone-array speaker localization methods are classified as time-delay-based or angle-based. Angle-based localization, which is designed for narrowband, stationary signals, is sensitive to the source and sensor models and is computationally expensive, so it is not well suited to speaker localization. Time-delay-based localization is insensitive to the source and sensor models and is computationally light, so it is widely applied in speaker localization systems.

Traditional time-delay-based localization algorithms assume an ideal free-field acoustic model, but in many practical settings, such as audio and video conferencing, the localization system faces complex acoustic environments that cause traditional algorithms to fail. Compared with traditional localization settings, microphone-array speech localization systems encounter more complex acoustic conditions, including room reverberation, colored noise, spatial noise, perturbations in microphone positions, and non-Gaussian noise.

To address these conditions, we propose algorithms for the three functions of a localization system: time delay estimation, source localization, and speech activity detection. Together they achieve localization in complex acoustic environments. The main contributions are as follows:

(1) A noise-robust blind channel identification framework is proposed to improve the noise robustness of the traditional blind channel identification algorithm. As a special case of the framework, a two-channel identification algorithm, Lag-EVD, is developed. The algorithm reduces colored noise more effectively because it uses a lagged covariance matrix.

(2) To handle reverberation and colored noise, a robust adaptive time delay estimation algorithm, Lag-AEDA, is proposed, using noise-robust two-channel blind channel identification as its criterion. The algorithm suppresses reverberation by estimating the room impulse response and uses a lagged covariance matrix to reduce colored noise, so it can estimate time delay robustly in reverberant and noisy conditions.

(3) To handle reverberation and spatial noise, an adaptive time delay estimation algorithm based on three microphones is proposed. The algorithm separates spatial noise by identifying a two-input, three-output system and uses a lagged covariance matrix to reduce spatial noise.

(4) A 3D source localization algorithm, LCTLS, is proposed that takes both time delay estimation (TDE) error and microphone location error into account. The algorithm gives a robust location estimate because it adopts the total least squares criterion to reduce the effect of microphone location error and takes the quadratic constraint on the position parameters into account.

(5) A new voice activity detection (VAD) algorithm based on higher-order statistics of the linear prediction residual is proposed to discriminate speech from non-Gaussian noises common in video conferences, such as applause, coughing, and knocking. The algorithm exploits the difference in the number of harmonics between speech and non-Gaussian noise. Using the normalized kurtosis as the discrimination criterion, it can effectively distinguish speech from non-Gaussian noise.
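
The statistic named in contribution (5), the normalized kurtosis of the linear prediction residual, can be illustrated with a minimal Python sketch. This is not the dissertation's algorithm: the function names `lp_residual` and `normalized_kurtosis`, the prediction order of 10, the autocorrelation-method LPC fit, and the "minus 3" excess-kurtosis normalization are all assumptions made for illustration, and the abstract gives no decision threshold or decision direction. The two synthetic test signals are likewise hypothetical stand-ins, not data from the work.

```python
import numpy as np

def lp_residual(frame, order=10):
    """Linear prediction residual (autocorrelation method; order is an assumption)."""
    x = np.asarray(frame, dtype=float)
    n = len(x)
    # Autocorrelation at lags 0..order
    r = np.array([np.dot(x[:n - k], x[k:]) for k in range(order + 1)])
    # Toeplitz normal equations R a = r[1:], with slight diagonal loading for stability
    idx = np.abs(np.subtract.outer(np.arange(order), np.arange(order)))
    R = r[idx] + (1e-9 * r[0] + 1e-12) * np.eye(order)
    a = np.linalg.solve(R, r[1:])
    # Predict each sample from the previous `order` samples, then subtract
    pred = np.zeros(n)
    for k in range(order):
        pred[k + 1:] += a[k] * x[:n - k - 1]
    return x - pred

def normalized_kurtosis(x):
    """Fourth central moment over squared variance, minus 3 (assumed normalization).

    Assumes the input is not constant, so the variance is nonzero.
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    var = np.mean(x ** 2)
    return np.mean(x ** 4) / var ** 2 - 3.0

# Hypothetical test signals: Gaussian noise, and a single click in weak noise
rng = np.random.default_rng(0)
gaussian = rng.standard_normal(4000)
click = 0.01 * rng.standard_normal(4000)
click[2000] += 1.0

print("Gaussian noise:", normalized_kurtosis(lp_residual(gaussian)))
print("Impulsive click:", normalized_kurtosis(lp_residual(click)))
```

With this normalization, a Gaussian residual scores near zero while an impulsive residual scores far above it, which shows why the statistic is sensitive to signal type. How the dissertation maps the score to a speech or non-speech decision is not stated in the abstract.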
Keywords/Search Tags: Microphone Array, Time Delay Estimation, Source Localization, Speech Activity Detection, Reverberation, Colored Noise, Spatial Noise, Non-Gaussian Noise, Blind Channel Identification, Total Least Squares