Research on TDOA-Based Sound Source Localization and Video Information Fusion

Posted on: 2011-02-18
Degree: Master
Type: Thesis
Country: China
Candidate: X Z Ma
Full Text: PDF
GTID: 2208360305473805
Subject: Computer software theory
Abstract/Summary:
Speaker localization and tracking is an active research topic in human-computer interaction, with applications in multimedia systems, video conference systems, video monitoring systems, and intelligent robotics. Because real-world environments are noisy and highly reverberant, localizing and tracking a speaker from audio information alone is challenging. This thesis focuses on robust methods for speaker localization based on audio information and for speaker tracking that combines audio and video information. First, we introduce classical methods for estimating the time difference of arrival (TDOA) between a microphone pair, as well as TDOA estimation based on the multi-channel correlation coefficient using multiple microphone pairs. Traditional TDOA estimation methods can fail under the noise and reverberation found in real-world applications. An audio signal may contain both valid and invalid time frames. This thesis models the probability density of activeness in circular microphone arrays, which is used to distinguish valid frames from invalid ones. We then propose a RANSAC algorithm that uses TDOA estimates from the valid frames for robust speaker localization. Compared with traditional methods, the proposed method achieves greater robustness and better accuracy. Furthermore, we introduce the mean shift algorithm for object tracking using video information. Finally, we propose tracking the speaker with a distributed Kalman filter that uses both audio and video information. Its advantage is that, by using both cues, the shortcomings of a single modality are avoided and the two modalities complement each other, so the speaker can be located more accurately.
Keywords/Search Tags: Microphone array, Source localization, Audio-video tracking, RANSAC algorithm
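
As a rough illustration of the pairwise TDOA estimation mentioned in the abstract, the sketch below picks the lag that maximizes the plain cross-correlation between two microphone signals. This is a generic textbook approach, not necessarily the estimator used in the thesis; the function name `estimate_tdoa` and its parameters (`fs` for the sampling rate in Hz, `max_lag` for the search range in samples) are assumptions made for this example.

```python
import math


def estimate_tdoa(x, y, fs, max_lag):
    """Estimate the delay of y relative to x, in seconds.

    Hypothetical helper: x and y are equal-rate sample sequences from two
    microphones, fs is the sampling rate in Hz, and max_lag bounds the
    search range in samples. A positive result means y lags behind x.
    """
    n = min(len(x), len(y))
    best_lag, best_score = 0, -math.inf
    for lag in range(-max_lag, max_lag + 1):
        # Correlate x[i] with y[i + lag] over the overlapping samples.
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += x[i] * y[j]
        # Keep the lag with the strongest correlation.
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag / fs
```

In noisy or reverberant recordings the correlation peak from a single frame can be misleading, which is the motivation the abstract gives for keeping only valid frames and fitting the source position with RANSAC.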