Multi-speaker Tracking Based On Audio-video Information Fusion In Smart Environment

Posted on:2012-07-06

Degree:Master

Type:Thesis

Country:China

Candidate:J R Zheng

GTID:2178330335466800

Subject:Control theory and control engineering

Abstract/Summary:

The human brain tracks and identifies objects accurately in complex environments by integrating information from multiple sensory organs. In smart environments, speaker tracking is a major research area of human-computer interaction. How to exploit multi-modal sensor information, including voice and video image data from the same speaker, to achieve robust and accurate tracking by drawing on the brain's integration mechanism is attracting growing attention from researchers in heterogeneous information fusion.

After summarizing the basic theory and research status of multi-source information fusion, video tracking, sound source localization and filtering algorithms, this thesis proposes two novel human tracking algorithms based on multi-source information fusion. One is multi-person tracking based on the fusion of multiple video features; the other is speaker tracking based on audio-video information fusion.

In the multi-person tracking system based on multiple video feature fusion, skin color is used because it is robust to rotation and partial occlusion, and the color likelihood model is constructed from a color histogram. In addition, an edge gradient search strategy is used to obtain the contour likelihood model, with the contour representing the shape of the target. Finally, color and contour information are integrated in a particle filter framework to track multiple persons continuously.

In the audio-video fusion based speaker tracking system, the complementarity of the voice and video images originating from the same speaker is exploited: sound source localization information based on microphone time delays and color information based on mean shift are used to establish the audio model and the video model, respectively. The importance particle filter (IPF) is then used to build the fusion likelihood model as well as the fusion importance function from which particles are sampled. A closed-loop processing framework, which introduces a feedback process, is adopted to improve tracking accuracy and completeness.

Experiments on real-world data show that the two proposed information fusion based tracking algorithms are feasible. The multi-person tracking algorithm based on multiple video feature fusion is robust to illumination changes and background clutter. The audio-video fusion based speaker tracking approach accurately tracks the main speaker in a meeting and maintains good tracking performance even in complex cases such as speaker movement and posture changes.
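
As a rough illustration of fusing audio and video cues inside a particle filter, the following minimal Python sketch tracks a one-dimensional speaker position. It is not the thesis's IPF: it assumes a plain bootstrap filter with a random-walk motion model, Gaussian audio and video likelihoods that are conditionally independent given the position, and multinomial resampling. The function names, the noise parameters and the idea that the audio and video front ends each deliver a single position estimate per frame are all hypothetical choices made for the example.

```python
import math
import random


def gaussian_likelihood(x, observation, sigma):
    """Unnormalised Gaussian likelihood of an observation given position x."""
    return math.exp(-0.5 * ((observation - x) / sigma) ** 2)


def fused_particle_filter_step(particles, audio_obs, video_obs,
                               sigma_audio=0.5, sigma_video=0.2,
                               motion_sigma=0.1, rng=random):
    """One predict / weight / resample cycle with a fused likelihood.

    All parameters are illustrative assumptions, not values from the thesis.
    """
    # Predict: random-walk motion model (assumption).
    predicted = [p + rng.gauss(0.0, motion_sigma) for p in particles]

    # Weight: product of the audio and video likelihoods, which assumes the
    # two observations are conditionally independent given the position.
    weights = [
        gaussian_likelihood(p, audio_obs, sigma_audio)
        * gaussian_likelihood(p, video_obs, sigma_video)
        for p in predicted
    ]
    total = sum(weights)
    if total == 0.0:
        # Degenerate case: fall back to uniform weights.
        weights = [1.0 / len(predicted)] * len(predicted)
    else:
        weights = [w / total for w in weights]

    # Point estimate: weighted mean of the predicted particles.
    estimate = sum(p * w for p, w in zip(predicted, weights))

    # Resample: multinomial resampling proportional to the fused weights.
    resampled = rng.choices(predicted, weights=weights, k=len(predicted))
    return resampled, estimate


if __name__ == "__main__":
    rng = random.Random(0)
    particles = [rng.uniform(0.0, 5.0) for _ in range(500)]
    for _ in range(10):
        # Hypothetical per-frame position estimates from audio and video.
        particles, estimate = fused_particle_filter_step(
            particles, audio_obs=2.1, video_obs=2.0, rng=rng)
    print(round(estimate, 2))
```

Because the video likelihood is given the smaller standard deviation here, the fused estimate is pulled closer to the video observation than to the audio one. Sampling from a fused importance function, as the abstract describes, would replace the random-walk proposal in the predict step.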

Keywords/Search Tags:

audio-video, heterogeneous sensor fusion, target tracking, mean shift, Sound Source Localization, skin color histogram, importance particle filter

Related items

1	Design And Implementation Of Moving Target Tracking System Based On Video And Audio Information Fusion
2	Research On Multi-source Target Fusion Tracking Method Based On Joint Histogram Representation
3	The Research On The Target Localization And Tracking Based On WSN
4	Based On TDOA Sound Source Localization And Video Information Fusion Research
5	Target Tracking Algorithm Based On Particle Filter
6	Mean Shift Particle Filter Video Target Tracking Algorithm Based On Color And Depth Features
7	Research On Speaker Tracking Method Based On Audio And Video Fusion
8	Hand Tracking And Recognition Technology Research For Human-computer Interaction Based On Depth And Skin Feature Fusion
9	Multi-Target Sound Source Localization And Tracking Based On Microphone Array In Near-Field Environment
10	Algorithm Of Audio And Visual Fusion For Localization And Tracking Based On Audio Auxiliary Information