
Target Tracking Based On Audio And Visual Fusion

Posted on: 2015-12-04
Degree: Master
Type: Thesis
Country: China
Candidate: P Qin
Full Text: PDF
GTID: 2298330431490283
Subject: Signal and Information Processing

Abstract/Summary:
Target tracking technology is widely used in areas such as military target tracking and recognition, intelligent monitoring, and intelligent traffic control. Traditional tracking methods rely on a single information source, for example audio or video alone, to locate and track the target. Audio-only tracking is easily affected by noise and environmental reverberation, while visual-only tracking is susceptible to occlusion, lighting changes, and background clutter, so both lack robustness. To improve robustness, researchers imitate the way the human brain integrates spatial and temporal cues from sound and vision, and fuse audio and visual information for target tracking.

The multi-target tracking algorithm is divided into three parts: target localization, data association, and the audio-visual fusion algorithm. First, we summarize the methods currently used for audio and visual localization, and introduce the SRP-PHAT algorithm and a background subtraction method based on a Gaussian mixture model in HSV color space.

Second, we study multi-target data association in depth. Since the JPDA algorithm does not perform well when targets are very close to each other in an indoor environment, we propose converting the data association problem into a trajectory recognition and allocation problem and using an HMM for data association. The proposed method improves the association accuracy for closely spaced trajectories. However, observations are often missing because of video frame loss or occlusion, and because the temporal correlation of the audio signal is weak, speech data are often lost during transmission or training. In these cases the HMM does not perform well. To overcome this drawback, we use the AHMM for data association. The AHMM modifies the HMM structure by adding a jump arc and uses the newly added time stamps to associate the observation sequence with the state sequence. The AHMM can handle irregular, incomplete, or missing observations effectively, and greatly reduces the association error on incomplete trajectories. Compared with the JPDA algorithm, the proposed HMM- and AHMM-based data association algorithms both require less computation and time, which facilitates real-time implementation.

Then, we propose an iterative decoding algorithm based on the AHMM and use it to fuse audio and visual information for target tracking. The method performs well in low-SNR environments without any prior information, and exploits the complementarity of the audio and visual information to improve tracking accuracy when the data are incomplete.

Finally, we apply the AHMM to trajectory recognition. Multi-target trajectory observations are obtained with the background subtraction method whose background model is generated in HSV color space. To ensure a fair comparison between the AHMM and the HMM, the same initial parameter set is adopted in the EM algorithm for each model. Experiments indicate that the AHMM outperforms the HMM in recognizing incomplete trajectories.
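For reference, the following is a minimal sketch of the SRP-PHAT localization idea mentioned above: GCC-PHAT correlations for every microphone pair are summed at the time delays implied by each candidate source position, and the position with the largest steered response power is returned. The function names, candidate grid, and array geometry are illustrative assumptions, not details taken from the thesis.

```python
import numpy as np

def gcc_phat(x1, x2, n_fft=1024):
    """GCC-PHAT cross-correlation between two microphone signals (lag 0 at the centre)."""
    X1 = np.fft.rfft(x1, n_fft)
    X2 = np.fft.rfft(x2, n_fft)
    cross = X1 * np.conj(X2)
    cross /= np.abs(cross) + 1e-12            # phase transform (PHAT) weighting
    return np.fft.fftshift(np.fft.irfft(cross, n_fft))

def srp_phat(signals, mic_pos, grid, fs, c=343.0, n_fft=1024):
    """Steered response power with PHAT weighting over a grid of candidate points.

    signals : (n_mics, n_samples) synchronised frames
    mic_pos : (n_mics, 3) microphone coordinates in metres
    grid    : (n_points, 3) candidate source locations
    """
    n_mics = len(mic_pos)
    centre = n_fft // 2
    power = np.zeros(len(grid))
    for i in range(n_mics):
        for j in range(i + 1, n_mics):
            cc = gcc_phat(signals[i], signals[j], n_fft)
            # expected TDOA (in samples) between mics i and j for every candidate point
            tdoa = (np.linalg.norm(grid - mic_pos[i], axis=1)
                    - np.linalg.norm(grid - mic_pos[j], axis=1)) / c
            lags = np.clip(np.round(tdoa * fs).astype(int) + centre, 0, n_fft - 1)
            power += cc[lags]                 # accumulate correlation at the steered lags
    return grid[np.argmax(power)]             # candidate with maximum steered power
```

The visual observations described above come from Gaussian-mixture background subtraction in HSV color space. A hedged sketch using OpenCV's stock MOG2 subtractor follows; the video file name, morphological clean-up, and area threshold are assumptions for illustration, not the thesis' exact pipeline.

```python
import cv2

# Gaussian-mixture background model applied to HSV frames.
subtractor = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16,
                                                detectShadows=True)

cap = cv2.VideoCapture("meeting_room.avi")        # hypothetical input sequence
while True:
    ok, frame = cap.read()
    if not ok:
        break
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)  # model the background in HSV space
    mask = subtractor.apply(hsv)                   # 255 = foreground, 127 = shadow
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN,
                            cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5)))
    fg = (mask == 255).astype("uint8") * 255       # drop shadow pixels
    # connected components give candidate target observations (centroids)
    n, _, stats, centroids = cv2.connectedComponentsWithStats(fg)
    observations = [tuple(centroids[k]) for k in range(1, n)
                    if stats[k, cv2.CC_STAT_AREA] > 200]
cap.release()
```

The AHMM itself is the thesis' contribution, and its exact structure (jump arc plus time stamps) is not reproduced here. As a rough illustration of the underlying idea of decoding a trajectory-label sequence when some frames carry no observation, the sketch below runs a Gaussian-emission Viterbi pass in which missing frames contribute only the transition term; all parameter names and shapes are hypothetical.

```python
import numpy as np

def viterbi_with_gaps(obs, A, means, covs, pi):
    """Most likely trajectory-label sequence for a possibly incomplete observation list.

    obs   : list of 2-D positions, with None marking dropped frames
    A     : (N, N) state transition matrix
    means : (N, 2) emission means, covs : (N, 2, 2) emission covariances
    pi    : (N,) initial state distribution
    """
    N = len(pi)

    def log_emission(x):
        if x is None:
            return np.zeros(N)                 # no evidence: only transitions count
        ll = np.empty(N)
        for s in range(N):
            d = x - means[s]
            inv = np.linalg.inv(covs[s])
            ll[s] = -0.5 * (d @ inv @ d
                            + np.log(np.linalg.det(covs[s]))
                            + 2 * np.log(2 * np.pi))
        return ll

    logA = np.log(A)
    delta = np.log(pi) + log_emission(obs[0])  # best log-score ending in each state
    back = []
    for x in obs[1:]:
        scores = delta[:, None] + logA         # (from state, to state)
        back.append(np.argmax(scores, axis=0))
        delta = np.max(scores, axis=0) + log_emission(x)
    path = [int(np.argmax(delta))]
    for bp in reversed(back):                  # backtrack through the stored pointers
        path.append(int(bp[path[-1]]))
    return path[::-1]
```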
Keywords/Search Tags: Audio-visual fusion, target tracking, data association, AHMM model