
Research On Speaker Tracking Method Based On Audio And Video Fusion

Posted on: 2019-12-26
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Jiang
GTID: 2428330545454765
Subject: Computer application technology
Abstract/Summary:
In recent years, with the continuous development of science and technology, target tracking has drawn increasing attention. It is widely used in video conferencing, intelligent robots, and other applications, and has become an important research topic. Traditional target tracking uses only the information collected by a single type of sensor. However, single-modality information is easily disturbed, which greatly reduces the accuracy of the tracking results and the robustness of the system. For example, environmental noise and reflections from obstacles affect a sound source tracking system, while changes in the target's movement and posture, target occlusion, and similar factors interfere with video tracking. Fusing multimodal data, represented here by audio and video, is therefore used to improve the accuracy of speaker tracking.

In this thesis, the audio information obtained by a microphone array and the video information obtained by a camera are integrated under the particle filtering framework, so that the two information sources complement each other and the accuracy of the speaker tracking result improves. To improve the accuracy of the fused result as a whole, this thesis improves both the traditional generalized cross-correlation algorithm and the particle filter algorithm in order to obtain more precise delay estimates and video location information.

First, a method based on Time Difference of Arrival (TDOA) is used to obtain the audio information, and the time delay estimate is the key to the accuracy of the entire tracking method. However, the performance of the traditional generalized cross-correlation time delay estimation algorithm declines under low SNR and reverberation. To solve this problem, this thesis proposes an improved generalized cross-correlation delay estimation algorithm based on quadratic correlation. The method first filters the received signal, then embeds the quadratic correlation algorithm into the generalized cross-correlation algorithm, and improves the weighting function. Experiments show that when noise and reverberation are present at the same time, the improved algorithm has a clear advantage in delay estimation performance.

Second, a particle filter is used for video target tracking. It is suitable for tracking in complex environments, but it suffers from a large amount of computation and from degradation of particle diversity. The mean shift algorithm moves the current point toward a maximum of the probability density function through repeated iterations, so in this thesis mean shift is embedded in the particle filter to improve both the accuracy and the efficiency of video tracking. In addition, when the target model is built, feature values with small probability are removed to reduce the interference of non-target pixels on tracking. A linear resampling method is then used to address the degradation of particle diversity. Experiments show that the accuracy and efficiency of the improved algorithm are significantly better than those of the traditional algorithm.

Finally, the more accurate audio and video information obtained through the improved algorithms is integrated in the particle filter framework to track the speaker. Experiments show that the new speaker tracking method based on audio and video fusion tracks well in complex environments.
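
To make the delay estimation step concrete, the following is a minimal sketch of conventional generalized cross-correlation with PHAT weighting, the kind of baseline the abstract describes as degrading under low SNR and reverberation. It is not the improved quadratic-correlation method of the thesis, whose filter and weighting function the abstract does not specify. The function name `gcc_phat`, its parameters, the sampling rate, and the synthetic test signal are all assumptions made for illustration.

```python
import numpy as np


def gcc_phat(sig, ref, fs, max_tau=None):
    """Estimate the delay (in seconds) of `sig` relative to `ref`.

    Illustrative baseline only: conventional GCC with PHAT weighting,
    not the improved algorithm proposed in the thesis.
    """
    n = len(sig) + len(ref)              # zero-pad to avoid circular wrap-around
    sig_f = np.fft.rfft(sig, n=n)
    ref_f = np.fft.rfft(ref, n=n)
    cross = sig_f * np.conj(ref_f)       # cross-power spectrum
    cross /= np.abs(cross) + 1e-12       # PHAT: keep phase, discard magnitude
    cc = np.fft.irfft(cross, n=n)        # generalized cross-correlation

    max_shift = n // 2
    if max_tau is not None:
        # Limit the search to physically possible delays (assumed bound).
        max_shift = max(1, min(int(fs * max_tau), max_shift))

    # Reorder so that index `max_shift` corresponds to zero lag.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = int(np.argmax(np.abs(cc))) - max_shift
    return shift / float(fs)


# Synthetic example: the second microphone hears the same noise-like
# source a fixed number of samples later (all values are assumed).
fs = 16000
true_delay_samples = 25
rng = np.random.default_rng(0)
source = rng.standard_normal(4096)
mic_ref = source + 0.1 * rng.standard_normal(4096)
mic_sig = np.concatenate((np.zeros(true_delay_samples), source))[:4096]
mic_sig = mic_sig + 0.1 * rng.standard_normal(4096)

estimated = gcc_phat(mic_sig, mic_ref, fs)
print("estimated delay in samples:", round(estimated * fs))
```

In a TDOA-based tracker, a delay estimated this way for each microphone pair would be converted into a direction or position measurement and used as the audio observation in the particle filter. Passing `max_tau` (for example, the microphone spacing divided by the speed of sound) restricts the peak search to plausible lags.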
Keywords/Search Tags: Generalized Cross Correlation, Time Difference of Arrival (TDOA), Mean Shift, Particle Filter, Audio and Video Fusion