
Some Improvements Of Video Visual Tracking In Complex Scene

Posted on: 2019-12-04 | Degree: Master | Type: Thesis
Country: China | Candidate: H X Lin
GTID: 2428330575450817 | Subject: Computer system architecture
Abstract/Summary:
Object tracking identifies, locates, and predicts the position of a target of interest in a video sequence. It underpins higher-level semantic understanding of video, such as behavior analysis, and has a wide range of applications in video surveillance, driverless vehicles, human-computer interaction, and medical diagnosis. However, because real-world environments are complex and a target's shape and motion vary, trackers easily lose the target or drift. This thesis therefore studies the complicating factors that target tracking faces, focusing on the effects of occlusion, deformation, and fast motion on the tracking process. Qualitative analysis indicates that feature representation and motion state estimation are effective ways to address these factors. Accordingly, the main research objective is to extract useful features and improve the precision of state estimation, taking the object information and the temporal information in the video as the principal objects of study. A representation model is built from the target information to extract discriminative representations of candidate regions and match them against the target. A motion prediction model is built from temporal information and the target's location to obtain an accurate object location.

First, a Siamese network combines feature extraction with feature matching. Exploiting this property, the thesis presents a method for moving-object feature extraction and object matching based on a Siamese network. A fully convolutional network structure is adopted to accommodate inputs of different sizes, and discriminative features are obtained through layer-wise operations. A correlation layer is proposed to compute the similarity between the object and the current frame; it outputs a response map of the matching coefficient between the target and the current frame at each location. In addition, an ROI Pooling layer is added for visual tracking.

Second, to improve the accuracy of target position prediction across frames, the thesis presents a tracking method based on long short-term memory (LSTM). A pre-trained model is used to extract features from the current image. Combining this feature representation with the center location of the object in the previous frame, an LSTM module learns the relationship between object locations over the time sequence. The object matching model is then merged with the LSTM to compare candidates against the object, which yields the final object location in the current frame.

With the correlation layer and the ROI Pooling layer, the Siamese-network tracker achieves good results on public tracking benchmarks. This shows both that the Siamese-network feature extraction method generalizes well and that adding the correlation layer and the ROI Pooling layer allows the target to be matched and located effectively. Furthermore, with the motion model based on a recurrent neural network, the experimental results show that tracking performance improves significantly, indicating that the method can adapt to the displacement caused by rapid target movement and effectively capture occluded targets.
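
The response map produced by a correlation layer can be illustrated with a minimal sketch. The code below is not the thesis's implementation: it assumes a single-channel template and search region held as plain Python lists, whereas a Siamese tracker would correlate multi-channel feature maps produced by the shared convolutional network. The function names `cross_correlate` and `argmax_2d` are hypothetical and introduced only for illustration.

```python
def cross_correlate(search, template):
    """Slide `template` over `search` (both 2-D lists) and return the response map.

    Each response value is the sum of element-wise products between the
    template and the equally sized window of the search region.
    """
    sh, sw = len(search), len(search[0])
    th, tw = len(template), len(template[0])
    response = []
    for y in range(sh - th + 1):
        row = []
        for x in range(sw - tw + 1):
            score = sum(
                search[y + i][x + j] * template[i][j]
                for i in range(th)
                for j in range(tw)
            )
            row.append(score)
        response.append(row)
    return response


def argmax_2d(response):
    """Return (row, col) of the highest value in the response map."""
    best = max(
        ((value, y, x) for y, row in enumerate(response) for x, value in enumerate(row)),
        key=lambda item: item[0],
    )
    return best[1], best[2]


# Toy example: the template pattern is embedded in the search region
# with its top-left corner at row 1, column 2.
template = [[1, 2],
            [3, 4]]
search = [[0, 0, 0, 0],
          [0, 0, 1, 2],
          [0, 0, 3, 4],
          [0, 0, 0, 0]]

response = cross_correlate(search, template)
peak = argmax_2d(response)  # location of the best match
```

The peak of the response map marks the window that best matches the template. Because this sketch uses an unnormalized correlation on raw values, uniformly bright regions could outscore the true match in real images; learned features reduce that problem in practice.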
Keywords/Search Tags: feature extraction, object tracking, object matching, location prediction