
Research on Unsupervised Object Tracking Algorithms Based on Forward-Backward Tracking Verification

Posted on: 2022-03-10
Degree: Master
Type: Thesis
Country: China
Candidate: X Han
GTID: 2518306740994429
Subject: Cyberspace security
Abstract/Summary:
Object tracking is an important task in computer vision, with wide applications in the autonomous flight of unmanned aerial vehicles, precision guidance, mobile robots, intelligent video surveillance, intelligent transportation systems, and so on. In recent years, most trackers have been trained with supervised learning, which requires a large amount of manually labeled data. The effectiveness of supervised tracking algorithms is therefore limited by the quantity and quality of the manual annotations. At the same time, producing a high-quality object tracking dataset incurs a high labor cost, and relying on a large amount of human experience for supervision does not match what people expect of artificial intelligence. To address these problems, this thesis focuses on unsupervised learning for object tracking and explores how a neural network can exploit natural data, so that the tracking algorithm can use natural video data and form its own understanding of that data. Unsupervised object tracking models are constructed based on the siamese network, the region proposal network, the video prediction network, and the idea of forward-backward tracking verification. The research content of this thesis mainly includes the following aspects.

The siamese network and the region proposal network have been widely used in the field of object detection. The SiamRPN algorithm treats object tracking as a frame-by-frame object detection task: the siamese network extracts the features of the target, and the region proposal network then locates and tracks it. However, SiamRPN is a supervised tracking algorithm that requires a large amount of manually annotated data. In this thesis, an unsupervised object tracking model is constructed that keeps the feature extraction and candidate target screening strengths of the SiamRPN structure and combines them with the idea of forward-backward tracking verification. Experiments on the OTB100 dataset show that the model has strong tracking ability, with an accuracy of 66.9% and a success rate of 42.1%, and can accurately predict the central position of the target. The model achieves results similar to those of supervised learning algorithms on five kinds of interference (illumination change, background clutter, motion blur, target disappearance, and low resolution) and has strong real-time performance.

Video prediction usually makes a full-pixel prediction of subsequent video frames from the current several frames. Previous work has realized subsequent-frame prediction on natural data without manual labels. Based on the video prediction module of MAST, this thesis constructs an unsupervised object tracking model using the idea of forward-backward tracking verification. Experiments on the OTB100 and DAVIS 2017 datasets show that the video prediction network can learn from the target the dynamic features that object tracking requires. The model has a certain tracking ability, and its predictions are more accurate for larger targets. It can accurately predict the center point of the target within a short time, and the mean difference between the predicted center point and the real center point is less than 20 pixels.

This thesis also analyzes the difference between video prediction and object tracking. When building an unsupervised tracking model based on video prediction, using the three channels of an RGB image as input introduces a large amount of redundant information, because the three channels are strongly correlated, and this greatly reduces the video prediction ability of the network. By setting an information bottleneck and selecting an appropriate color model and channels, the redundant information in the network input can be reduced while the key features of the image are retained, which improves the training effect of the model. The numerical distributions of the channels of three color models, RGB, HSV and Lab, are analyzed quantitatively, and a variety of channel selection methods for the different color models are trained and tested on MAST. The experimental results on the DAVIS 2017 dataset show that a color model whose channels are strongly correlated cannot achieve the desired effect when only one channel is retained during training to remove redundant information, so the training effect of the network declines. The RGB channels are strongly correlated, and the video prediction results are poor. The Lab and HSV channels are relatively independent, and the video prediction results are relatively good. The best prediction results are obtained by randomly using the a and b channels of the Lab model, with a contour accuracy of 64.2% and a similarity of 59.1%.
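The abstract does not spell out how forward-backward tracking verification is computed, so the following is only a minimal sketch of the general idea under stated assumptions: a target is tracked forward through a short clip, then tracked backward from where it ended up, and the distance between the returned position and the starting position measures how self-consistent the tracker is. The function name `forward_backward_error`, the `step` interface, and both toy trackers are hypothetical and are not taken from the thesis; frames are represented by their indices to keep the example self-contained.

```python
import math
from typing import Callable, Sequence, Tuple

Point = Tuple[float, float]
# Hypothetical tracker interface (an assumption, not the thesis's API):
# (source frame, destination frame, position in source) -> position in destination.
Step = Callable[[int, int, Point], Point]


def forward_backward_error(
    frames: Sequence[int], start: Point, step: Step
) -> Tuple[Point, float]:
    """Track forward through `frames`, then backward, and compare with `start`."""
    pos = start
    # Forward pass: first frame -> last frame.
    for src, dst in zip(frames, frames[1:]):
        pos = step(src, dst, pos)
    forward_end = pos

    # Backward pass: last frame -> first frame, starting from the forward result.
    reversed_frames = frames[::-1]
    for src, dst in zip(reversed_frames, reversed_frames[1:]):
        pos = step(src, dst, pos)

    # Euclidean distance between where we started and where we came back to.
    error = math.hypot(pos[0] - start[0], pos[1] - start[1])
    return forward_end, error


def consistent_step(src: int, dst: int, pos: Point) -> Point:
    # Toy tracker: follows a target moving 2 px per frame along x.
    return (pos[0] + 2.0 * (dst - src), pos[1])


def drifting_step(src: int, dst: int, pos: Point) -> Point:
    # Toy tracker with a constant 0.5 px bias per step, regardless of direction.
    return (pos[0] + 2.0 * (dst - src) + 0.5, pos[1])


frames = list(range(5))  # frame indices 0..4 stand in for real images
start = (10.0, 5.0)
_, consistent_error = forward_backward_error(frames, start, consistent_step)
_, drifting_error = forward_backward_error(frames, start, drifting_step)
print(consistent_error, drifting_error)  # prints: 0.0 4.0
```

Because the error needs no ground-truth annotation, a quantity of this kind can in principle act as a training signal on unlabeled video. Note that a small forward-backward error is a necessary sign of good tracking rather than a sufficient one: a tracker that never moves also returns to its starting point.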
Keywords/Search Tags: Object Tracking, Unsupervised Learning, Video Prediction