Visual Object Tracking Based On Few-shot Learning

Posted on: 2023-02-27
Degree: Master
Type: Thesis
Country: China
Candidate: L Hu
GTID: 2568306776478334
Subject: Engineering
Abstract/Summary:
Visual object tracking is an important research topic in computer vision and is widely used in many fields. Building a robust tracker involves many challenges. Deep-learning-based trackers are pre-trained on large datasets, which lets them initialize a model from the labeled first frame of a video and then predict the location and size of the target in subsequent frames. Because large-scale data drives the success of these trackers, building large tracking datasets has attracted growing attention. In practice, however, accurately labeling tracking data is labor-intensive and costly. This thesis combines few-shot learning with object tracking in two ways: it uses a few-shot learning method to improve tracker performance, and it designs a data augmentation algorithm for object tracking that can learn a robust tracker from only a few labeled samples. The main research contents are as follows:

(1) A real-time multi-domain tracking algorithm based on discriminative attention, R2D2-MDNet. To improve the fast domain adaptability of the real-time multi-domain tracking network (RT-MDNet) in the online tracking stage, the network structure is first optimized: a deeper backbone network and a finer RoI pooling layer are used to extract richer deep semantic features. Second, a ridge regression attention module is added, in which the few-shot learning algorithm R2D2 quickly solves for a discriminative channel attention vector that strengthens the separation between foreground and background features. The model needs fewer training iterations in the initialization and update stages, and tracking performance improves by 2.6 and 1.6 percentage points on the OTB-100 and UAV-123 benchmarks, respectively.

(2) Construction of a few-annotation tracking benchmark (FAT). FAT is built by randomly sampling one or more frames from each video in three large video tracking datasets: TrackingNet, GOT-10k and LaSOT. Compared with commonly used tracking training sets, FAT covers rich sample categories with fewer labeled images, so it needs less storage and is faster to share. Training a tracking model on FAT and evaluating it on the test set measures both how a tracking algorithm performs with little training data and how effective a new data augmentation method for tracking is.

(3) Design of AMMC, a data augmentation strategy suited to object tracking. AMMC uses augmentation methods to mimic object motion change. It first cuts out the tracked target and applies a sequence of transforms to simulate the changes that object motion may cause. The transformed targets are then pasted onto inpainted background images, and the composites are jointly augmented further to mimic the variability caused by camera motion. Compared with standard augmentation methods, AMMC explicitly considers the characteristics of tracking data and therefore synthesizes more valid data for object tracking. We extensively evaluate the approach with ATOM and DiMP on the FAT datasets. The experimental results show that, under the same experimental setting, the proposed method trained on only 2% of the training data matches or even exceeds the performance of a tracker trained on the fully labeled dataset.
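The frame sampling described in item (2) can be illustrated with a minimal sketch. This is not the thesis's actual construction code: the function name `sample_few_annotations`, the dictionary layout mapping a video name to its annotated frame indices, and the fixed seed are all assumptions made for illustration.

```python
import random
from typing import Dict, List, Optional


def sample_few_annotations(
    videos: Dict[str, List[int]],
    frames_per_video: int = 1,
    seed: Optional[int] = 0,
) -> Dict[str, List[int]]:
    """Randomly keep a few annotated frame indices from every video.

    `videos` maps a (hypothetical) video name to the indices of its
    annotated frames. Every video stays in the result, but with at most
    `frames_per_video` frames.
    """
    rng = random.Random(seed)  # local generator so results are repeatable
    subset: Dict[str, List[int]] = {}
    for name in sorted(videos):  # fixed iteration order for reproducibility
        frames = videos[name]
        k = min(frames_per_video, len(frames))  # short videos keep what they have
        subset[name] = sorted(rng.sample(frames, k))
    return subset


# Toy stand-in for a tracking dataset: three videos of different lengths.
toy = {
    "video_a": list(range(100)),
    "video_b": list(range(40)),
    "video_c": [0],
}
fat_like = sample_few_annotations(toy, frames_per_video=2)
```

Because every video contributes at least one frame, the subset covers the same videos as the full dataset while the number of labeled images drops sharply, which is the trade-off the abstract attributes to FAT.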
Keywords/Search Tags: Object Tracking, Few-Shot Learning, Data Augmentation, Few-Annotation Dataset