
Research On Visual Tracking Based On Deep Structural Feature Representation Learning

Posted on: 2021-04-14
Degree: Master
Type: Thesis
Country: China
Candidate: A J Zhou
GTID: 2428330620965591
Subject: Computer technology
Abstract/Summary:
Visual tracking aims to follow a given target through a video sequence. It is an active research topic in computer vision and has received wide attention. With the rapid development of deep learning, the strong feature-extraction ability of neural networks has driven great progress in visual tracking. However, visual tracking still faces many challenges, such as background noise, object deformation, fast motion, and illumination variation. To make the tracker more robust, this thesis studies feature representation learning for the target in deep neural networks. It proposes two methods: a visual tracking method based on weakly supervised feature representation learning, and a visual tracking method based on graph convolutional feature representation learning. Details are as follows.

First, to address the problems that the feature description of the target is not robust enough and that background noise causes model drift when the model is updated, we propose a visual tracking method based on weakly supervised feature representation learning. We first introduce class activation mapping, learn the most discriminative features within the target bounding box through weakly supervised localization, and obtain these features by computing a weight matrix. Meanwhile, the unimportant and trivial regions in the bounding box are suppressed, so we obtain both the original samples and samples with trivial regions suppressed. We then use two heterogeneous neural networks to learn heterogeneous features from the two kinds of samples, which makes the classifier more robust. This work also proposes a simple and effective data augmentation strategy: the weight matrix is used to identify the most discriminative features in the bounding box, and these features are masked so that the classifier pays more attention to the detailed features of the target. This strategy not only alleviates the imbalance between positive and negative samples but also improves the robustness of the classifier. Experiments on several public data sets verify the effectiveness of the proposed method.

Second, a visual tracking method based on graph convolutional feature representation learning is proposed. Drawing on graph convolutional neural networks, we first use the features extracted by a conventional convolutional neural network to construct an adjacency matrix, and then learn structured features of the target with a graph convolutional network to enhance the feature description of the target. Because heterogeneous features affect tracking results differently, an attention mechanism is used to adaptively select features, highlighting those that benefit the tracker. Experiments on the OTB-2015 and TC-128 data sets show that the proposed method performs better under the challenges of similar objects and object deformation.
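The two core operations described in the abstract can be illustrated with a minimal NumPy sketch. It is not the thesis implementation: the abstract gives no formulas, so the function names, the 0.7 masking threshold, the cosine-similarity adjacency, and the symmetric normalisation (borrowed from the commonly used graph convolution formulation) are all assumptions made for illustration. The first two functions compute a class-activation-style weight map over a feature tensor and mask its most discriminative locations; the last two build an adjacency matrix from CNN features and apply one graph convolution layer.

```python
import numpy as np


def class_activation_map(features, class_weights):
    """Weighted sum of feature channels, rescaled to [0, 1].

    features: (C, H, W) feature maps; class_weights: (C,) channel weights.
    Both inputs are assumed to come from a trained classifier.
    """
    cam = np.tensordot(class_weights, features, axes=([0], [0]))  # (H, W)
    cam = cam - cam.min()
    peak = cam.max()
    return cam / peak if peak > 0 else cam


def mask_most_discriminative(features, cam, threshold=0.7):
    """Zero out spatial locations whose activation reaches the threshold.

    The threshold value is a hypothetical choice, not taken from the thesis.
    """
    keep = (cam < threshold).astype(features.dtype)  # (H, W), 1 = keep
    return features * keep[None, :, :]


def similarity_adjacency(nodes):
    """Normalised adjacency from cosine similarity between node features.

    nodes: (N, D). Negative similarities are clipped to zero and self-loops
    are set to one, so every node has a positive degree.
    """
    unit = nodes / (np.linalg.norm(nodes, axis=1, keepdims=True) + 1e-8)
    adj = np.clip(unit @ unit.T, 0.0, None)
    np.fill_diagonal(adj, 1.0)
    d_inv_sqrt = 1.0 / np.sqrt(adj.sum(axis=1))
    return adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def graph_conv(nodes, adj_norm, weight):
    """One graph convolution layer: ReLU(A_norm @ X @ W)."""
    return np.maximum(adj_norm @ nodes @ weight, 0.0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.random((8, 5, 5))            # stand-in for CNN features
    cam = class_activation_map(feats, rng.random(8))
    augmented = mask_most_discriminative(feats, cam)

    nodes = feats.reshape(8, -1).T           # each spatial location is a node
    adj = similarity_adjacency(nodes)
    structured = graph_conv(nodes, adj, rng.standard_normal((8, 4)))
    print(augmented.shape, structured.shape)
```

In this sketch every spatial location of the feature map is treated as a graph node. That is one plausible reading of "constructing an adjacency matrix from CNN features", and other node definitions would work the same way.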
Keywords/Search Tags:Visual tracking, Weakly supervised localization, Attention mechanism, Data augmentation, Graph convolutional network