
Research On Target Tracking Algorithm Based On Fully Convolutional Siamese Network

Posted on: 2021-04-24
Degree: Master
Type: Thesis
Country: China
Candidate: K Yang
Full Text: PDF
GTID: 2518306452474044
Subject: Control Engineering
Abstract/Summary:
Although tracking algorithms based on correlation filtering and deep learning have made great progress in recent years, they still cannot deliver high tracking speed and high tracking accuracy at the same time, which makes them difficult to apply in practical scenarios. For arbitrary object tracking, the traditional Siamese network tracking algorithm has limited discriminative capability because its shallow network has limited representational power. To address these issues, this thesis studies tracking algorithms based on the fully convolutional Siamese network. The main contributions are summarized as follows:

(1) First, the backbone is a modified VGG network that is more expressive and better suited to the target tracking task. A novel dual attention mechanism is then applied to the middle layer of the network to extract features dynamically, retaining useful information and suppressing interference to improve the discriminative ability of the model. Next, a deeper network is combined with a shallow one to take full advantage of features from different layers, and spatial and channel-wise attention is applied on different layers to better capture visual attention across multi-level semantic abstractions, which further enhances the discriminative capacity of the model. Furthermore, the top-layer feature maps have low resolution, which may affect localization accuracy if each feature is treated independently. To address this, a non-local attention module is adopted on the top layer to force the network to pay more attention to the structural dependency of features at all locations during off-line training. Extensive evaluations on the OTB and VOT real-time experiments demonstrate that the tracker achieves favorable results while running at 60 fps on a 1080 Ti GPU. With its high accuracy and real-time speed, the tracker can be applied to numerous vision applications.

(2) None of the above methods uses more advanced network architectures such as ResNet and Inception, and straightforwardly substituting them can even cause substantial performance drops. The main reason is that zero-padding in the convolutions violates the fully convolutional property and induces a positional bias in learning. To address these issues, this thesis uses cropping-inside residual units to cut out the features affected by zero-padding, and proposes a lightweight yet effective feature agglomeration module (FAM) to adaptively fuse low-level and high-level features for robust tracking. Extensive evaluations on the OTB and VOT challenges demonstrate that the proposed tracker consistently achieves favorable performance against several state-of-the-art trackers and runs at 50 fps.
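The zero-padding issue in contribution (2) can be illustrated with a small example. The following is a minimal sketch in plain Python, assuming a single 3x3 convolution with stride 1 and one ring of zero-padding; the helper names `conv2d` and `crop_inside` are hypothetical and are not taken from the thesis, whose actual residual units are not specified in this abstract. It shows that the outermost ring of a padded output is influenced by the padding, and that cropping this ring recovers exactly the output of an unpadded convolution.

```python
def conv2d(image, kernel, pad=0):
    """Plain 2-D cross-correlation, stride 1, with optional zero-padding."""
    k = len(kernel)
    h, w = len(image), len(image[0])
    # Build the zero-padded input.
    padded = [[0.0] * (w + 2 * pad) for _ in range(h + 2 * pad)]
    for i in range(h):
        for j in range(w):
            padded[i + pad][j + pad] = image[i][j]
    out_h = h + 2 * pad - k + 1
    out_w = w + 2 * pad - k + 1
    return [
        [
            sum(padded[i + a][j + b] * kernel[a][b]
                for a in range(k) for b in range(k))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def crop_inside(feature, border=1):
    """Drop `border` rows and columns on each side (the padding-affected ring)."""
    return [row[border:-border] for row in feature[border:-border]]


image = [[1.0] * 5 for _ in range(5)]      # constant 5x5 input
kernel = [[1.0] * 3 for _ in range(3)]     # 3x3 box filter

padded_out = conv2d(image, kernel, pad=1)  # 5x5; border values are affected by the zeros
cropped = crop_inside(padded_out)          # 3x3; padding-affected ring removed
valid_out = conv2d(image, kernel, pad=0)   # 3x3 reference computed without padding
```

On a constant input, every position should produce the same response; in `padded_out` the border responses are lower than the interior ones, so the response depends on position rather than content. After cropping, `cropped` matches `valid_out`, which is the property the cropping step is meant to restore.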
Keywords/Search Tags: Visual tracking, Deep learning, Siamese networks, Attention mechanism