
Research On Visual Object Tracking Methods Based On Siamese Networks

Posted on: 2021-08-04
Degree: Master
Type: Thesis
Country: China
Candidate: Z Wu
Full Text: PDF
GTID: 2518306050471614
Subject: Signal and Information Processing
Abstract/Summary:
With the rapid development of information technology, the way information is acquired is gradually shifting from two-dimensional images to video sequences. As a branch of video object processing, visual object tracking has important application value in military, industrial, and civil fields. Compared with other computer vision tasks, object tracking faces additional challenges: not only complex scenarios but also demanding real-time requirements. Traditional object tracking algorithms are hard to put into practice because they handle only simple scenes and offer low accuracy and poor real-time performance. In recent years, visual object tracking based on deep learning has attracted increasing attention because of its good performance. Among deep learning tracking algorithms, the Siamese Network is one of the most typical and successful representatives. A Siamese Network learns a similarity measure function by training the tracker off-line and does not update the network weights during inference, which significantly improves real-time performance and has led to strong results in recent years.

This thesis focuses on visual object tracking algorithms based on Siamese Networks. Building on the Fully-Convolutional Siamese Network (SiamFC), and targeting its inaccurate bounding box localization, weak generalization ability, and ambiguous semantic information, the thesis improves SiamFC and analyzes the interpretability of the semantic information in the Siamese Network. The main work and research results are as follows:

1. To address the inaccurate localization of the bounding box, five different loss functions and three different mask matrices are proposed. The thesis trains SiamFC on the VID dataset and evaluates it on the VOT2016 and OTB100 datasets. Experimental results show that Smooth L1 Loss combined with a rectangular mask matrix helps locate the bounding box of the target, and that the Combined Loss designed in this thesis helps address the imbalance in the sample distribution.

2. To address the weak generalization ability and low tracking accuracy of SiamFC, two kinds of backbone networks and attention mechanisms are used. To examine the effectiveness of deep residual networks and attention mechanisms in the Siamese Network, the thesis uses the AlexNet-5 and CIResNet22 structures as backbone networks and adds SE-Block and CBAM-Block attention modules, respectively, to the backbone network. Experimental results show that the attention mechanism effectively improves the generalization performance of SiamFC when added to a deep residual network. In addition, the generalization performance of SiamFC can also be improved by using a deep residual network as the backbone and by increasing the training set size.

3. To address the ambiguous semantic information of the Siamese Network, the thesis analyzes its interpretability by visualizing the neural network and by quantifying features. For visualization, the thesis visualizes and analyzes the feature maps of each layer in SiamFC. For feature quantification, the thesis analyzes the response distribution of the Siamese Network to different kinds of video sequences on the VID and GOT10K datasets. Experimental results show that specific channels of the Siamese Network respond distinctively to targets of a specific category. In addition, different datasets, different target classification standards, and different data distributions may also affect the channel response of the Siamese Network.

To address the inaccurate bounding box localization, weak generalization ability, and ambiguous semantic information of SiamFC, the thesis proposes detailed solutions and analyzes the semantic information contained in SiamFC. Experimental results show that the proposed algorithm effectively improves the localization accuracy and generalization ability of SiamFC. Moreover, by analyzing the interpretability of the Siamese Network, this thesis lays a foundation for interpretability research in the field of visual object tracking.
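
The similarity measure that a SiamFC-style tracker evaluates at inference time can be illustrated with a minimal sketch. This is not the thesis's implementation: it assumes single-channel toy feature maps in place of the multi-channel backbone features, and the function names `cross_correlate` and `peak_location` are hypothetical. It only shows the general idea of sliding the exemplar (target template) features over the search-region features and taking the peak of the resulting response map as the target position.

```python
def cross_correlate(search, exemplar):
    """Slide `exemplar` over `search` and return the map of dot-product scores.

    Both inputs are 2-D lists standing in for feature maps (an assumption:
    a real tracker would use multi-channel features from a shared backbone).
    """
    sh, sw = len(search), len(search[0])
    eh, ew = len(exemplar), len(exemplar[0])
    response = []
    for top in range(sh - eh + 1):
        row = []
        for left in range(sw - ew + 1):
            # Similarity at this offset: sum of element-wise products.
            score = sum(
                search[top + i][left + j] * exemplar[i][j]
                for i in range(eh)
                for j in range(ew)
            )
            row.append(score)
        response.append(row)
    return response


def peak_location(response):
    """Return (row, col) of the highest score in the response map."""
    best = max(
        (score, r, c)
        for r, row in enumerate(response)
        for c, score in enumerate(row)
    )
    return best[1], best[2]


# Toy data: the exemplar pattern appears in the search map at offset (1, 2).
exemplar = [[1.0, 2.0], [3.0, 4.0]]
search = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 2.0],
    [0.0, 0.0, 3.0, 4.0],
    [0.0, 0.0, 0.0, 0.0],
]
response = cross_correlate(search, exemplar)
row, col = peak_location(response)
```

Because the weights that produce the features stay fixed during tracking, each frame costs only one forward pass plus this correlation, which is consistent with the real-time advantage described in the abstract.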
Keywords/Search Tags: Deep Learning, Siamese Network, Visual Object Tracking, Attention Mechanism, Interpretability