Research On Regression Loss Function And Visual Tracking Based On SiamRPN Network

Posted on: 2022-10-11
Degree: Master
Type: Thesis
Country: China
Candidate: Z H Huang
Full Text: PDF
GTID: 2518306488493584
Subject: Control Science and Engineering
Abstract/Summary:
Video object tracking is an important research topic in computer vision and is widely used in visual navigation, video surveillance, human-computer interaction, and medical diagnosis. In real video scenes, however, challenges such as occlusion, illumination change, and background interference make it difficult for a tracking algorithm to balance real-time performance against accuracy. In recent years, with the rapid development of pattern recognition and machine learning, visual object tracking algorithms have sought to adapt to changes in object appearance by exploiting the efficient feature learning ability of deep networks, and deep-learning-based trackers have achieved many outstanding results. Among them, SiamRPN (Siamese Region Proposal Network) treats tracking as a similarity comparison between the template frame and the detection frame. It maintains tracking speed while improving accuracy, which opens a new opportunity for optimizing deep models for target tracking. Based on SiamRPN, this thesis therefore focuses on improving target tracking performance. The main contents are as follows:
(1) To address the problem that the smooth L1 regression loss of SiamRPN cannot handle the vanishing gradient when the bounding boxes do not overlap, the Intersection over Union (IOU) regression method is introduced to calculate the overlap ratio between the target box and the prediction box. The tracking results of bounding box regression using the Generalized Intersection over Union (GIOU) and the Distance Intersection over Union (DIOU) are analyzed. Both avoid the failure of gradient back-propagation that occurs when the loss function cannot be differentiated for non-overlapping bounding boxes, and both are scale-invariant. In addition, the deep residual network ResNet-50 replaces the shallow AlexNet backbone of SiamRPN for feature extraction. By deepening the network, fusing low-level spatial features with high-level semantic information, and using both high-resolution and low-resolution feature maps to learn the target, candidate target regions can be generated quickly. Experiments on the OTB2015 dataset show that the improved method achieves a measurable gain in tracking performance.
(2) An improved Multiple Intersection over Union (MIOU) regression loss function is proposed, in which the bounding box regression loss is calculated from the overlap area, the center distance ratio, and the aspect ratio between the target box and the prediction box. In this thesis, the aspect ratio term is redefined in terms of the angle of the diagonal and the area of the bounding box. As a result, the loss not only aligns the scale of the prediction box with that of the target box, but also avoids both gradient explosion and the problem of width and height being optimized in opposite directions. It also reduces the number of iterations needed in training. In the experiments, the ILSVRC-VID training dataset is used to train the deep tracking model, which is evaluated on the OTB2015 and VOT2016 benchmarks. The experiments show that the proposed method handles challenging attributes such as occlusion, fast motion, motion blur, illumination variation, and scale variation effectively and yields better tracking results.
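The overlap-based regression terms discussed in (1) can be illustrated with a minimal Python sketch. It is not the thesis implementation: it uses the commonly published definitions GIOU = IOU - (|C| - |U|) / |C| and DIOU = IOU - d^2 / c^2, where C is the smallest enclosing box, U the union, d the distance between box centers, and c the diagonal of C. The box format `(x1, y1, x2, y2)` and all function names are assumptions made for illustration, and boxes are assumed to have positive area. MIOU is not shown because the abstract does not give its formula.

```python
def _intersection_and_union(a, b):
    """Return (intersection area, union area) of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter, area_a + area_b - inter


def iou(a, b):
    """Plain overlap ratio; it is 0 for every pair of disjoint boxes."""
    inter, union = _intersection_and_union(a, b)
    return inter / union


def giou(a, b):
    """IOU minus the fraction of the enclosing box not covered by the union."""
    inter, union = _intersection_and_union(a, b)
    enclose_w = max(a[2], b[2]) - min(a[0], b[0])
    enclose_h = max(a[3], b[3]) - min(a[1], b[1])
    enclose_area = enclose_w * enclose_h
    return inter / union - (enclose_area - union) / enclose_area


def diou(a, b):
    """IOU minus squared center distance over squared enclosing diagonal."""
    inter, union = _intersection_and_union(a, b)
    # Squared distance between the two box centers.
    dx = (a[0] + a[2]) / 2.0 - (b[0] + b[2]) / 2.0
    dy = (a[1] + a[3]) / 2.0 - (b[1] + b[3]) / 2.0
    center_dist_sq = dx * dx + dy * dy
    # Squared diagonal of the smallest box enclosing both inputs.
    enclose_w = max(a[2], b[2]) - min(a[0], b[0])
    enclose_h = max(a[3], b[3]) - min(a[1], b[1])
    diag_sq = enclose_w * enclose_w + enclose_h * enclose_h
    return inter / union - center_dist_sq / diag_sq


def regression_loss(metric, pred, target):
    """Loss of the form 1 - metric, usable with iou, giou, or diou."""
    return 1.0 - metric(pred, target)


if __name__ == "__main__":
    target = (0.0, 0.0, 1.0, 1.0)
    near = (2.0, 0.0, 3.0, 1.0)  # disjoint from target, close
    far = (4.0, 0.0, 5.0, 1.0)   # disjoint from target, farther away
    for name, metric in (("IOU", iou), ("GIOU", giou), ("DIOU", diou)):
        print(name,
              regression_loss(metric, near, target),
              regression_loss(metric, far, target))
```

For the two disjoint predictions in the usage example, the IOU loss is identical (the overlap is zero in both cases), so it gives no signal about which prediction is closer to the target. The GIOU and DIOU losses are larger for the farther box, which is what lets them keep guiding the regression when the boxes do not overlap.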
Keywords/Search Tags: target tracking, deep learning, SiamRPN, Multiple Intersection over Union (MIOU), loss function