Visual Object Tracking Via Deep Learning And Regression Model

Posted on: 2019-11-28
Degree: Doctor
Type: Dissertation
Country: China
Candidate: K Chen
GTID: 1368330548955288
Subject: Control Science and Engineering
Abstract/Summary:
In recent years, visual object tracking has become one of the most important research topics in computer vision. Common video-based applications such as video surveillance, human-computer interaction and UAV control all require an accurate, efficient and robust visual tracking algorithm. Given the bounding box of an initial object to be tracked, a visual tracking algorithm must predict the location and size of the object in the subsequent frames. Unlike in traditional visual tasks, the positive samples available for learning a discriminative model are quite limited. The object appearance can also be affected by many factors, such as illumination variation, occlusion, rotation and deformation. To deal with these variations, a visual tracking algorithm with high discriminative ability is needed. In view of these two difficulties, few positive samples and frequent variations in object appearance, the dissertation proposes four visual tracking algorithms based on regression models and deep learning, each addressing a specific problem in visual tracking.

First, the dissertation proposes a regression model based on structural correlation filters for visual tracking. The original correlation filter model takes the whole object as input. When the object is occluded or has undergone severe deformation, the algorithm cannot track it accurately. To solve these problems, the dissertation proposes to divide the object into multiple image patches and adaptively adjust their weights, so that occluded parts contribute less to the predicted object location. The dissertation also proposes an object-background histogram model to enhance the object representation when it undergoes severe deformation. With these two methods, the performance of the proposed algorithm is significantly improved.

Second, inspired by the popular deep convolutional neural network, the dissertation proposes to combine deep learning with the regression model and to use a single-layer convolutional network to solve the regression model. The algorithm uses the forward propagation of the convolutional layer to compute the regression output for the samples, and iteratively optimizes the regression coefficients by backpropagating the training errors. Unlike the traditional regression model based on cyclically shifted samples, the samples in the proposed algorithm are all extracted through a sliding window and do not include additional background information, which can significantly improve the discriminative ability of the regression model. To deal with the imbalance between positive and negative samples during training, the dissertation also proposes a new weighted truncation loss function, which improves the robustness and convergence speed of the model.

Third, regression models that are trained only with holistic object patches cannot handle deformation and occlusion of the object, and cannot efficiently predict the object size. The dissertation therefore designs a hierarchical convolutional regression model for locating the object and predicting its scale. The hierarchical model includes a global regression model for object location and a texture regression model for predicting the object foreground. In addition, the dissertation proposes a Bayesian model based on maximum a posteriori estimation to predict the object size directly from the object foreground map. The proposed size estimation method significantly reduces the computational complexity of the algorithm and improves its running speed.

Fourth, the above three tracking algorithms are all based on supervised regression models, so the model parameters must be updated online during tracking, which significantly reduces the running speed. To address this problem, the dissertation proposes an object tracking algorithm based on a two-channel convolutional neural network. The convolutional model receives two image patches as input and outputs a two-dimensional heat map that indicates the most likely position of the object in the image. In this way, the visual object tracking problem is reformulated as a similarity measurement problem. Unlike most existing tracking algorithms, this model needs only a single offline training stage and can then track any object without additional online updates. As a result, the running speed of the algorithm can be very high. Extensive experiments show that the algorithm can run at up to 45 frames per second while achieving competitive tracking performance.

The four tracking algorithms proposed in the dissertation start from four different perspectives and combine the regression model with deep learning techniques to solve specific problems in visual object tracking. The dissertation also reports extensive comparative experiments on several popular visual object tracking datasets. The experimental results show that the four proposed algorithms achieve outstanding tracking performance.
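
For readers unfamiliar with the baseline that the first algorithm builds on, the following is a minimal sketch of a generic single-channel linear correlation filter, i.e. a regularised least-squares regression over cyclic shifts of one patch, solved per frequency in the Fourier domain. It is not the dissertation's structural or patch-weighted model. The function names (`gaussian_label`, `train_filter`, `detect`), the patch size, the regularisation value `lam` and the random test patch are all assumptions chosen for illustration.

```python
import numpy as np


def gaussian_label(shape, sigma=2.0):
    """Desired regression target: a 2-D Gaussian peaked at the patch centre."""
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    cy, cx = h // 2, w // 2
    return np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2.0 * sigma ** 2))


def train_filter(x, y, lam=1e-3):
    """Regularised least-squares filter, solved independently per frequency.

    Returns the (conjugate) filter in the Fourier domain, so that
    fft2(x) * filter is approximately fft2(y).
    """
    X = np.fft.fft2(x)
    Y = np.fft.fft2(y)
    return Y * np.conj(X) / (X * np.conj(X) + lam)


def detect(filt, z):
    """Apply the filter to a search patch z and return the response peak (row, col)."""
    Z = np.fft.fft2(z)
    response = np.real(np.fft.ifft2(Z * filt))
    peak = np.unravel_index(np.argmax(response), response.shape)
    return tuple(int(i) for i in peak)


# Illustrative data only: a random "object" patch and a cyclically shifted copy.
rng = np.random.default_rng(0)
x = rng.standard_normal((32, 32))
filt = train_filter(x, gaussian_label(x.shape))

z = np.roll(x, shift=(3, 5), axis=(0, 1))  # object moved 3 rows down, 5 columns right
peak = detect(filt, z)
dy, dx = peak[0] - 16, peak[1] - 16        # displacement relative to the patch centre
print("estimated displacement:", (dy, dx))
```

Because this filter regresses from the whole patch at once, an occluded or deformed region corrupts the entire response map, which is the weakness that the patch-based weighting described above is meant to reduce.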
Keywords/Search Tags: Visual Object Tracking, Regression Model, Deep Learning, Kernelized Correlation Filters, Convolutional Regression, Siamese Networks