
Model Design And Optimization Of Deep Learning For Visual Tracking

Posted on: 2018-03-26
Degree: Doctor
Type: Dissertation
Country: China
Candidate: D Hu
GTID: 1368330563496306
Subject: Computer Science and Technology
Abstract/Summary:
Visual tracking is one of the most active research topics in computer vision, with applications ranging from civilian uses such as intelligent video surveillance and human-machine interaction to military uses such as precision weapon and unmanned aerial vehicle guidance. Although visual tracking has been studied for a long time, tracking a dynamically changing target continuously and stably over long periods in complex real-world environments, that is, robust visual tracking, remains a challenging task. The main difficulties are changes in the target itself during tracking (such as changes in shape and pose) and the influence of the external environment (illumination, scale, occlusion, and so on) on the target's appearance. Building the target model is therefore an important foundation for robust tracking.

In recent years, feature extraction based on deep learning has attracted great attention from both academia and industry. Through layer-by-layer nonlinear transformations, it automatically learns multi-layer representations that reflect the nature of the data, and it offers strong learning ability and effective feature expression. Compared with hand-crafted features, which rely on the designer's prior knowledge, deep learning target models have stronger adaptability and generalization, and they have been very successful in image recognition. Because visual tracking resembles image recognition in target modeling and search, existing research has directly applied deep networks pre-trained offline to online tracking tasks and achieved better tracking results. Considering the applicability of different models to pre-training datasets with or without label information, as well as the performance requirements of robust visual tracking, this dissertation studies the design and optimization of deep learning models for tracking. Drawing on visual tracking knowledge, it designs the deep model structure, the network training mechanism, and the way deep features are used, for both supervised and unsupervised learning models. It then studies optimization techniques for applying the models, in terms of accuracy and timeliness. The main work and results are as follows.

(1) A visual tracking framework based on deep learning was constructed, and the key factors affecting the deep model were systematically analyzed. Features learned from the rich semantic information in a large sample space are more general. On this basis, a framework was constructed in which a deep learning model trained offline on a large-scale dataset is used for the online tracking task. With respect to feature learning ability, the key techniques and parameters affecting deep network performance were analyzed from four aspects: activation function, regularization, data augmentation, and network depth. This analysis provides a theoretical basis for the design of deep learning models.

(2) A visual tracking method based on multi-layer fusion of a convolutional neural network (CNN) was proposed. It combines a generative model with a discriminative model to improve tracking performance, especially adaptability to target occlusion. The generative model is based on the fusion of CNN convolutional layers, and fragment-based adaptive principal component analysis is used to detect target occlusion. The discriminative model exploits the global feature extraction and invariance of the CNN fully-connected layers to distinguish foreground from background effectively, and it improves classification accuracy in complex environments by jointly optimizing the feature representation and the classifier. Experiments on the Object Tracking Benchmark (OTB) showed that, compared with HCFT, an existing algorithm with good tracking performance, the proposed method increased tracking accuracy and success rate by 45.49% and 37.16%, respectively, and by 6.04% and 6.60%, respectively, in occlusion scenes.

(3) A visual tracking method based on a local convolutional restricted Boltzmann machine (RBM) and clustering-driven fine-tuning was proposed. Because the fully-connected structure of the RBM cannot exploit the structural information of images, local convolution and probabilistic pooling were introduced into the RBM model to reduce redundant parameters and simplify the model structure. To make full use of the category information distributed in the unlabeled sample space, network training was optimized through clustering analysis of the extracted high-layer features. Experiments on OTB showed that, compared with DLT, a visual tracking algorithm based on an unsupervised deep model, the proposed method not only significantly reduced the redundant parameters of the model but also increased tracking accuracy and success rate by 45.49% and 37.16%, respectively.

(4) A visual tracking optimization method based on ensemble learning of deep neural networks was proposed to further increase tracking accuracy. Training multiple deep networks on the small sample set available in online tracking reduces the diversity among the models, so a multi-network collaborative training method based on negative correlation was put forward to increase the differences between individual networks. Moreover, because the AdaBoost framework is sensitive to noise, the rules and methods for updating sample weights were modified to improve noise robustness. Experimental results on OTB showed that the ensemble method increased tracking accuracy and success rate by 2.59% and 4.03%, respectively, relative to the single-model algorithm in this dissertation.

(5) A visual tracking optimization method based on deep hash learning was proposed to further improve the timeliness of tracking. Compact, content-sensitive, and robust hash codes were obtained by jointly optimizing the CNN feature representation and the binary coding with similarity control, balance maintenance, and independence maximization. Experiments on OTB showed that the proposed method not only maintained tracking accuracy but also raised the tracking speed to 2.22 times that of tracking directly on the deep features. Moreover, compared with traditional hash methods based on hand-crafted features, the proposed method increased tracking precision and tracking accuracy rate by 9.84% and 4.14%, respectively.
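As a rough illustration of why binary hash codes can speed up the matching step in item (5), the sketch below compares a target code with candidate codes by Hamming distance, which needs only an XOR and a bit count per candidate. It is a minimal sketch under stated assumptions, not the dissertation's method: the function names (`binarize`, `hamming_distance`, `best_candidate`) and the fixed per-bit thresholds are hypothetical, and the jointly learned CNN features and hash layer are replaced here by plain lists of numbers.

```python
from typing import List, Sequence


def binarize(features: Sequence[float], thresholds: Sequence[float]) -> int:
    """Pack a real-valued feature vector into an integer hash code.

    Bit i is 1 when features[i] exceeds thresholds[i]. The thresholds are an
    assumption of this sketch; a learned hash layer would produce the bits.
    """
    if len(features) != len(thresholds):
        raise ValueError("features and thresholds must have equal length")
    code = 0
    for value, threshold in zip(features, thresholds):
        code = (code << 1) | (1 if value > threshold else 0)
    return code


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two hash codes differ."""
    return bin(a ^ b).count("1")


def best_candidate(target_code: int, candidate_codes: List[int]) -> int:
    """Return the index of the candidate closest to the target code.

    Ties are resolved in favor of the earliest candidate.
    """
    if not candidate_codes:
        raise ValueError("candidate_codes must not be empty")
    return min(
        range(len(candidate_codes)),
        key=lambda i: hamming_distance(target_code, candidate_codes[i]),
    )


if __name__ == "__main__":
    # Hypothetical 8-dimensional features for the target and three candidates.
    thresholds = [0.0] * 8
    target = binarize([0.9, -0.2, 0.4, -0.7, 0.1, 0.3, -0.5, 0.8], thresholds)
    candidates = [
        binarize([-0.9, 0.2, -0.4, 0.7, -0.1, -0.3, 0.5, -0.8], thresholds),
        binarize([0.8, -0.1, 0.5, -0.6, -0.2, 0.2, -0.4, 0.9], thresholds),
        binarize([0.7, 0.3, 0.2, -0.1, 0.4, -0.6, -0.9, 0.5], thresholds),
    ]
    print(best_candidate(target, candidates))
```

Comparing integer codes this way avoids floating-point distance computations over high-dimensional deep features, which is the general reason hash-based matching tends to be faster. How much speed is gained in practice depends on the code length and on how the codes are learned.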
Keywords/Search Tags: Deep learning, Visual tracking, Convolutional neural network, Restricted Boltzmann machine, Ensemble learning, Hash learning