Visual object tracking is one of the fundamental research problems in computer vision. Its purpose is to track and mark objects of interest in a continuous video sequence and to output their position and scale. With the rapid development of deep learning, object tracking is widely used in intelligent transportation, autonomous driving, video surveillance, and other fields, so it has both scientific research value and practical application value. However, visual object tracking still faces great challenges, such as the wide variety of targets and complex tracking environments, which greatly reduce tracking efficiency. How to design a robust object tracking algorithm that tracks accurately in complex environments therefore remains an open problem. To improve the robustness of object tracking in complex environments, this dissertation uses deep reinforcement learning to design and optimize the motion module, appearance module, and update module of object tracking, and proposes three novel object tracking algorithms. The three algorithms address three aspects: expanding the space of tracking action selection, improving the ability to distinguish the object, and improving the update efficiency of the tracking algorithm. Together they effectively improve tracking robustness in complex environments. The main research work and innovations are as follows:

(1) For the motion module of the object tracking algorithm, an adaptive exploration network with a variance reduced network (AEVRNet) is proposed. Based on the optimization of the motion module, nonconvex optimization accelerates convergence and an adaptive action exploration strategy expands the action search space. First, inspired by the upper confidence bound (UCB), this dissertation designs an adaptive exploration strategy that uses temporal and spatial knowledge to explore effective actions and escape local
optima. Second, the tracking problem is formulated as a nonconvex problem, and nonconvex optimization is introduced into the stochastic variance reduced gradient (SVRG) method used for backpropagation, which makes the tracking algorithm converge faster and lowers the loss. Finally, a novel regression-based action-reward loss function is defined, which is more sensitive to the object state and retains more object features. The experimental results show that the algorithm effectively improves the search efficiency of the motion module.

(2) For the appearance module of the object tracking algorithm, a robust real-time deep reinforcement learning tracking algorithm, Noisy OTNet (Noisy Object Tracking Network), is proposed. Based on the optimization of the appearance module, the tracking problem is formulated as a deep reinforcement learning problem with parameter space noise. First, a noisy network based on the deep deterministic policy gradient (DDPG) with parameter noise is designed to better match the object tracking task and predict tracking results directly. Second, to improve tracking accuracy under complex conditions such as fast motion and deformation, this dissertation proposes an adaptive update strategy that obtains spatiotemporal information about the target based on the upper confidence bound (UCB) algorithm and improves the efficiency of model updating. In addition, a new algorithm based on incremental learning is designed to recover lost targets. As a result, the capability of the tracking algorithm is improved and robust tracking is achieved in complex environments. The experimental results show that the algorithm effectively improves the performance of the appearance module.

(3) For the update module of the object tracking algorithm, a new deep reinforcement learning tracking algorithm, STKTMM (Student Teacher Knowledge Transfer Based Multi-Task Multi-Model Tracker), is proposed. Based on the optimization of the update module, the tracking
performance of the model on different tasks is improved by using task-specific domain knowledge. Different teacher models are designed and trained offline on their corresponding tasks. The teacher model for a specific task guides the online tracking student model on that task and improves the student model's ability to distinguish the object. In addition, a multi-buffer strategy is designed to prevent the student tracker from forgetting old knowledge when learning new knowledge. Finally, an adaptive online tracking update method based on knowledge transfer is proposed: the online student model is updated using the network parameters of both the teacher model and the online student model, which improves the efficiency of updating the student model. The experimental results show that the algorithm effectively improves the update efficiency of the update module.
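
Both the adaptive exploration strategy in (1) and the adaptive update strategy in (2) are described as building on the upper confidence bound (UCB). As a rough illustration only, the following Python sketch shows a standard UCB1-style selection rule over a small discrete action set. It is not the dissertation's implementation: the function name `ucb_select`, the exploration weight `c`, and the example values are hypothetical and chosen purely for demonstration.

```python
import math


def ucb_select(mean_rewards, counts, step, c=1.0):
    """Pick an action index with a UCB1-style rule (illustrative sketch).

    mean_rewards: average reward observed so far for each action
    counts:       number of times each action has been tried
    step:         total number of selections made so far (>= 1)
    c:            hypothetical exploration weight (assumption, not from the source)
    """
    # Any action that has never been tried is selected first.
    for index, tries in enumerate(counts):
        if tries == 0:
            return index

    # Score = exploitation term + exploration bonus that shrinks with more tries.
    scores = [
        mean + c * math.sqrt(2.0 * math.log(step) / tries)
        for mean, tries in zip(mean_rewards, counts)
    ]
    return max(range(len(scores)), key=scores.__getitem__)


if __name__ == "__main__":
    # Action 1 has a slightly higher mean, but action 0 was tried only once,
    # so its exploration bonus dominates.
    print(ucb_select([0.5, 0.6], [1, 100], step=101))
```

Rarely tried actions receive a larger bonus, which is how UCB-style rules encourage a tracker to explore beyond the action that currently looks best.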