| With the rapid development of video satellite technology in recent years,satellite videos captured by video satellites have become a research scene in the field of object tracking.Satellite videos can monitor dynamic targets through continuous observation of the same area on the ground.Satellite video object tracking has broad application prospects,such as real-time monitoring of ground traffic,rapid response to natural disasters,navigation security and military operations.Satellite videos have the characteristics of low picture contrast and definition,sparse target features and textures,and high similarity between target and background,which bring a lot of challenges to satellite video object tracking.Most of the current satellite video tracking methods use traditional hand-crafted features,which cannot well cope with the challenges brought by satellite videos.In recent years,deep learning has shown great advantages in ordinary video object tracking with its powerful feature extraction ability,but its application in satellite video object tracking is still less.By studying the characteristics of satellite videos,this paper uses the powerful feature representation ability of deep learning to construct a two-stream convolutional neural network,which is composed of a deep Siamese network and a deep motion regression network,to utilize the appearance features and motion features respectively to achieve robust object tracking.First,the initial tracking results are obtained by using the appearance features and a simple and efficient Siamese framework.The motion features utilized by the deep motion regression network can greatly improve the algorithm’s ability to distinguish similar objects and make up for the lack of the appearance features.In this paper,an adaptive fusion strategy is used to effectively fuse the tracking results obtained by using the appearance features and motion features respectively.In addition,this paper also proposes a trajectory fitting motion model to use the historical trajectory of the target to further alleviate model drift and improve the robustness of tracking.The trajectory fitting motion model uses the long-term historical trajectory of the target to capture the motion pattern of the target and form the motion constraints of the target,so as to predict the position of the target more accurately.This paper also adopts an adaptive fusion strategy to effectively fuse the predicted results from the trajectory fitting motion model and the whole neural network to achieve robust and accurate object tracking.In this paper,a satellite video dataset for object tracking with a large number of targets and many target categories is constructed.Comparisons between the proposed algorithm and other representative algorithms prove that the proposed algorithm has superior tracking performance in satellite videos;Ablation analysis experiments of each part of the proposed algorithm verify the effectiveness of each part of our algorithm and demonstrate the contribution of each part to our algorithm. |