| Visual object tracking is an important and challenging research topic in computer vision. Given the state of a target in the first frame of a video, a tracker predicts the target's motion state in the subsequent frames. Because of its wide range of applications (video surveillance, augmented reality, smart cities, intelligent driving, etc.), object tracking has received much attention both within and outside the industry. With improvements in computer hardware and the release of large-scale datasets, object tracking algorithms with superior performance continue to emerge. However, realistic tracking scenarios are often more complex, object tracking still faces many challenges, and a robust tracking algorithm remains an important goal. The challenges come from both the target itself and the complex, changing environment around it. In real-world scenarios, the size and shape of the target change constantly, and the target can be subject to motion blur or changes in lighting. Occlusion between the camera and the target and the low resolution of the captured images make tracking even more challenging. To address these problems, this paper studies the theory of deep learning and Siamese networks, examines the frameworks of existing object tracking algorithms, and proposes two object tracking algorithms based on Siamese networks. The main work is as follows: A Siamese Anchor-Free Object Tracker with Multiscale Spatial Attentions is proposed for object tracking. The network uses a modified ResNet-50 as the backbone to extract multi-scale features. Richer feature information is obtained by modifying the stride of the convolutions in the backbone. A spatial attention block is designed to extract spatial attention features that better capture the contextual information of the target. We also adopt an anchor-free classification and 
regression module to transform the target tracking problem into a classification and regression problem, thus avoiding the impact of anchor-box hyperparameters. We obtain strong performance on four large publicly available datasets: OTB100, UAV123, VOT2016 and GOT10k. An Anchor-free Siamese Tracker with Attention and Corner Mechanisms (STACM) is proposed. We combine the anchor-free framework with the Siamese network for object tracking and obtain the prediction directly by computing the positions of the top-left and bottom-right corners, which can minimize the influence of hyperparameters and human factors. In the feature extraction phase, a mask is extracted from the first frame of the target image while features are extracted with the ResNet-50 network, which better highlights the relationship between the target and the background in the initial frame. We propose a spatial and channel attention module and fuse it into our tracker, which increases the model's awareness of contextual information. STACM achieves strong performance on four large publicly available datasets: OTB100, UAV123, VOT2016 and GOT10k. |
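
The corner-based prediction described for STACM can be illustrated with a minimal sketch. It assumes the tracker produces two score maps, one for the top-left corner and one for the bottom-right corner, and that the box is read off at the peak of each map. The function names, the map layout, and the `stride` value are hypothetical choices for illustration and are not taken from the paper.

```python
from typing import List, Tuple

Heatmap = List[List[float]]


def argmax_2d(heatmap: Heatmap) -> Tuple[int, int]:
    """Return (row, col) of the largest value in a 2D score map."""
    best_row, best_col = 0, 0
    best_score = heatmap[0][0]
    for r, row in enumerate(heatmap):
        for c, score in enumerate(row):
            if score > best_score:
                best_row, best_col, best_score = r, c, score
    return best_row, best_col


def decode_box(top_left: Heatmap, bottom_right: Heatmap,
               stride: int = 8) -> Tuple[int, int, int, int]:
    """Decode (x1, y1, x2, y2) in image pixels from two corner score maps.

    `stride` is an assumed feature-map-to-image scale, not a value
    reported by the authors.
    """
    y1, x1 = argmax_2d(top_left)
    y2, x2 = argmax_2d(bottom_right)
    return x1 * stride, y1 * stride, x2 * stride, y2 * stride


# Toy 4x4 score maps: top-left peak at row 1, col 0;
# bottom-right peak at row 3, col 3.
tl_map = [
    [0.1, 0.1, 0.0, 0.0],
    [0.8, 0.2, 0.0, 0.0],
    [0.1, 0.1, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
br_map = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.1, 0.2],
    [0.0, 0.0, 0.3, 0.9],
]
box = decode_box(tl_map, br_map, stride=8)
print(box)  # (0, 8, 24, 24)
```

Because the box comes directly from two peak locations, no anchor sizes or aspect ratios need to be chosen, which is the hyperparameter reduction the abstract refers to. A real tracker would likely refine the peaks to sub-pixel precision rather than take a hard argmax.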