RGBT Single Object Tracking Based On Dual View Feature Learning

Posted on: 2023-01-02
Degree: Master
Type: Thesis
Country: China
Candidate: T L Zeng
Full Text: PDF
GTID: 2558306845991299
Subject: Computer technology
Abstract/Summary:
Visual object tracking aims to predict the bounding box of a target throughout a video sequence, given its location in the first frame as a prior. It is widely applied in autonomous driving, human-computer interaction, video surveillance, and other fields. As sensing technology develops, different kinds of sensors are being used more and more widely. A thermal infrared camera captures thermal infrared information about the object, which to a certain extent complements the information available in the visible view. A multi-view tracker can therefore achieve robust tracking under adverse conditions such as occlusion and illumination change. Compared with a single view, a multi-view approach exploits the consistency and complementarity of information across views and eliminates redundant information among them, so as to learn a better feature representation. On this basis, this thesis studies the problems encountered in multi-view object tracking and proposes a dual-view dynamic feature fusion tracking network based on RGB and thermal images. The specific contents are as follows.

First, to further improve the effectiveness of fusing visible and infrared features, a two-stage attention pooling fusion strategy is proposed. Spatial attention and channel attention are used to adaptively recalibrate the convolutional features of the two views along the spatial and channel dimensions. Bilinear pooling is then used to fuse the features at a finer granularity. Applying bilinear pooling to the attention feature maps of the two views through an outer (cross) product gathers the information about the object in both views effectively and enhances the network's ability to discriminate the object.

Second, dual-view object tracking faces many challenges, and the tracker cannot dynamically predict the reliability of each view online. To address this, a view-aware dynamic filter generation network is proposed. The view-aware dynamic filter generation algorithm produces view-related dynamic convolutions. These convolutions are adaptive: they can adapt to different inputs even after training, which enhances the communication between the visible and infrared views and improves the robustness of the tracking network under different challenges.

Finally, comparative experiments on the VOT-RGBT2019 and RGB-T234 datasets verify the effectiveness of the proposed method. Ablation experiments on the same two datasets verify the effectiveness of the two-stage attention pooling fusion strategy and of the view-aware dynamic filter generation network.
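The attention-then-bilinear-pooling fusion described in the abstract can be illustrated with a small sketch. This is not the thesis's network: the abstract gives no layer sizes or learned parameters, so the parameter-free sigmoid gates, the signed square root and L2 normalisation after pooling, and all function names (`channel_attention`, `spatial_attention`, `bilinear_pool`, `fuse`) are assumptions made for illustration. Each feature map is a list of channels, and each channel is a flat list of values over the same spatial locations.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(feat):
    # Assumption: gate each channel by the sigmoid of its spatial mean.
    # A trained network would learn this gate instead.
    return [[v * sigmoid(sum(ch) / len(ch)) for v in ch] for ch in feat]


def spatial_attention(feat):
    # Assumption: gate each location by the sigmoid of its mean over channels.
    n_ch, n_loc = len(feat), len(feat[0])
    gates = [sigmoid(sum(feat[c][i] for c in range(n_ch)) / n_ch)
             for i in range(n_loc)]
    return [[ch[i] * gates[i] for i in range(n_loc)] for ch in feat]


def bilinear_pool(feat_a, feat_b):
    # Sum over locations of the outer product of the two channel vectors,
    # flattened to a vector of length C_a * C_b.
    n_loc = len(feat_a[0])
    pooled = [sum(ca[i] * cb[i] for i in range(n_loc))
              for ca in feat_a for cb in feat_b]
    # Signed square root and L2 normalisation (a common post-processing
    # choice for bilinear features; assumed here, not stated in the abstract).
    pooled = [math.copysign(math.sqrt(abs(v)), v) for v in pooled]
    norm = math.sqrt(sum(v * v for v in pooled))
    return [v / norm for v in pooled] if norm > 0 else pooled


def fuse(rgb_feat, tir_feat):
    # Stage 1: recalibrate each view with channel and spatial attention.
    rgb = spatial_attention(channel_attention(rgb_feat))
    tir = spatial_attention(channel_attention(tir_feat))
    # Stage 2: fuse the two attended views with bilinear pooling.
    return bilinear_pool(rgb, tir)


# Toy inputs: 2 RGB channels and 3 thermal channels over 4 locations.
rgb = [[1.0, 2.0, 0.5, 0.0], [0.0, 1.0, 1.0, 2.0]]
tir = [[0.5, 0.5, 1.0, 1.0], [2.0, 0.0, 1.0, 0.0], [1.0, 1.0, 1.0, 1.0]]
fused = fuse(rgb, tir)
```

The fused descriptor has one entry per pair of RGB and thermal channels, which is what lets bilinear pooling capture pairwise interactions between the two views rather than a simple sum or concatenation.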
Keywords/Search Tags: Object tracking, Deep learning, Attention mechanism, Bilinear pooling, Dynamic filter generation