Research On Multi-modal Multi-target Tracking Method Based On UAV Platform

Posted on: 2024-05-21
Degree: Master
Type: Thesis
Country: China
Candidate: X W Zhao
GTID: 2542307106965349
Subject: Computer Science and Technology
Abstract/Summary:
Visual multi-target tracking on UAV platforms has received a lot of attention because of its efficiency and flexibility. However, applying multi-target tracking to UAV missions still faces several challenges. First, UAV-based multi-target tracking datasets are few in number and small in scale, which limits the development of tracking methods for this platform. Second, targets captured from a UAV are generally small and offer little usable feature information, which limits algorithm performance. Third, tracking algorithms that rely on the visible-light modality alone cannot image targets correctly when light is insufficient or the weather is bad, which makes tracking extremely difficult. To address these problems, this thesis constructs a large-scale multimodal multi-target tracking dataset based on the UAV platform, and proposes a plug-and-play multi-scale feature enhancement module and a network framework for multimodal multi-target tracking. The main research contents of this thesis are as follows:

(1) A large-scale multi-modal multi-target tracking dataset (M3UAV) based on the UAV platform is constructed. Tracking algorithms based on the visible-light modality alone suffer from blurred target imaging in low visibility and some adverse weather conditions, whereas thermal infrared imaging is robust to these factors and compensates for the shortcomings of visible-light imaging. Therefore, given the scarcity of datasets and the limitations of single-modal data, we capture a large amount of multi-modal data by UAV, covering the visible-light modality (RGB) and the thermal infrared modality (Thermal), and annotate it. To stay close to real application scenarios and ensure that the data generalizes, a total of 20 multi-modal video sequences are collected, with more than 25070 frames in total and an average sequence length of 1253 frames.

(2) A plug-and-play multi-scale feature enhancement module (MSFEM) is proposed. Video captured from a UAV platform suffers from small target imaging and a large scale span between different targets. The module first applies convolutions at 3×3, 5×5 and 7×7 scales to the original features. Then an adaptive feature aggregation module is introduced. Finally, skip connections are added so that effective information in the original features is not lost as convolution operations are stacked. The module is embedded into different multi-target tracking algorithms to address the limited feature information caused by the small size of targets in UAV imagery. In the overall network framework, the thermal infrared image is added to the feature extraction of a single-modal, visible-light-based algorithm: features are extracted from both modalities at the same time and then adaptively fused, which overcomes the inability of single-modal tracking algorithms to fully identify targets.

(3) It is verified that the proposed multi-scale feature enhancement module and the introduction of multi-modal data effectively improve the accuracy of multi-object tracking algorithms. Eight standard evaluation indicators (MOTA, MOTP, Rcll, Prcn, Frag, TP, FP, FN) are used to measure the tracking results comprehensively and thereby verify the effectiveness of the dataset. The proposed module is embedded into multiple algorithms for comparative experiments. The results show that the multi-scale feature enhancement module improves the MOTA of the original best algorithm from 62.148% to 64.173%, a gain of 2.025 percentage points. To verify the effectiveness of the module structure design and of multimodal data input, we further carried out ablation experiments. With multimodal input, the MOTA of the best algorithm increased from 59.008% to 61.618% compared with single-modal input, a gain of 2.610 percentage points. These experimental results demonstrate the effectiveness of the dataset constructed in this thesis and of the proposed method.
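Of the eight indicators listed above, MOTA, Rcll and Prcn are ratios of raw counts. The following is a minimal Python sketch of how they are usually computed, assuming the standard CLEAR MOT definitions (the abstract does not restate them). It also takes an identity-switch count, which the standard MOTA formula uses but the abstract does not list among its indicators. The function name and the example counts are hypothetical and are not taken from the thesis.

```python
def clear_mot_summary(num_gt, tp, fp, fn, id_switches):
    """Return MOTA, Rcll and Prcn as fractions in the CLEAR MOT sense.

    All counts are totals summed over every frame of a sequence.
    num_gt is the total number of ground-truth boxes.
    """
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    # MOTA penalises misses, false positives and identity switches together.
    mota = 1.0 - (fn + fp + id_switches) / num_gt
    # Recall: share of ground-truth boxes that were matched.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Precision: share of predicted boxes that were correct.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return {"MOTA": mota, "Rcll": recall, "Prcn": precision}


# Hypothetical counts for illustration only (not results from the thesis).
summary = clear_mot_summary(num_gt=1000, tp=800, fp=100, fn=200, id_switches=20)
for name, value in summary.items():
    print(f"{name}: {value * 100:.3f}%")
```

Because MOTA subtracts all three error types from 1, it can be negative when a tracker produces more errors than there are ground-truth boxes. That is why the abstract reports it alongside recall and precision.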
Keywords/Search Tags: Multi-object tracking, UAV platform, Multi-scale feature fusion, Benchmark dataset
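The multi-scale feature enhancement described in the abstract (parallel 3×3, 5×5 and 7×7 convolutions, adaptive aggregation, then a skip connection) can be illustrated with a small pure-Python sketch. This is not the thesis implementation, and it rests on several assumptions: a single-channel feature map, mean filters standing in for learned convolution kernels, and softmax-normalised branch scores standing in for the adaptive aggregation module, whose actual design the abstract does not specify. All function and parameter names are hypothetical.

```python
import math


def box_conv_same(feature, k):
    """k x k mean filter with zero padding; output has the input's size.

    Assumption: stands in for a learned k x k convolution.
    """
    h, w = len(feature), len(feature[0])
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            total = 0.0
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    y, x = i + di, j + dj
                    if 0 <= y < h and 0 <= x < w:
                        total += feature[y][x]
            out[i][j] = total / (k * k)
    return out


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def msfem_sketch(feature, branch_scores=(0.0, 0.0, 0.0)):
    """Multi-scale branches, weighted aggregation, then a skip connection."""
    # One branch per kernel size named in the abstract.
    branches = [box_conv_same(feature, k) for k in (3, 5, 7)]
    # Assumption: aggregation is a softmax-weighted sum of the branches.
    weights = softmax(list(branch_scores))
    h, w = len(feature), len(feature[0])
    return [
        [
            # Skip connection: add the original feature back.
            feature[i][j] + sum(wt * b[i][j] for wt, b in zip(weights, branches))
            for j in range(w)
        ]
        for i in range(h)
    ]


# Toy 5 x 5 feature map with one bright pixel, mimicking a small target.
feature = [[0.0] * 5 for _ in range(5)]
feature[2][2] = 1.0
enhanced = msfem_sketch(feature)
print(round(enhanced[2][2], 6), round(enhanced[0][0], 6))
```

In this toy example the skip connection keeps the original response at the target pixel intact, while the larger kernels spread some of its signal to distant positions that the 3×3 branch alone would not reach.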