
Research On Multi-object Tracking Algorithm Based On Deep Learning

Posted on: 2023-02-01
Degree: Master
Type: Thesis
Country: China
Candidate: X Chen
GTID: 2558307061461964
Subject: Electronic and communication engineering
Abstract/Summary:
Computer vision is a popular research field in artificial intelligence and has important applications in autonomous driving, motion recognition, and human-computer interaction. Multi-object tracking is an important task in computer vision. Its main idea is to detect targets and extract their re-ID features through object detection and re-identification tasks, and then to track the detected targets online in each video frame so that every target is linked to an existing trajectory. Although research on multi-object tracking has made progress, some common problems remain. For example, the tracking process over-relies on semantic features and ignores the visual features of the target, which results in frequent identity switches. To address these problems, this thesis focuses on multi-object tracking algorithms based on deep learning. The main work is as follows:

1. A multi-object tracking algorithm based on a deep aggregated high-resolution network is proposed. The backbone network in the existing FairMOT algorithm performs cross-resolution multi-scale aggregation to obtain detection information and re-ID features, but it does not further extract deep semantic information. Building on FairMOT, this thesis proposes a multi-object tracking algorithm equipped with a deep aggregated high-resolution network. The backbone of the tracker extracts an abstract semantic feature map through the DLA network and then feeds it into an improved lightweight HRNet structure for further cross-resolution multi-scale aggregation, which extracts deeper semantic information and improves the tracker's performance. The tracker then computes the cosine distances between the re-ID feature vectors of all objects and the IoU of the detection boxes, and links the objects detected in the current frame to the existing trajectories. Finally, the tracker uses Kalman filtering to further estimate the positions of all objects in the current frame. Experiments show that the recognition rate of this tracker is better than that of most trackers on the benchmark dataset.

2. A multi-object tracking algorithm based on a deep path aggregation network is proposed, with the aim of improving tracking speed when many targets are present. Most existing network structures focus only on semantic features and ignore the spatial feature information of the target. The backbone of the algorithm uses a bottom-up feature-map path enhancement structure to extract accurate spatial feature information, which strengthens the feature extraction capability of the network and makes the predicted target locations more accurate. In the final stage of the backbone, multiple feature maps are concatenated in parallel to maintain a rich high-resolution representation. Experimental results show that the tracker outperforms most multi-object trackers on several public datasets and offers fast real-time tracking.

3. A multi-object tracking algorithm based on the convolutional block attention module (CBAM) is proposed. Existing multi-object tracking algorithms are limited by the strong dependence of the backbone network on semantic features and pay little attention to the visual features of objects in images, which leads to many identity switches during tracking, especially in dense scenes. To this end, this thesis proposes a novel network structure that introduces the CBAM module into the DLA network. The network attends to the visual features of the target through the CBAM module and performs multi-scale aggregation across resolutions on the visual features of each target in the image. The proposed network first focuses on extracting visual features from the input images and then extracts semantic information through multi-scale aggregation, which yields more discriminative re-ID features. Experiments show that trackers equipped with this network structure can effectively reduce the number of identity switches.
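
The association step described in the first contribution (cosine distance between re-ID feature vectors combined with the IoU of detection boxes) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the thesis: the function names, the `(x1, y1, x2, y2)` box format, the equal weighting of the two cues, the `max_cost` threshold, and the greedy matching used here for brevity in place of whatever assignment method the thesis actually uses.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm > 0 else 1.0


def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def associate(tracks, detections, weight=0.5, max_cost=0.7):
    """Greedily link detections to tracks using a blended cost.

    tracks, detections: lists of dicts with keys "feat" and "box".
    weight and max_cost are illustrative values, not tuned settings.
    Returns a list of (track_index, detection_index) pairs.
    """
    candidates = []
    for ti, track in enumerate(tracks):
        for di, det in enumerate(detections):
            appearance = cosine_distance(track["feat"], det["feat"])
            overlap = 1.0 - iou(track["box"], det["box"])
            cost = weight * appearance + (1.0 - weight) * overlap
            if cost <= max_cost:
                candidates.append((cost, ti, di))

    # Take the cheapest pairs first; each track and detection is used once.
    candidates.sort()
    used_tracks, used_dets, matches = set(), set(), []
    for cost, ti, di in candidates:
        if ti in used_tracks or di in used_dets:
            continue
        used_tracks.add(ti)
        used_dets.add(di)
        matches.append((ti, di))
    return matches


tracks = [
    {"feat": [1.0, 0.0], "box": (0, 0, 10, 10)},
    {"feat": [0.0, 1.0], "box": (20, 20, 30, 30)},
]
detections = [
    {"feat": [0.0, 1.0], "box": (21, 20, 31, 30)},
    {"feat": [1.0, 0.0], "box": (1, 0, 11, 10)},
]
matches = associate(tracks, detections)
```

In this toy input each detection is linked to the track with the same feature vector and a heavily overlapping box. Detections left unmatched would typically start new trajectories, and a Kalman filter would supply the predicted track boxes; both are omitted from the sketch.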
Keywords/Search Tags:Deep learning, Multi-object tracking, Deep aggregation high-resolution, Deep path aggregation, Convolutional block attention module