Autonomous driving is an important application area of artificial intelligence and is significant for the development of intelligent transportation and smart cities. Detection and tracking of vehicles and pedestrians are key technologies that help a vehicle perceive its surroundings in real time and make accurate driving decisions. With the development of artificial intelligence, deep learning for object detection and tracking has become a hot research topic. However, current deep-learning research focuses mainly on visible-light images, and there is relatively little research on infrared images. Because infrared imaging is less affected by lighting conditions, it not only works during the day like visible-light imaging but also performs well at night and in foggy weather. Compared with visible-light images, infrared images contain fewer features, which places higher demands on the model’s feature mining. Infrared images also have weaker contrast, which requires the model to be better at distinguishing objects. Infrared image detection and tracking are therefore more challenging.

With the continuous development of deep learning, the accuracy of object detection and tracking has steadily improved, but higher accuracy often comes with a demand for more computing power. Some models pursue accuracy at the expense of real-time performance, which makes them difficult to apply in industry. This thesis therefore studies object detection and tracking in infrared vehicle scenes. The main work of this thesis consists of the following two parts.

(1) This thesis proposes several improvements based on the YOLOv5 algorithm. First, the Ghost module is introduced to improve the backbone network, and GhostCSP is designed to replace the CSP structure in YOLOv5s. GhostNeck is also designed to replace the standard 3×3
convolution, making the model more lightweight. Second, the channel attention mechanism (SENet) and the coordinate attention mechanism (CA) are each fused after the Ghost module in the GhostNeck network; the CA module, which yields the larger gain in detection accuracy, is selected and integrated into the detection model to improve detection performance. Third, the channel shuffle structure from ShuffleNet V2 is introduced to replace the last 1×1 convolution in the Neck network, further improving the model’s running speed. Finally, a scenario-specific data augmentation method is adopted to enhance the model’s detection capability. Experimental results show that the improved model achieves a mean average precision (mAP) of 78.7% on the dataset used in this thesis, with a detection speed of 39.8 FPS. Compared with the YOLOv5s model, the detection speed of the improved model is 69.4% higher while accuracy is largely maintained. The improved detection model therefore achieves a good balance between accuracy and speed.

(2) This thesis applies a detection-based multi-object tracking algorithm to pedestrian and vehicle tracking in infrared vehicle scenes and makes the following improvements to the DeepSORT tracking algorithm. First, the improved detection model is integrated into the multi-object tracking model to increase its running speed. Then, a pedestrian and vehicle tracking dataset for infrared vehicle scenes is established, and the deep appearance model is retrained on it to extract object features better, thereby improving tracking performance. Experimental results show that both the MOTA and MOTP of the improved algorithm increase, and the number of identity switches (IDs) also decreases to some extent. The improved tracking model can therefore effectively track objects in infrared vehicle scenes.
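
The channel shuffle operation mentioned in part (1) can be illustrated with a minimal sketch. This is not the thesis's implementation: it works on a plain list of channel labels rather than feature-map tensors, and the function name `channel_shuffle` and its parameters are assumptions made for illustration. It shows only the reordering itself, which interleaves channels from different groups so that information can mix across groups without an extra 1×1 convolution.

```python
def channel_shuffle(channels, groups):
    """Interleave channels across groups (illustrative channel shuffle).

    `channels` is a list of channel labels; `groups` is the number of groups.
    Both names are hypothetical and chosen for this sketch only.
    """
    n = len(channels)
    if groups <= 0 or n % groups != 0:
        raise ValueError("number of channels must be divisible by groups")
    per_group = n // groups
    # View the list as a (groups, per_group) grid and read it column by column.
    # This matches the usual reshape -> transpose -> flatten formulation.
    return [channels[g * per_group + i]
            for i in range(per_group)
            for g in range(groups)]


if __name__ == "__main__":
    # Two groups of three channels: [0, 1, 2] and [3, 4, 5].
    print(channel_shuffle(list(range(6)), groups=2))  # [0, 3, 1, 4, 2, 5]
```

The reordering is a pure permutation with no learnable parameters, which is consistent with the abstract's use of channel shuffle to improve running speed.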