As the number of automobiles continues to increase, road safety has become a topic of great concern. Pedestrians, as vulnerable traffic participants, often suffer the most in traffic accidents. The development of artificial intelligence provides new solutions for vehicle driving and traffic safety. This paper proposes a deep learning method for predicting pedestrian trajectories, thereby reducing the probability that vehicles harm pedestrians.

The YOLO (You Only Look Once) algorithm is currently one of the better-performing single-stage object detection algorithms. However, its bounding-box regression accuracy is low, which makes it difficult to apply in scenarios that demand a high intersection-over-union (IoU) for predicted boxes, such as when pedestrians overlap. Therefore, this paper introduces a Transformer module into the fifth generation of YOLO (YOLOv5) to leverage its attention mechanism for spatiotemporal information association. The result is a YOLOv5-based pedestrian detection algorithm that improves regression accuracy and performs well under dense occlusion. To keep the Transformer module from increasing model complexity and degrading real-time performance, the paper also introduces a CARAFE module to make the network more lightweight. Experiments show that the lightweight network has 27.83% fewer parameters than the previous version, greatly reducing the overall computation of the improved model. With these improvements, the proposed pedestrian detection algorithm increases both mAP@0.5 and mAP@0.5:0.95 by 19%, despite a slight increase in model complexity.

The improved YOLOv5 pedestrian detection model is embedded into the Deep SORT tracking algorithm to strengthen its pedestrian detection and capture capabilities, which leads to better performance in subsequent tracking tasks. To address the problem of re-association after object occlusion in the baseline Deep SORT algorithm, the OSNet network is introduced as a re-identification module. With this module, the proposed algorithm retains object IDs even under occlusion and changes the way Deep SORT handles unmatched trajectories, producing complete and accurate historical trajectories for each pedestrian in the scene.

Finally, this paper proposes a denoising diffusion generative adversarial network (GAN) that uses each pedestrian’s historical trajectory, obtained from the pedestrian tracking algorithm in Chapter 3, to infer the pedestrian’s future trajectory. The denoising diffusion model is highly regarded for its sampling quality and diversity, but it is difficult to apply in practice because of its high training cost and slow sampling speed. The proposed denoising diffusion GAN models each denoising step with a multimodal conditional GAN, which reduces the total number of denoising steps and addresses the slow sampling problem. Moreover, the paper considers the interactions among all elements in the scene and designs a social attention module that uses improved relative positional dot-product attention to capture the relative importance of every agent in the scene to the pedestrian’s trajectory, yielding plausible multimodal trajectory predictions. Extensive evaluations, including ablation and comparative experiments, show that the proposed denoising diffusion GAN outperforms the baseline GAN by more than 30% in average displacement error (ADE) and final displacement error (FDE) for pedestrian trajectory prediction on the ETH/UCY and JAAD datasets. The diversity of the generated trajectories is also significantly improved.
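
For readers unfamiliar with the ADE and FDE metrics reported above, the following is a minimal sketch of how they are commonly computed for a single pedestrian. It is not the evaluation code used in this work. The function names `displacement_errors` and `best_of_k` are hypothetical, trajectories are assumed to be equal-length lists of 2D points, and the best-of-K protocol shown for multimodal predictors is a common convention assumed here for illustration, not one stated in the text.

```python
import math


def displacement_errors(pred, truth):
    """Return (ADE, FDE) for one predicted trajectory.

    pred, truth: equal-length sequences of (x, y) positions.
    ADE is the mean Euclidean error over all time steps;
    FDE is the Euclidean error at the final time step.
    """
    if len(pred) != len(truth) or len(pred) == 0:
        raise ValueError("trajectories must be non-empty and equal in length")
    dists = [
        math.hypot(px - tx, py - ty)
        for (px, py), (tx, ty) in zip(pred, truth)
    ]
    return sum(dists) / len(dists), dists[-1]


def best_of_k(samples, truth):
    """Assumed best-of-K protocol: score K sampled trajectories
    against the ground truth and keep the lowest ADE and lowest FDE."""
    errors = [displacement_errors(sample, truth) for sample in samples]
    min_ade = min(ade for ade, _ in errors)
    min_fde = min(fde for _, fde in errors)
    return min_ade, min_fde


if __name__ == "__main__":
    truth = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    drifting = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
    ade, fde = displacement_errors(drifting, truth)
    print(f"ADE={ade:.2f} FDE={fde:.2f}")
```

Lower values are better for both metrics, so an improvement of more than 30% corresponds to a reduction of more than 30% in these errors relative to the baseline.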