| With the wide application of panoramic vision in visual surveillance, intelligent transportation, and virtual reality, there is a growing demand for detecting vehicles, pedestrians, and other targets in panoramic images. Panoramic video is high-resolution video built from panoramic images; it captures the scene within the full 360° field of view of the acquisition device and can be displayed either flat or in three dimensions. In practice, panoramic video frames are usually projected onto 2D rectangular images, which introduces distortion and object deformation. As a result, conventional detection algorithms cannot fully extract the features of targets in the image, leading to missed and false detections. This thesis studies object detection in panoramic video to address these problems. The main work is as follows: (1) A panoramic video detection model based on deformable convolution and feature fusion is proposed to address the large variation in target deformation and size in equirectangular panoramic video images. Deformable convolution is used in the backbone network to improve feature extraction for deformed targets. Multi-scale feature fusion uses an improved PANet (Path Aggregation Network) + SPP (Spatial Pyramid Pooling) module, which adds skip connections that re-fuse the original features with the fused features, to improve feature extraction for targets of different scales. In addition, a panoramic image dataset is constructed to train and test the model. Experimental results show that the model achieves the highest detection accuracy and a faster detection speed than the other models compared, and that it detects deformed targets in panoramic video quickly and accurately. (2) Panoramic stereo video consists of binocular panoramic video images after equirectangular projection, and the degree of occlusion of the same target differs between the two views in such an image. To address this problem, a panoramic stereo video detection model based on feature fusion and an attention mechanism is proposed. The model first feeds the two input binocular panoramic images separately into two feature extraction networks with shared parameters. It then concatenates and fuses the two resulting feature maps to integrate the feature information from both views, applies a CBAM (Convolutional Block Attention Module) to strengthen attention to the features of the same target in the upper and lower panoramic images, and finally passes the result to the prediction stage. A binocular panoramic image dataset is also constructed and annotated to train and test the model. Experimental results show that the model is better suited to detection in panoramic stereo video than the other models compared, and they also demonstrate the effectiveness of the two-branch feature fusion approach and the CBAM module. (3) The proposed panoramic stereo video detection model is deployed on the cloud forwarding side of a live panoramic stereo video system and tested in different scenarios. The results show that the detection accuracy of the model is above 90% and that the average latency of the system after deployment is 4.3 s, which verifies the practicality of the panoramic stereo video detection model in the system. |
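The object deformation attributed above to the rectangular projection can be illustrated with a small sketch. It assumes the standard equirectangular mapping, in which image rows are evenly spaced in latitude and every row spans the full 360° of longitude, so content at latitude φ is stretched horizontally by a factor of 1/cos φ relative to the equator. The function names are hypothetical and chosen only for illustration; they do not come from the thesis.

```python
import math


def row_to_latitude(row: int, height: int) -> float:
    """Latitude in degrees at the centre of pixel row `row` (0 = top row).

    Assumes an equirectangular image whose `height` rows cover +90 to -90 degrees evenly.
    """
    return 90.0 - (row + 0.5) * 180.0 / height


def horizontal_stretch(lat_deg: float) -> float:
    """Horizontal stretch factor relative to the equator: 1 / cos(latitude)."""
    return 1.0 / math.cos(math.radians(lat_deg))


def apparent_width_px(width_at_equator_px: float, row: int, height: int) -> float:
    """Approximate width in pixels of an object centred on `row`,
    given the width it would have if it sat on the equator (illustrative only)."""
    return width_at_equator_px * horizontal_stretch(row_to_latitude(row, height))


if __name__ == "__main__":
    height = 1024  # hypothetical frame height
    for row in (512, 256, 64):
        lat = row_to_latitude(row, height)
        print(f"row {row}: latitude {lat:.1f} deg, stretch x{horizontal_stretch(lat):.2f}")
```

At a latitude of 60° the factor is 2, so the same object appears about twice as wide as it would at the equator. This position-dependent change in shape is the kind of deformation that a fixed, regular convolution sampling grid handles poorly, and that the deformable convolution in the proposed backbone is meant to address.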