
Research On Image 3D Object Detection Algorithm Based On Deep Learning

Posted on: 2024-01-15
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Wu
GTID: 2568307130472524
Subject: Information and Communication Engineering
Abstract/Summary:
With the continuous development of deep learning techniques, image-based 3D object detection has been widely used in computer vision applications such as autonomous driving, mobile robotics and VR. Cameras are among the most important sensors in environmental perception systems: image data are cheaper to acquire and process than data from other sensor devices, and they not only provide rich semantic and geometric information but also implicitly contain depth information about 3D space. The study of deep learning-based image 3D object detection is therefore of great theoretical significance and research value for the low-cost deployment of self-driving vehicles and autonomous mobile robots.

First, a lightweight and efficient 3D object detection algorithm for stereo images is proposed. It addresses the high computational complexity and large number of model parameters involved in deep feature extraction and representation for deep learning-based image 3D object detection, as well as the resulting high demand for hardware resources such as GPUs. A deep aggregation structure, an anchor point prior and channel similarity reweighting are applied to reduce the parameters of the network model while maintaining the richness of the extracted features; a combination of multiple attention mechanisms enables the network to pay more attention to detailed features. Experimental results show that the model balances detection speed and accuracy well and outperforms most other algorithms while using a single GPU and consuming only 8 GB of GPU memory.

Second, a stereo 3D object detection algorithm based on the visual intersection sparse attention mechanism is proposed. It addresses the vulnerability of image-based 3D object detection to optical environment factors, as well as the viewpoint occlusion and visual blurring problems faced by binocular 3D object detection. In addition, a cross-view intersection neuron structure is designed to enhance the information interaction between the left and right views, and a sparse Transformer structure is applied to explore the intrinsic connection between the two views and reduce the network's strong dependence on having both views. Experimental results on the KITTI dataset show that the network model achieves good detection accuracy and speed, is robust to viewpoint interference, and can still detect objects effectively when the right view is obscured.
Keywords/Search Tags:3D object detection, Deep learning, Transformer, Attention mechanism, Data augmentation