Research On Monocular 3D Object Detection Algorithm Based On Depth Estimation

Posted on: 2024-01-04 | Degree: Master | Type: Thesis
Country: China | Candidate: P J Guo | Full Text: PDF
GTID: 2542307112454304 | Subject: Radio Physics
Abstract/Summary:
In recent years, research on the key technologies of autonomous driving systems has received widespread attention, and rapidly developing artificial intelligence techniques such as deep learning have enabled breakthroughs across the field. The perception system is an important component of an autonomous driving system and is mainly responsible for providing accurate scene information to the vehicle. Within perception, 3D object detection is the most basic and important research direction: it aims to obtain the category, location, 3D dimensions, and pose of key objects around the self-driving vehicle. Compared with LiDAR, stereo cameras, and depth cameras, a monocular camera has a simpler structure, a wider range of applications, and a lower cost, so 3D object detection based on monocular vision is better suited to practical deployment. However, monocular images lack depth information, which limits the accuracy of 3D object detection. To address this missing depth information, this thesis improves on existing monocular depth estimation and monocular 3D object detection algorithms and studies monocular 3D object detection based on depth estimation. The main research contents are as follows:

(1) Unsupervised monocular depth estimation methods based on Structure from Motion (SfM) lose feature information when the encoder downsamples with pooling or strided convolution. To address this, the thesis proposes a monocular depth estimation algorithm based on dimensional transformation. The algorithm improves the encoder-decoder network by downsampling feature maps in the encoder with a packing block built on 3D convolution, and it uses the sub-pixel convolution method to fold the feature map from the spatial dimensions into the feature channel dimension, which reduces the loss of feature information. Symmetrically, the decoder recovers the original resolution by upsampling with an unpacking block, which minimizes resolution loss and improves depth estimation performance. Experimental results on the KITTI dataset show that the proposed method significantly improves the performance of unsupervised monocular depth estimation and approaches that of supervised monocular depth estimation methods.

(2) Monocular 3D object detection algorithms based on depth estimation must choose a data representation for the 3D spatial information obtained from the depth map. To address this, the thesis proposes a monocular 3D object detection algorithm assisted by depth information. The algorithm builds on AM3D, a 3D object detection network for pseudo-LiDAR point clouds. It represents the 3D spatial information converted from the depth map as image blocks and uses a Convolutional Neural Network (CNN) with a channel attention mechanism to extract deep features from those blocks for 3D object detection. Experimental results on the KITTI 3D dataset show that the proposed method is effective for 3D object detection and outperforms the AM3D monocular 3D object detection algorithm based on pseudo-LiDAR point clouds.
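The folding of spatial resolution into channels described in (1) can be illustrated with a plain space-to-depth rearrangement and its inverse. This is only a minimal NumPy sketch of that idea, not the thesis's packing and unpacking blocks: the 3D convolution and any learned layers are omitted, and the function names `space_to_depth` and `depth_to_space` and the factor `r` are assumptions introduced here for illustration.

```python
import numpy as np

def space_to_depth(x, r=2):
    """Fold each r x r spatial block of a (C, H, W) map into channels.

    Returns an array of shape (C*r*r, H/r, W/r). No element is discarded,
    unlike pooling or strided convolution. (Hypothetical helper.)
    """
    c, h, w = x.shape
    assert h % r == 0 and w % r == 0, "H and W must be divisible by r"
    x = x.reshape(c, h // r, r, w // r, r)   # split H and W into blocks
    x = x.transpose(0, 2, 4, 1, 3)           # (C, r, r, H/r, W/r)
    return x.reshape(c * r * r, h // r, w // r)

def depth_to_space(y, r=2):
    """Inverse rearrangement: unfold channels back into r x r spatial blocks."""
    cr2, h, w = y.shape
    c = cr2 // (r * r)
    y = y.reshape(c, r, r, h, w)             # (C, r, r, H/r, W/r)
    y = y.transpose(0, 3, 1, 4, 2)           # (C, H/r, r, W/r, r)
    return y.reshape(c, h * r, w * r)

# Toy feature map: 2 channels, 4 x 6 pixels.
feat = np.arange(2 * 4 * 6, dtype=np.float32).reshape(2, 4, 6)
packed = space_to_depth(feat, r=2)
restored = depth_to_space(packed, r=2)
```

Because the rearrangement is a pure permutation, `restored` equals `feat` exactly; in a real network the learned layers between folding and unfolding decide how much of that information is kept.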
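The conversion from a depth map to 3D spatial information mentioned in (2) is commonly done by back-projecting each pixel through a pinhole camera model. The sketch below assumes that model and keeps the result in image layout, as an (H, W, 3) map of XYZ coordinates that a CNN could consume. The function name `depth_to_xyz_map` and the intrinsics `fx`, `fy`, `cx`, `cy` are assumptions for illustration; the abstract does not specify the exact representation used in the thesis.

```python
import numpy as np

def depth_to_xyz_map(depth, fx, fy, cx, cy):
    """Back-project an (H, W) depth map into an (H, W, 3) map of camera-frame XYZ.

    Assumes a pinhole camera with focal lengths (fx, fy) and principal
    point (cx, cy), all in pixels. (Hypothetical helper.)
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel column and row indices
    x = (u - cx) * depth / fx                       # right of the optical axis
    y = (v - cy) * depth / fy                       # below the optical axis
    return np.stack([x, y, depth], axis=-1)

# Toy example: a flat surface 10 m away, seen by a tiny 4 x 6 image.
depth = np.full((4, 6), 10.0)
xyz = depth_to_xyz_map(depth, fx=2.0, fy=2.0, cx=3.0, cy=2.0)
```

Flattening `xyz` to shape (H*W, 3) gives a pseudo-LiDAR point cloud, while keeping the image layout preserves pixel neighbourhoods for 2D convolution.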
Keywords/Search Tags:Autonomous Driving, Deep Learning, Monocular Vision, Depth Estimation, 3D Object Detection