
Research On 3D Object Detection Technology Based On Depth Estimation

Posted on: 2022-06-28
Degree: Master
Type: Thesis
Country: China
Candidate: H X Wang
GTID: 2518306785476244
Subject: Automation Technology
Abstract/Summary:
3D object detection is an essential research topic in computer vision, with broad application prospects in fields such as autonomous driving, robotics, intelligent security, and augmented reality. Current methods are based on lidar, depth cameras, binocular vision, multi-view vision, or monocular vision. Lidar, depth-camera, binocular, and multi-view methods can obtain the depth information of the target object for 3D object detection, but they have shortcomings such as expensive equipment, poor applicability in harsh environments, and a heavy computational load. Monocular methods place lower demands on equipment cost and environmental conditions, and detecting 3D objects by estimating depth from a single image is closer to practical conditions.

This thesis focuses on problems in 3D object detection based on depth estimation. A survey of the existing literature identifies two issues: monocular depth estimation methods tend to lose the feature information and location information of elements with large feature values, and there is a gap between 2D image representation and 3D spatial representation in 3D object detection. The work addresses these two issues in turn.

(1) The depth ordinal regression algorithm for monocular depth estimation uses a full-image encoder to obtain the context information of global features. However, the full-image encoder applies local average pooling, which averages all pixels and then simply copies the result, so the feature information and location information of elements with large feature values are lost. To address this problem, this thesis proposes a monocular depth estimation algorithm based on CBAM (Convolutional Block Attention Module). CBAM is embedded into the depth ordinal regression network as the full-image encoder, and its channel attention module and spatial attention module are applied in turn to capture the context information of global features, thereby improving the accuracy of monocular depth estimation.

(2) The depth map obtained by monocular depth estimation is then used for 3D object detection. To compensate for the gap between the 2D image representation and the 3D spatial representation, a 3D object detection algorithm based on depth-guided local convolution is proposed. The algorithm fuses SKNet (Selective Kernel Network) with the depth-map features after the move operation, generates convolution kernels, and applies them locally to the 2D image to obtain target features at multiple scales. This guides the learning of feature representations from the 2D image to 3D space and narrows the gap between the two. Finally, joint training is used to optimize the monocular depth estimation network and the 3D object detection network together, improving the accuracy of 3D object detection.
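
The attention step described in (1), channel attention followed by spatial attention, can be illustrated with a minimal NumPy sketch of the commonly published CBAM formulation. It is not the thesis's implementation: the weights are random rather than learned, and the reduction ratio `r`, the 7x7 spatial kernel, the tensor sizes, and all function names are assumptions made for illustration. How the module is wired into the depth ordinal regression network is not shown. Note that both attention branches combine average pooling with max pooling, which is what lets large feature values influence the result instead of being averaged away.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, w1, w2):
    """feat: (C, H, W); w1: (C//r, C); w2: (C, C//r). Returns (C, 1, 1) weights."""
    avg = feat.mean(axis=(1, 2))  # global average pooling per channel
    mx = feat.max(axis=(1, 2))    # global max pooling keeps the largest responses

    def mlp(v):
        # Shared two-layer MLP with a ReLU in between (no bias, for brevity).
        return w2 @ np.maximum(w1 @ v, 0.0)

    return sigmoid(mlp(avg) + mlp(mx))[:, None, None]

def spatial_attention(feat, kernel):
    """feat: (C, H, W); kernel: (2, k, k) with odd k. Returns (1, H, W) weights."""
    # Pool across channels, then stack the two maps as a 2-channel input.
    pooled = np.stack([feat.mean(axis=0), feat.max(axis=0)])
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(pooled, ((0, 0), (pad, pad), (pad, pad)))  # zero padding
    h, w = feat.shape[1:]
    out = np.empty((h, w))
    # Plain sliding-window cross-correlation; slow but explicit.
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)
    return sigmoid(out)[None]

def cbam(feat, w1, w2, kernel):
    """Apply channel attention first, then spatial attention."""
    feat = feat * channel_attention(feat, w1, w2)
    return feat * spatial_attention(feat, kernel)

# Toy input: sizes and random weights are placeholders, not trained values.
rng = np.random.default_rng(0)
C, H, W, r = 8, 6, 6, 2
feat = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1
kernel = rng.standard_normal((2, 7, 7)) * 0.1

out = cbam(feat, w1, w2, kernel)
print(out.shape)
```

The output has the same shape as the input feature map, so under these assumptions the module can replace another encoder block without changing the surrounding tensor shapes.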
Keywords/Search Tags:Monocular 3D Object Detection, Depth Estimation, CBAM, Depth-guided Local Convolution, 2D-3D