Research On 3D Object Detection Technology Based On Depth Estimation

Posted on:2022-06-28

Degree:Master

Type:Thesis

Country:China

Candidate:H X Wang

Full Text:PDF

GTID:2518306785476244

Subject:Automation Technology

Abstract/Summary:

3D object detection is an indispensable research topic in computer vision, with broad application prospects in fields such as autonomous driving, robotics, intelligent security, and augmented reality. Current methods are based on lidar, depth cameras, binocular vision, multi-view vision, or monocular vision. Lidar, depth-camera, binocular, and multi-view methods can obtain depth information about the target object for 3D object detection, but they have shortcomings such as expensive equipment, poor applicability in harsh environments, and a heavy computational load. Monocular methods cost less in equipment and place fewer demands on the environment, and detecting 3D objects by estimating depth from a single image is closer to practical conditions.

This thesis focuses on problems in 3D object detection based on depth estimation. A survey of the existing literature identifies two issues: monocular depth estimation methods tend to lose the feature and location information of elements with large feature values, and there is a gap between the 2D image representation and the 3D spatial representation in 3D object detection. The research addresses these two issues:

(1) The deep ordinal regression algorithm for monocular depth estimation uses a full-image encoder to obtain the context information of global features. However, the full-image encoder averages the features with local average pooling and then simply copies the result to all pixels, which loses the feature and location information of elements with large feature values. To address this problem, this thesis proposes a monocular depth estimation algorithm based on CBAM (Convolutional Block Attention Module). CBAM is embedded into the deep ordinal regression network as the full-image encoder. The channel attention module and the spatial attention module are applied in turn to capture the context information of global features, which improves the accuracy of monocular depth estimation.

(2) The depth map obtained by monocular depth estimation is used to perform 3D object detection. To compensate for the gap between the 2D image representation and the 3D spatial representation, a 3D object detection algorithm based on depth-guided local convolution is proposed. The algorithm fuses SKNet (Selective Kernel Network) with the depth-map features after the shift operation, generates convolution kernels, and applies them locally to the 2D image to obtain target features at multiple scales. This guides the learning of feature representations from the 2D image to 3D space and narrows the gap between the two. Finally, the monocular depth estimation network and the 3D object detection network are optimized through joint training to improve the accuracy of 3D object detection.
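
As a rough illustration of point (1), the sketch below applies CBAM-style channel attention followed by spatial attention in plain Python. It is not the thesis implementation: the function names, the toy 2-channel 3×3 feature map, the identity MLP weights, and the 3×3 spatial kernel are all assumptions chosen for readability (the original CBAM design uses a channel-reduction MLP and a 7×7 spatial convolution). The toy input is built so that both channels have the same average, which shows why average pooling alone cannot single out the channel holding one large activation.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def shared_mlp(vec, w1, w2):
    """Two-layer perceptron with ReLU, shared by the avg and max descriptors."""
    hidden = [max(0.0, sum(w * v for w, v in zip(row, vec))) for row in w1]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]


def channel_attention(feature, w1, w2):
    """feature: C channels, each an H x W nested list. Returns C weights in (0, 1)."""
    flat = [[v for row in ch for v in row] for ch in feature]
    avg_desc = [sum(ch) / len(ch) for ch in flat]  # global average pooling
    max_desc = [max(ch) for ch in flat]            # global max pooling
    avg_out = shared_mlp(avg_desc, w1, w2)
    max_out = shared_mlp(max_desc, w1, w2)
    return [sigmoid(a + m) for a, m in zip(avg_out, max_out)]


def spatial_attention(feature, kernel):
    """kernel: [2][k][k] weights over the channel-wise (avg, max) maps.
    Returns an H x W map of weights in (0, 1), using zero padding."""
    C, H, W = len(feature), len(feature[0]), len(feature[0][0])
    avg_map = [[sum(feature[c][i][j] for c in range(C)) / C for j in range(W)]
               for i in range(H)]
    max_map = [[max(feature[c][i][j] for c in range(C)) for j in range(W)]
               for i in range(H)]
    k = len(kernel[0])
    pad = k // 2
    out = [[0.0] * W for _ in range(H)]
    for i in range(H):
        for j in range(W):
            acc = 0.0
            for p, pooled in enumerate((avg_map, max_map)):
                for di in range(k):
                    for dj in range(k):
                        y, x = i + di - pad, j + dj - pad
                        if 0 <= y < H and 0 <= x < W:
                            acc += kernel[p][di][dj] * pooled[y][x]
            out[i][j] = sigmoid(acc)
    return out


def cbam(feature, w1, w2, kernel):
    """Channel attention first, then spatial attention on the refined features."""
    C, H, W = len(feature), len(feature[0]), len(feature[0][0])
    ch_w = channel_attention(feature, w1, w2)
    refined = [[[feature[c][i][j] * ch_w[c] for j in range(W)]
                for i in range(H)] for c in range(C)]
    sp_w = spatial_attention(refined, kernel)
    return [[[refined[c][i][j] * sp_w[i][j] for j in range(W)]
             for i in range(H)] for c in range(C)]


# Toy input (assumed): channel 0 has one large peak, channel 1 is flat.
# Both channels average to 1.0, so only max pooling tells them apart.
demo_feature = [
    [[0.0, 0.0, 0.0], [0.0, 9.0, 0.0], [0.0, 0.0, 0.0]],
    [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]],
]
identity = [[1.0, 0.0], [0.0, 1.0]]  # stand-in for learned MLP weights
demo_kernel = [[[0.1] * 3 for _ in range(3)] for _ in range(2)]

print(channel_attention(demo_feature, identity, identity))
print(cbam(demo_feature, identity, identity, demo_kernel)[0][1])
```

In a trained network the MLP and kernel weights would be learned rather than fixed, and the attended features would feed the ordinal regression head; the sketch only shows the order of the two attention steps and the role of max pooling.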

Keywords/Search Tags:

Monocular 3D Object Detection, Depth Estimation, CBAM, Depth-guided Local Convolution, 2D-3D

Related items

1	Monocular Depth Estimation Based On Convolutional Neural Network
2	Monocular Depth Prediction: Algorithms And Applications
3	Monocular Depth Estimation And Depth Completion Based On Convolutional Neural Network
4	Depth Estimation From Monocular Image Based On Deep Convolutional Neural Networks
5	Research On Monocular Image Depth Estimation
6	Depth Recovery Of Monocular Video Based On Neural Convolution Networks
7	Research On Depth Estimation Algorithms For Monocular Image
8	Research On Object Classification And Detection Based On Depth Estimation
9	Depth Estimation Of Monocular Image Based On Deep Learning
10	Depth Map Repair And Monocular Depth Estimation Based On Convolutional Neural Network