
Research And Application Of 3D Object Detection Algorithm Based On Pseudo-Lidar

Posted on: 2022-12-28  Degree: Master  Type: Thesis
Country: China  Candidate: B J Wu  Full Text: PDF
GTID: 2518306764980229  Subject: Automation Technology
Abstract/Summary:
In applications such as autonomous driving, robotics, and augmented reality, 2D bounding boxes alone are not sufficient for the task. The position, size, and rotation angle of each object in 3D space are also needed. Current high-precision 3D object detection algorithms mostly rely on LiDAR, stereo cameras, or depth cameras, but LiDAR is extremely expensive, stereo cameras place higher demands on the scene, and depth cameras cannot be used outdoors. Monocular cameras, by contrast, are simple and easy to install, which makes 3D object detection from monocular images practical, though very challenging. Because depth information is missing, there is still much room for improvement in the detection and localization accuracy of 3D objects, especially for occluded, truncated, long-distance, and small targets.

In response to this problem, the thesis proposes a 3D object detection solution based on monocular vision. Importantly, the input consists only of RGB images and camera parameters. In this solution, a monocular depth estimation module estimates the depth of the RGB image and outputs a full-resolution depth map. The depth map is then converted into a point cloud representation through the camera matrix and a conversion formula. To reconstruct the generated pseudo point cloud (Pseudo-LiDAR), the thesis introduces a multi-task shared encoder network and a layered sampler with increasing interval spacing (SSIS). In the multi-task shared encoder network, the two sub-tasks of depth estimation and 2D object detection share one encoder for feature extraction, and the extracted features are fed into the two subsequent sub-networks. In SSIS, the depth map is converted into Pseudo-LiDAR, the points are divided into foreground and background, and the point cloud is sampled at different distance intervals.

Secondly, the thesis studies how to combine image features with a frustum point cloud to predict 3D box proposals. It proposes a point cloud segmentation method based on prior knowledge. The method considers the depth of the center point, the offset of the 2D detection box, and the average size of each target class. From these, a threshold is determined, and points above the threshold are defined as background points. Two strategies for improving the detection algorithm are then proposed. The first is a 3D detection correction network (3DC), which fuses relevant features from the 2D object detection stage into the 3D detection stage. The second is an Attention Fusion Module (AFM), with which the network embeds image features into point cloud features.

Finally, in Pseudo-LiDAR methods the two tasks (depth estimation and 3D object detection) are trained separately. Their different training targets lead to inconsistent losses, which is undesirable when the goal is to improve detection accuracy. The thesis therefore proposes an end-to-end training scheme for predicting accurate 3D bounding boxes. Experiments are carried out on each of these methods to verify their effectiveness.

Building on these improvements, the thesis proposes a 3D object detection algorithm for monocular images and designs and implements its training and testing stages. Comparative experiments against algorithms such as AM3D show that the proposed algorithm effectively improves detection accuracy and reduces computational complexity.
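The abstract does not give the conversion formula used to turn the depth map into a point cloud. A common choice for this step is pinhole back-projection, and the following is a minimal sketch under that assumption, not the thesis's implementation. The function name `depth_to_pseudo_lidar`, the intrinsics `fx`, `fy`, `cx`, `cy`, and the `max_depth` cutoff are all hypothetical names and values introduced for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def depth_to_pseudo_lidar(
    depth: Sequence[Sequence[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
    max_depth: float = 80.0,  # assumed cutoff in metres, not from the thesis
) -> List[Point]:
    """Back-project a dense depth map into camera-frame 3D points.

    Assumes a pinhole camera: a pixel (u, v) with depth z maps to
    x = (u - cx) * z / fx, y = (v - cy) * z / fy.
    """
    points: List[Point] = []
    for v, row in enumerate(depth):      # v: pixel row
        for u, z in enumerate(row):      # u: pixel column
            # Skip invalid (non-positive) or implausibly far depths.
            if z <= 0.0 or z > max_depth:
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


# Tiny 2x2 depth map with one invalid and one out-of-range pixel.
demo_depth = [[2.0, 0.0], [4.0, 100.0]]
demo_points = depth_to_pseudo_lidar(demo_depth, fx=2.0, fy=2.0, cx=0.0, cy=0.0)
```

The resulting points are in the camera coordinate frame. A real pipeline would typically vectorize this loop and may apply an extrinsic transform before passing the points to a LiDAR-style detector.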
Keywords/Search Tags:Deep Neural Networks, 3D object detection, Pseudo LiDAR, Monocular Image