Real-time Object Detection Based On Cascaded Neural Network

Posted on:2020-12-16

Degree:Master

Type:Thesis

Country:China

Candidate:X Z Ma

Full Text:PDF

GTID:2428330596982428

Subject:Software engineering

Abstract/Summary:

In recent years, with the development of computer vision and deep learning, numerous impressive methods have been proposed for accurate 2D object detection. However, beyond 2D bounding boxes or pixel masks, 3D object detection is in strong demand in applications such as autonomous driving and robotics, because it describes objects in a more realistic way. Because LiDAR provides reliable depth information that can be used to accurately localize objects and characterize their shapes, many approaches take LiDAR point clouds as input and achieve impressive detection results in autonomous driving scenarios. In contrast, other studies aim to replace LiDAR with cheaper cameras, which are readily available in daily life. Because LiDAR is much more expensive than cameras, and inspired by the remarkable progress in image-based depth prediction, this paper focuses on high-performance 3D object detection using only monocular images.

In this paper, we propose a monocular 3D detection framework in the domain of autonomous driving. Unlike previous image-based methods, which focus on RGB features extracted from 2D images, our method solves the problem in the reconstructed 3D space in order to exploit 3D context explicitly. To this end, we first use a standalone module to transform the input data from the 2D image plane to the 3D point cloud space for a better representation, and then perform 3D detection with a PointNet backbone network to obtain the objects' 3D locations, dimensions and orientations. To enhance the discriminative capability of the point clouds, we also propose a multi-modal feature fusion module that embeds the complementary RGB cue into the generated point cloud representation. We argue that it is more effective to infer 3D bounding boxes from the generated 3D scene space (i.e., X, Y, Z space) than from the image plane (i.e., the R, G, B image plane). Evaluation on the challenging KITTI dataset shows that our approach boosts the performance of the state-of-the-art monocular approach by a large margin, i.e., around 15% absolute AP on both the 3D localization and detection tasks for the Car category at a 0.7 IoU threshold.
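
The transformation from the image plane to the point cloud space described above can be illustrated with a minimal sketch. It assumes that a per-pixel depth map has already been predicted by some monocular depth estimator and that the camera follows a standard pinhole model. The function name `depth_to_point_cloud` and the intrinsics parameters `fx`, `fy`, `cx`, `cy` are hypothetical names chosen for illustration, and appending the RGB values to each point is only a simple stand-in for the multi-modal fusion module, whose actual design is not given in the abstract.

```python
import numpy as np


def depth_to_point_cloud(depth, rgb, fx, fy, cx, cy):
    """Back-project a depth map into an (N, 6) array of X, Y, Z, R, G, B.

    depth: (H, W) array of depths along the camera Z axis (assumed input).
    rgb:   (H, W, 3) uint8 image aligned with the depth map.
    fx, fy, cx, cy: pinhole intrinsics (focal lengths and principal point).
    """
    h, w = depth.shape
    # Pixel coordinate grids: u runs along columns, v along rows.
    u, v = np.meshgrid(np.arange(w), np.arange(h))

    # Pinhole back-projection: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy.
    z = depth.astype(np.float64)
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy

    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    # Scale colors to [0, 1] so they are comparable in range to coordinates.
    colors = rgb.reshape(-1, 3).astype(np.float64) / 255.0

    # Drop pixels without a valid (positive) depth.
    valid = points[:, 2] > 0
    return np.concatenate([points[valid], colors[valid]], axis=1)
```

Under these assumptions, the resulting array could be cropped to each 2D detection and passed to a point-based network such as PointNet, which consumes unordered point sets of this form.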

Keywords/Search Tags:

3D object detection, outdoor scene, autonomous driving, data representation

Related items

1	Semantic Image Segmentation And Object Detection In Autonomous-Driving System
2	Deep Learning 3D Object Detection
3	Research On Object Detection Methods In Outdoor Street Scene
4	Research On 3D Object Detection Based On RGB And LIDAR Data
5	The Models Of Lane Detection And Semantic Segmentation Applied To Autonomous Driving
6	3D Object Representation And Detection In Complex Scene
7	The Research And Development Of The Object Detection And Subdivision System Based On TensorFlow Framework
8	Extracting Cognition out of Images for the Purpose of Autonomous Driving
9	Deep Learning-based 3D Object Detection In Point Cloud
10	Research On 3D Object Detection Algorithms Based On Monocular Vision