Research On Point-voxel-based LiDAR 3D Object Detection

Posted on:2023-05-19

Degree:Master

Type:Thesis

Country:China

Candidate:N Song

GTID:2568306767463454

Subject:Photogrammetry and Remote Sensing

Abstract/Summary:

With the rapid development of 3D perception devices such as structured light sensors, LiDAR and Kinect, research in related fields has grown steadily, and practical applications such as virtual reality, augmented reality, robotics and autonomous driving have received increasing attention. As a crucial task in 3D scene perception and understanding, 3D object detection plays an important role in these applications and is the cornerstone of many downstream tasks, such as motion and tracking. Compared with indoor detection, LiDAR 3D object detection must cope with complex street scenes and emergencies, and consequently the related models have to meet stricter efficiency and accuracy requirements. At present, LiDAR 3D object detection mainly relies on point cloud data scanned by LiDAR, which provides a surround view and is sparse. Existing LiDAR 3D object detection methods can generally be divided into two categories according to how they perceive the scene: methods based on voxel perception and methods based on point cloud perception. Voxel representations help locate objects quickly and efficiently, while point cloud representations describe the spatial relationships within objects and thus help refine the detected objects.

This thesis combines the characteristics and advantages of these two representations and proposes a novel two-stage LiDAR 3D object detection network, called the Joint Point-Voxel Network (JPV-Net). Specifically, the network framework includes the dual encoder-fusion decoder proposed in this thesis, which consists of two encoders with different functions and designs, and a feature fusion decoder. The two encoders extract coarse voxel features of the 3D scene and point features rich in geometric context, respectively, while the decoder adopts an attention mechanism to fuse the two kinds of features from coarse to fine and gradually propagates the features back to the original resolution. In addition, to further exploit the characteristics of the voxel CNN (Convolutional Neural Network) and the point cloud perception network, this thesis designs two IoU (Intersection over Union) estimation modules, one for the proposal stage and one for the refinement stage, both of which effectively alleviate the disparity between object localization and classification confidence.

Furthermore, although driving vehicles gather a large number of 3D scene samples, labeling 3D scenes costs much more than labeling images and other types of data. Therefore, for LiDAR 3D object detection tasks, a large number of unlabeled samples rich in valuable information are available. To make full use of these samples, this thesis introduces a semi-supervised learning method to improve existing networks. Multiple benchmark datasets, including the KITTI dataset and the ONCE dataset, are used to evaluate JPV-Net. The experimental results validate the effectiveness of the network and support the use of semi-supervised learning.
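
To illustrate the Intersection over Union quantity that the two IoU estimation modules are described as estimating, the following is a minimal Python sketch, not the thesis implementation. It assumes axis-aligned boxes given as (x_min, y_min, z_min, x_max, y_max, z_max); LiDAR detectors typically use rotated boxes, whose overlap needs an additional polygon-intersection step that is omitted here. The function name axis_aligned_iou_3d is hypothetical.

```python
def axis_aligned_iou_3d(box_a, box_b):
    """IoU of two axis-aligned 3D boxes.

    Assumption for illustration: each box is a 6-tuple
    (x_min, y_min, z_min, x_max, y_max, z_max). Rotated boxes are not handled.
    """
    # Overlap volume: product of the per-axis overlap lengths.
    intersection = 1.0
    for axis in range(3):
        low = max(box_a[axis], box_b[axis])
        high = min(box_a[axis + 3], box_b[axis + 3])
        if high <= low:
            # No overlap along this axis means no overlap at all.
            return 0.0
        intersection *= high - low

    def volume(box):
        return (box[3] - box[0]) * (box[4] - box[1]) * (box[5] - box[2])

    union = volume(box_a) + volume(box_b) - intersection
    return intersection / union


if __name__ == "__main__":
    predicted = (0.0, 0.0, 0.0, 2.0, 2.0, 2.0)
    ground_truth = (1.0, 0.0, 0.0, 3.0, 2.0, 2.0)
    print(axis_aligned_iou_3d(predicted, ground_truth))
```

A value computed this way between a predicted box and its ground-truth box is one plausible training target for an IoU estimation branch, so that the predicted confidence reflects localization quality rather than classification score alone.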

Keywords/Search Tags:

autonomous driving, 3D object detection, semi-supervised learning, multi-perception fusion

Related items

1	Research On 3D Environment Perception Algorithm Based On Multi-modal Sensor Fusion
2	Semantic Image Segmentation And Object Detection In Autonomous-Driving System
3	Research On Semi-Supervised Salient Object Detection Based On Deep Learning
4	Research On Deep Neural Network For Object Detection From Multi-modal Images
5	Research On Multi-agent Collaborative Perception Semi-supervised Online Evolutive Learning
6	Research On Open-Set Object Detection Method Based On Multi-Modal Learning
7	Research On Self-supervised 3D Object Detection Based On Stereo Images
8	Research On The Application Of Geometric Information In The Semi-supervised Learning
9	Design And Implementation Of Semi-supervised Object Detection System Based On Federated Learning
10	Research On Multi-task Learning And Semi-supervised Learning Methods In Visual Navigation