
Research On Point Cloud 3D Object Detection Method Based On Deep Learning

Posted on: 2024-06-18
Degree: Master
Type: Thesis
Country: China
Candidate: J W Qin
Full Text: PDF
GTID: 2568307142451814
Subject: Software engineering
Abstract/Summary:
Compared with 2D images, 3D point clouds capture richer and more comprehensive environmental information, which makes 3D object detection in point clouds an indispensable part of fields such as autonomous driving, robot navigation, and virtual reality. However, outlier points have a great impact on the accuracy of classification results in point cloud object detection. At the same time, when multiple candidate boxes are generated, different candidate boxes are difficult to distinguish from each other because they contain the same sampling points. In addition, although point cloud data contains depth information, it lacks color and texture information, which also reduces the accuracy of object detection to a large extent. To address these problems, this thesis proposes a 3D object detection method based on outlier point weakening and voxel encoding, and a point cloud-image multi-modal adaptive fusion detection algorithm. The main work is as follows:

A point cloud 3D object detection method based on outlier weakening and voxel encoding is proposed. The outlier weakening strategy addresses the negative impact of outlier and noise points in the point cloud data on the classification refinement results. A multi-way search algorithm finds the neighborhood points of each point in the point set, and the neighborhoods are reconstructed with normalization. A self-attention mechanism then computes the correlation between each sampled point and its neighborhood points, which enhances the key information in the neighborhood and weakens the negative impact of outlier points. A voxel encoding strategy based on adaptive pooling addresses the ambiguity that arises when different candidate boxes generated during detection contain the same sampling points. Each candidate box is divided into voxels, each voxel is further divided into multiple pillars, and weighted aggregation is performed according to the importance of each pillar to achieve discriminative spatial voxel encoding. This algorithm achieves an average accuracy of 82.98% on the Car category of the KITTI dataset and 93.2% on the ModelNet40 dataset.

A point cloud-image multi-modal adaptive fusion detection algorithm is proposed. 2D and 3D semantic segmentation algorithms extract the semantic information of the image and the point cloud respectively, an attention-based semantic fusion module adaptively fuses the two kinds of semantics, and the fused data are fed to the 3D detector for object detection. To avoid the inefficiency of applying classical convolution kernels and the parameter sharing strategy to sparse point clouds, local and global attention units are constructed. Inside each voxel, a local region is built with a spherical query and local information is pooled by weighted aggregation, while the global attention unit captures long-range global information. In addition, to address the data alignment bias that arises when data augmentation is applied to joint point cloud-image detection, this thesis proposes a data augmentation reversal strategy: the parameters of geometry-related augmentations are retained and applied in reverse to the augmented data in the data fusion stage, which improves the alignment accuracy of the augmented multi-modal data. The algorithm achieves an average accuracy of 82.1% and 73.7% on the "Vehicle" category of the Waymo dataset for LEVEL 1 and LEVEL 2 respectively, and 66.7% mAP and 70.1% NDS on the nuScenes dataset.
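
The idea of retaining geometric augmentation parameters and applying them in reverse at fusion time can be illustrated with a minimal sketch. It is not the thesis implementation: the function names `augment` and `reverse_augment`, the parameter dictionary, and the particular augmentations (flip across the x-axis, rotation about the z-axis, global scaling) are assumptions chosen for illustration, since the abstract does not list which augmentations are used.

```python
import math
import random


def augment(points, rng):
    """Apply flip, z-rotation, and global scaling; return points and the parameters used.

    The returned parameters are what a reversal step would need to keep.
    (Hypothetical helper; the augmentation set is an assumption.)
    """
    params = {
        "flip_y": rng.random() < 0.5,
        "angle": rng.uniform(-math.pi / 4, math.pi / 4),
        "scale": rng.uniform(0.95, 1.05),
    }
    c, s = math.cos(params["angle"]), math.sin(params["angle"])
    k = params["scale"]
    out = []
    for x, y, z in points:
        if params["flip_y"]:
            y = -y                               # 1. flip across the x-axis
        x, y = c * x - s * y, s * x + c * y      # 2. rotate about z by +angle
        out.append((k * x, k * y, k * z))        # 3. scale globally
    return out, params


def reverse_augment(points, params):
    """Undo the augmentations in reverse order using the stored parameters."""
    c, s = math.cos(params["angle"]), math.sin(params["angle"])
    k = params["scale"]
    out = []
    for x, y, z in points:
        x, y, z = x / k, y / k, z / k            # undo 3. scaling
        x, y = c * x + s * y, -s * x + c * y     # undo 2. rotate by -angle
        if params["flip_y"]:
            y = -y                               # undo 1. flip
        out.append((x, y, z))
    return out
```

In such a setup, the recovered coordinates would be the ones projected onto the image plane with the original calibration, so that each augmented point is still paired with the correct pixel during fusion.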
Keywords/Search Tags: point clouds, 3D object detection, outlier point weakening, voxel encoding, multi-modal fusion