Three-Dimensional Object Detection Based On Deep Learning

Posted on:2024-08-06

Degree:Master

Type:Thesis

Country:China

Candidate:X X Huang

GTID:2568307133951719

Subject:Traffic Information Engineering & Control

Abstract/Summary:

Three-dimensional object detection is one of the key technologies in machine perception. It is used to interpret the surrounding environment and to recover three-dimensional information about nearby objects, and it is widely used in practical applications such as autonomous driving, smart cities, and intelligent robots. However, the loss of point cloud feature information often degrades detection performance. To address this, a method is proposed that feeds multi-scale features into the detection head of the three-dimensional object detection network.

Firstly, by analyzing the methods and frameworks of three-dimensional object detection at home and abroad, it is established that the input of the detection network in this thesis is grid-structured point cloud data. Subsequent research on point cloud object detection found that existing feature extraction methods often lose point cloud feature information. Therefore, a method is proposed to strengthen global and local feature extraction using attention mechanisms and graph convolutions, and the directions for improvement are analyzed further.

Secondly, for the attention mechanism, the Transformer, which performs well in global feature extraction, is selected for improvement, and the improved network is named DRPT. A Transformer network for point clouds is built that uses self-attention to establish correlations between points. Normalization is then performed using doubly stochastic matrices to enhance the extraction of global features. To verify its effectiveness, experiments were conducted on the ModelNet40 and ShapeNet Part datasets; compared to the baseline network, DRPT improved accuracy by 5.6% and 5.5%, respectively.

The graph convolution, which enhances local feature extraction, is also improved, and the improved network is named 3DGGCN. This network first inserts a grid query module, which further improves the accuracy and stability of local features while preserving local information. Deformable convolution kernels are then introduced, which adapt the kernels according to the number of points and further strengthen point cloud feature extraction. To verify the improved model's ability to process large point cloud scenes, experiments were conducted on the SemanticKITTI and Semantic3D datasets; compared to the baseline network, 3DGGCN improved accuracy by 3.9% and 6.2%, respectively. The improved model outperformed the baseline on both datasets.

Finally, the improvements to the attention mechanism and the graph convolution are combined into a feature enhancement layer, which is integrated with the PointPillars network to perform three-dimensional object detection. The resulting network is named TG-Pillars. It uses DRPT and 3DGGCN to extract global and local point cloud features, respectively, and fuses the two through multi-scale feature fusion to address the lack of geometric features in three-dimensional object detection networks. TG-Pillars was validated on the KITTI dataset, where it improved accuracy by 2.16% for vehicles, 3.84% for pedestrians, and 2.09% for cyclists. In subsequent field applications, the model was applied to real-time laser point cloud object detection in ROS. This suggests that the model has broad application prospects in practical settings such as autonomous driving, smart cities, and intelligent robots.
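
The abstract mentions normalizing the self-attention of DRPT with doubly stochastic matrices but does not say how this is done. As a minimal sketch only, the following assumes a Sinkhorn-Knopp style iteration (alternating row and column normalization) over a toy set of four points; the function names, the use of raw coordinates as queries and keys, and the iteration count are all illustrative assumptions rather than the thesis's implementation.

```python
import math

def attention_scores(points):
    """Scaled dot-product similarity between every pair of points.

    Assumption: raw coordinates stand in for learned query/key features.
    """
    d = len(points[0])
    return [[sum(a * b for a, b in zip(p, q)) / math.sqrt(d) for q in points]
            for p in points]

def sinkhorn_normalize(scores, n_iters=50):
    """Turn a score matrix into an approximately doubly stochastic matrix.

    Assumption: Sinkhorn-Knopp iteration; the source does not name the method.
    """
    # Exponentiate (shifted by the global maximum for numerical stability)
    # so that every entry is strictly positive.
    top = max(max(row) for row in scores)
    mat = [[math.exp(s - top) for s in row] for row in scores]
    n = len(mat)
    for _ in range(n_iters):
        # Row step: make every row sum to 1.
        row_sums = [sum(row) for row in mat]
        mat = [[mat[i][j] / row_sums[i] for j in range(n)] for i in range(n)]
        # Column step: make every column sum to 1.
        col_sums = [sum(mat[i][j] for i in range(n)) for j in range(n)]
        mat = [[mat[i][j] / col_sums[j] for j in range(n)] for i in range(n)]
    return mat

def attend(points, weights):
    """Aggregate point features with the normalized attention weights."""
    d = len(points[0])
    return [[sum(w * p[k] for w, p in zip(row, points)) for k in range(d)]
            for row in weights]

# Toy point cloud: four points with (x, y, z) coordinates.
points = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 1.0]]
weights = sinkhorn_normalize(attention_scores(points))
features = attend(points, weights)
```

Unlike a plain row-wise softmax, this normalization constrains both how much each point attends to others and how much attention each point receives, which is one plausible reading of why it would help global feature extraction.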

Keywords/Search Tags:

3D object detection, attention mechanism, Transformer, graph convolution, feature fusion

Related items

1	Research On Salient Object Detection Based On Feature Fusion And Backbone Network Optimization
2	Research On Saliency Object Detection Algorithm Based On Feature Fusion And Attention Mechanism
3	Research On Multi-scale Object Detection Method Based On Deep Learning
4	Research On Object Detection Method Based On Key Points And Graph Spatio-temporal Attention Mechanism
5	Video Object Detection Based On Attention Mechanism And Multi-Scale Feature Fusion Convolutional Network
6	The Research On Key Technologies Of Object Detection Based On Deep Convolutional Neural Networks
7	Video Object Detection Based On Adaptive Convolution Network And Visual Attention Mechanism
8	Research On Object Detection Algorithm Based On Feature Pyramid Fusion And Attention Mechanism
9	Research On Object Detection Algorithm Based On Feature Fusion And Attention Mechanism
10	Research On Pedestrian Trajectory Prediction Based On Transformer And Graph Convolution Network