
Research On 3D Object Detection Based On Label Guidance

Posted on: 2024-07-19 | Degree: Master | Type: Thesis
Country: China | Candidate: X M Liu | Full Text: PDF
GTID: 2568307052495654 | Subject: Electronic information
Abstract/Summary:
3D object detection helps computers accurately understand the real 3D world and is a technical foundation of the autonomous driving and service robot industries. It usually consists of two steps, feature extraction and bounding box prediction, and its main difficulties are occlusion and small targets. Existing 3D object detection algorithms improve accuracy by directly modifying the feature extraction and detection head structures. This does not solve the loss of target information caused by point cloud sampling, and it produces more complex network modules, which increases the burden of model training and inference. In this thesis, we propose to use label guidance to improve the accuracy of 3D object detection: key features are learned from point cloud labels and integrated with spatial geometry and other information to assist the training of existing 3D object detection models. The proposed label-guided auxiliary network adds no computation cost in the inference phase, and the optimized detection model improves detection performance even when it takes the same sampled point cloud as input as the original model. In addition, we extend the idea of label guidance and propose a multimodal label fusion auxiliary network to improve the performance of 3D object detection models.

The innovations of this thesis mainly include the following four aspects:

(1) A label-guided auxiliary method for 3D object detection is proposed. Point cloud label data is used to assist the feature learning of the network. A two-stage, one-way training method is proposed to achieve the auxiliary optimization, and pseudo-label enhancement is used to improve the robustness of network detection.

(2) To address the loss of target information caused by point cloud sampling, the label point cloud (the points inside the label bounding boxes) is used to make up for the missing target information. An attention-assisted network that combines the label point cloud and the original point cloud is designed to guide the existing 3D object detection network to learn point cloud features purposefully and to improve its detection performance on occluded and small targets.

(3) A spatial feature encoding network for the label point cloud is designed to fuse the spatial features of the label point cloud into the point-wise feature embedding. The bounding box parameters annotated in the point cloud labels are also encoded as features, and cross-attention between the label point cloud coordinate features and the label parameter features fuses high-level target semantic information with low-level parameter features, further supplementing the feature embedding of the target point cloud.

(4) A multimodal label guidance method is proposed to guide the network to purposefully fuse target features from different modalities, including the geometric information of the point cloud and the target color information in the image label. The resulting 3D object detection method combines the accurate position information in the original point cloud with the rich object color and dense pixel information in the image label. A label-based multimodal bidirectional fusion method is designed that fully considers the reciprocity of the two modalities, so that each compensates for the shortcomings of the other. This provides more effective feature information for object detection and improves the detection performance of the model.

In this thesis, a variety of ablation and comparison experiments are carried out on different indoor and outdoor detection models and datasets, which verify the rationality and effectiveness of the proposed method.
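
As an illustration of the "label point cloud" in contribution (2), meaning the points that lie inside the label bounding boxes, the following is a minimal sketch of how such points could be selected. It is not the thesis's implementation. The box layout `(cx, cy, cz, length, width, height, yaw)` with rotation about the z axis is an assumed convention, and the names `points_in_box` and `label_point_cloud` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]
# Assumed (hypothetical) box layout: centre, size, and yaw about the z axis.
Box = Tuple[float, float, float, float, float, float, float]


def points_in_box(points: Sequence[Point], box: Box) -> List[Point]:
    """Return the points that fall inside one yaw-rotated 3D bounding box."""
    cx, cy, cz, length, width, height, yaw = box
    cos_y, sin_y = math.cos(yaw), math.sin(yaw)
    inside = []
    for x, y, z in points:
        # Shift to the box centre, then rotate by -yaw into the box frame.
        dx, dy, dz = x - cx, y - cy, z - cz
        local_x = cos_y * dx + sin_y * dy
        local_y = -sin_y * dx + cos_y * dy
        if (abs(local_x) <= length / 2
                and abs(local_y) <= width / 2
                and abs(dz) <= height / 2):
            inside.append((x, y, z))
    return inside


def label_point_cloud(points: Sequence[Point], boxes: Sequence[Box]) -> List[Point]:
    """Collect every point covered by at least one labelled box."""
    return [p for p in points if any(points_in_box([p], b) for b in boxes)]


if __name__ == "__main__":
    scene = [(0.0, 0.0, 0.0), (1.9, 0.0, 0.0), (0.0, 1.9, 0.0), (5.0, 5.0, 0.0)]
    car_box = (0.0, 0.0, 0.0, 4.0, 2.0, 2.0, 0.0)  # made-up example label
    print(label_point_cloud(scene, [car_box]))
```

Because the selection depends only on the labels, a step like this would be needed only during training, which is consistent with the abstract's statement that the auxiliary network adds no inference cost.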
Keywords/Search Tags: 3D Object Detection, Auxiliary Training, Cross Attention, Multimodal Fusion