
Fusing Camera Images And Millimeter Wave Radar Data On The Bird’s Eye View For 3D Detection In Autonomous Driving

Posted on: 2024-01-16
Degree: Master
Type: Thesis
Country: China
Candidate: Y Zhao
Full Text: PDF
GTID: 2542306932962289
Subject: Computer application technology
Abstract/Summary:
In recent years, autonomous driving has become a prominent research direction. As a prerequisite for decision making and prediction, 3D object detection is an indispensable part of the perception module and remains one of the challenging problems in computer vision. To achieve accurate and robust detection, autonomous vehicles are usually equipped with several types of sensors, each with its own strengths and weaknesses. Cameras provide high-resolution images but perform poorly in low light and bad weather. Millimeter-wave radar emits stable electromagnetic signals in the millimeter band and is robust to such conditions, but the sensors currently in use have low accuracy and provide no longitudinal resolution. This dissertation therefore explores how to fuse camera and millimeter-wave radar data so as to obtain a more reliable and accurate 3D object detection algorithm.

Fusing millimeter-wave radar with a vision sensor poses many difficulties and challenges: the two modalities differ in data scale, viewpoint, and data type; radar measurements are noisy; and the algorithm must remain stable and run in real time. To address these challenges, this dissertation proposes BEV-Radar, a fusion framework for millimeter-wave radar and camera, and makes the following contributions:

(1) To address the alignment problems caused by differences in viewpoint and data scale: radar points are sparse and lack longitudinal resolution, so directly projecting them onto camera images makes the data distribution difficult to learn. This dissertation analyzes front-view radar point projection (FV-Align) against feature alignment in the bird's-eye view (BEV), and shows that when a dense, strong modality is fused with a sparse, weak modality before feature extraction, distorting the data distribution of the sparse modality should be avoided as much as possible. Starting from visual baselines of similar accuracy, two experiments change only the view in which millimeter-wave radar features are extracted, without major modifications to the framework, and demonstrate that BEV is a more appropriate view than the front view for fusing millimeter-wave radar with camera.

(2) To exploit the different characteristics of the two sensor modalities, this dissertation further proposes a Bidirectional Spatial Fusion strategy on the BEV plane, built on a bidirectional attention mechanism. The fusion module extracts attention separately for each modality and fuses it in both directions, and a convolutional module is added to extract locally correlated spatial features, further improving the spatial alignment of the millimeter-wave radar and camera features.

BEV-Radar achieves 48.2 mAP and 57.6 NDS on the test set of the public nuScenes dataset, with real-time inference of up to 10 FPS. Compared with the purely visual network used as the base framework, as well as with other millimeter-wave radar and camera fusion methods, BEV-Radar improves substantially across tasks, especially in velocity regression accuracy. In addition, further experiments across different weather conditions and target distances demonstrate the robustness of the fusion algorithm under the changing scenarios of autonomous driving.
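For illustration only: the abstract describes, but does not publish, the Bidirectional Spatial Fusion module. Below is a minimal, hypothetical PyTorch sketch of the idea as stated above, in which camera and radar BEV feature maps each attend to the other modality and a convolution then refines locally correlated spatial features. All module names, shapes, and design details are assumptions, not the author's implementation.

    import torch
    import torch.nn as nn

    class BidirectionalSpatialFusion(nn.Module):
        # Hypothetical sketch: fuses camera and radar features on a shared BEV grid.
        def __init__(self, channels=256, num_heads=8):
            super().__init__()
            # One cross-attention per direction: camera queries radar, radar queries camera.
            self.cam_attends_radar = nn.MultiheadAttention(channels, num_heads, batch_first=True)
            self.radar_attends_cam = nn.MultiheadAttention(channels, num_heads, batch_first=True)
            # Convolution over the concatenated result to capture local spatial correlation.
            self.local_conv = nn.Sequential(
                nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1),
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
            )

        def forward(self, cam_bev, radar_bev):
            # cam_bev, radar_bev: (B, C, H, W) feature maps already aligned on the BEV grid.
            b, c, h, w = cam_bev.shape
            cam_seq = cam_bev.flatten(2).transpose(1, 2)      # (B, H*W, C)
            radar_seq = radar_bev.flatten(2).transpose(1, 2)  # (B, H*W, C)
            # Bidirectional attention: each modality is enhanced by querying the other.
            cam_enh, _ = self.cam_attends_radar(cam_seq, radar_seq, radar_seq)
            radar_enh, _ = self.radar_attends_cam(radar_seq, cam_seq, cam_seq)
            cam_enh = cam_enh.transpose(1, 2).reshape(b, c, h, w)
            radar_enh = radar_enh.transpose(1, 2).reshape(b, c, h, w)
            # Concatenate the two enhanced maps and refine them locally.
            return self.local_conv(torch.cat([cam_enh, radar_enh], dim=1))

    # Usage on a small BEV grid; real BEV resolutions would be larger.
    fused = BidirectionalSpatialFusion()(torch.randn(1, 256, 32, 32),
                                         torch.randn(1, 256, 32, 32))
    print(fused.shape)  # torch.Size([1, 256, 32, 32])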
Keywords/Search Tags: 3D Object Detection, Millimeter-Wave Radar, Sensor Fusion, Autonomous Driving