Monocular 3D Object Detection Ahead Of Vehicles

Posted on: 2022-03-01
Degree: Master
Type: Thesis
Country: China
Candidate: X F Shu
Full Text: PDF
GTID: 2518306605968089
Subject: Computer Science and Technology
Abstract/Summary:
In recent years, with the development of image processing and deep learning, vision-based object detection has achieved strong performance in both accuracy and speed. In autonomous driving, however, 2D detection results on the image plane alone cannot effectively provide the position and structure of an object in 3D space. Although this spatial information can be obtained from sensors such as lidar, a monocular camera is a more economical and convenient solution. Research on monocular 3D object detection therefore has important practical significance and broad application prospects.

With the wide application of deep learning to 3D object detection, a large number of excellent algorithms have emerged, among which Real-time Monocular 3D Detection from Object Keypoints for Autonomous Driving (RTM3D) performs particularly well. The algorithm predicts the 9 keypoints of the 3D bounding box projected into image space, and then uses the geometric relationship between the 2D and 3D perspectives to recover the dimensions, position and orientation of the object in 3D space. This thesis studies and improves the algorithm to address the problems RTM3D shows in detection. The main contributions are as follows:

(1) To address the poor accuracy with which RTM3D predicts the 9 projected keypoints, a 3D object detection algorithm based on cascaded keypoint regression (R-RTM3D) is proposed. A simple and efficient fully convolutional detection framework is designed to perform an initial regression of the projected keypoints. A region perception module based on deformable convolution is then proposed, which extracts features from the local area around each initial projected keypoint. Based on these regional features, the initial projected keypoints are adaptively refined. Through this cascaded, accurate estimation of the object's projected keypoints, the accuracy of monocular 3D object detection is significantly improved.

(2) To address multi-scale and occluded objects in images, a 3D detection algorithm based on a recurrent criss-cross attention mechanism (RCCA) is proposed. A global information aggregation module based on RCCA is designed to overcome the fixed receptive field of the traditional convolution unit and its inability to extract global information. The module attends adaptively to feature regions according to the situation of each object and effectively captures global scene features for predicting the object's 3D attributes. While maintaining high computational efficiency, it improves 3D detection accuracy for multi-scale objects and effectively alleviates the problems caused by occlusion and truncation.

Combining the proposed cascaded keypoint regression module with RCCA yields a monocular 3D detection algorithm (CR-RTM3D) that can cope with complex scenes. Experiments on the KITTI dataset verify the effectiveness of each module and show that the proposed algorithm significantly improves the accuracy of 3D detection. Based on the research results of this thesis, a 3D object detection system is designed. By detecting the vehicles and pedestrians ahead of the camera in 3D, the system predicts collision events in front of the car.
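
The geometric relationship between the 3D box and its 9 projected keypoints can be illustrated with a minimal sketch. It assumes, beyond what the abstract states, that the nine keypoints are the eight box corners plus the box center, that the box follows KITTI-style conventions (dimensions as height, width, length; location at the bottom center; yaw about the camera Y axis), and that the camera is a simple pinhole model. The function names and the intrinsics used below are hypothetical and chosen only for illustration.

```python
import math


def box_keypoints_3d(dims, location, yaw):
    """Return 9 points (8 corners, then the center) in camera coordinates.

    Assumed conventions (KITTI-style, not taken from the thesis):
    dims = (h, w, l), location = bottom-center (x, y, z),
    camera axes are x right, y down, z forward, yaw rotates about y.
    """
    h, w, l = dims
    tx, ty, tz = location
    # Four bottom corners (y = 0) followed by four top corners (y = -h).
    local = [(sx * l / 2, sy, sz * w / 2)
             for sy in (0.0, -h)
             for sx, sz in ((1, 1), (1, -1), (-1, -1), (-1, 1))]
    local.append((0.0, -h / 2, 0.0))  # box center as the ninth keypoint
    c, s = math.cos(yaw), math.sin(yaw)
    # Rotate about the y axis, then translate to the object location.
    return [(c * x + s * z + tx, y + ty, -s * x + c * z + tz)
            for x, y, z in local]


def project(points, fx, fy, cx, cy):
    """Pinhole projection of camera-frame points to pixel coordinates."""
    return [(fx * x / z + cx, fy * y / z + cy) for x, y, z in points]


# Hypothetical car-sized box 20 m ahead of a camera with made-up intrinsics.
keypoints = project(
    box_keypoints_3d(dims=(1.5, 1.6, 4.0), location=(0.0, 1.0, 20.0), yaw=0.0),
    fx=700.0, fy=700.0, cx=600.0, cy=180.0,
)
```

A detector of this kind works in the opposite direction: it predicts the nine image points and then searches for the dimensions, location and yaw whose projection best matches them. The abstract does not say how that inverse step is solved, so the sketch covers only the forward projection.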
Keywords/Search Tags: Monocular vision, 3D object detection, Keypoints, Multi-scale, Attention