| Environmental perception is the first link in autonomous driving: it is how the in-vehicle system quickly senses and understands its surroundings. Sensing refers to the hardware responsible for collecting information about the surrounding environment, and cognition refers to the algorithms' understanding of the information that hardware collects. 3D object detection means detecting physical objects from sensor data and predicting and estimating physical attributes such as target category, bounding box, and spatial location. It is the core of the perception system and of scene understanding, and the basis of decision-making and control tasks such as path planning, motion prediction, and emergency evasion. In recent years, breakthroughs in deep learning for object detection and depth estimation have made deep-learning-based detection, recognition, and segmentation algorithms perform outstandingly, providing new ideas and methods for the study of perception systems. With the development of sensor technology, such as the growing number of equivalent lidar lines, significantly lower cost, and widespread adoption, depth information can be supplied to the perception system in a variety of ways. Detection performance with a single sensor has gradually come to surpass that of multi-sensor setups, which avoids technical difficulties of multi-sensor fusion such as temporal synchronization and viewing-angle inconsistency, and has greatly promoted the development of 3D object detection technology. On this basis, this paper uses lidar point cloud data and camera image data separately, combined with deep learning methods, to perform 3D object detection on sparse point cloud scenes and on depth-estimated 3D image data, and carries out theoretical analysis, method validation, and result analysis. The main research contents are as follows: (1) Classical 3D object detection algorithms and the state of research at home and
abroad are surveyed. First, the history and principles of deep-learning-based 3D object detection algorithms are reviewed, and the reasons for their performance improvements are analyzed. Then, typical 2D and 3D object detection algorithms are analyzed in detail. The working principles and types of the mainstream sensors (camera and lidar) used in autonomous driving are studied. The main algorithms are classified and compared by data type and by data representation and processing method. The advantages and disadvantages of the various methods for autonomous driving are analyzed, and possible future directions for 3D object detection algorithms are discussed. (2) 3D object detection in voxelized scenes. To address the sparsity and large volume of lidar point cloud data in target scenes, this paper improves a 3D object detection algorithm for voxelized scenes. The algorithm divides the target space into voxels. A 3D backbone network based on sparse convolution is proposed that quickly converts the voxels into 2D data in the form of column voxels, which improves the algorithm's training and detection speed. The 2D information is processed by a 2D backbone network. At the same time, voxel features at different scales from the 3D backbone network and the detection results of the 2D backbone network are fed into a multi-scale voxel feature aggregation module, so that the spatial information of the point cloud is fully learned. The result is further refined by the loss function to predict the location and category of the target. Experiments on datasets show that the algorithm achieves a balance between speed and accuracy and is effective at recognizing large targets. (3) 3D object detection from monocular images based on depth estimation. A 3D object detection algorithm based on camera image data is presented to address the poor robustness of the perception system caused by over-reliance on a single sensor (lidar) in autonomous
driving. The algorithm estimates the depth of each pixel in a single camera image to obtain data in the form of a pseudo point cloud. This is then transformed into a voxel grid, and a voxel-based 3D object detection algorithm is used for object detection. The final results show that 3D object detection based on monocular images still has advantages and can be an effective complement to 3D object detection based on lidar point clouds. |
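
The pseudo point cloud and voxel grid steps described in (3) can be illustrated with a minimal NumPy sketch. It is not the method of this work: it assumes a simple pinhole camera model, and the function names (`depth_to_pseudo_point_cloud`, `voxelize`), the intrinsics (`fx`, `fy`, `cx`, `cy`), and the grid parameters are hypothetical, chosen only for illustration. The depth map here is a constant array standing in for the output of a depth-estimation network.

```python
import numpy as np

def depth_to_pseudo_point_cloud(depth, fx, fy, cx, cy):
    """Back-project an (H, W) depth map into an (N, 3) point cloud in the
    camera frame, assuming a pinhole model (an assumption of this sketch)."""
    h, w = depth.shape
    # u indexes image columns, v indexes image rows; both have shape (H, W).
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    # Drop pixels without a valid (positive) depth estimate.
    return points[points[:, 2] > 0]

def voxelize(points, voxel_size, range_min, range_max):
    """Return integer coordinates of occupied voxels and per-voxel point counts."""
    voxel_size = np.asarray(voxel_size, dtype=np.float64)
    range_min = np.asarray(range_min, dtype=np.float64)
    range_max = np.asarray(range_max, dtype=np.float64)
    # Keep only points inside the detection range.
    inside = np.all((points >= range_min) & (points < range_max), axis=1)
    idx = np.floor((points[inside] - range_min) / voxel_size).astype(np.int64)
    # Each unique index triple is one occupied voxel.
    coords, counts = np.unique(idx, axis=0, return_counts=True)
    return coords, counts

# Toy input: a 4 x 6 depth map at 10 m with one invalid pixel.
depth = np.full((4, 6), 10.0)
depth[0, 0] = 0.0
pts = depth_to_pseudo_point_cloud(depth, fx=2.0, fy=2.0, cx=3.0, cy=2.0)
coords, counts = voxelize(
    pts,
    voxel_size=(5.0, 5.0, 5.0),
    range_min=(-20.0, -20.0, 0.0),
    range_max=(20.0, 20.0, 20.0),
)
```

In a full pipeline, the occupied voxel coordinates and the points that fall in each voxel would be passed to a voxel-based detector; the sketch stops at the grid indices because the abstract does not specify the detector's input format.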