| 3D point cloud vehicle detection aims to identify and localize the points that represent vehicles in 3D point cloud data. It is a key technology for accurately perceiving the surrounding environment and avoiding vehicle collisions in autonomous driving scenes. Because of factors such as uneven 3D spatial sampling, limited sensor range, and occlusion, the point cloud data obtained by lidar is sparse and has a highly variable point density, so mature 2D convolutional neural network architectures cannot be applied to point clouds directly. Current 3D point cloud detection methods usually divide the sparse, unordered point cloud space into a regular form, use 3D sparse convolution to extract point cloud features, and finally convert those features into a 2D bird’s eye view that a convolutional neural network can process. However, converting the point cloud features to a bird’s eye view loses part of the feature information and degrades the overall detection results. To address these problems, this thesis introduces an attention mechanism module to improve the utilization of feature information in the bird’s eye view, and proposes a voxel-based region of interest pooling module to reduce the loss of 3D structural information during the conversion. The specific research work is as follows. First, a single-stage 3D point cloud vehicle detection algorithm based on an attention mechanism is proposed. The algorithm converts the sparse, unordered point cloud data into a regular voxel grid, uses 3D sparse convolution and an auxiliary network to extract point cloud features from the voxel grid, and finally converts them into a bird’s eye view that a convolutional neural network can process. To further extract vehicle feature information in the bird’s eye view, this thesis introduces a channel-spatial attention module that amplifies the vehicle information and suppresses the non-key 
information in the bird’s eye view. Finally, a convolutional neural network and a warp transformation mechanism are used to perform detection on the optimized bird’s eye view. The proposed algorithm is evaluated on the KITTI dataset. At the moderate difficulty level, it reaches 80.30% accuracy in 3D vehicle detection and 93.18% accuracy in vehicle orientation prediction, giving better orientation prediction and higher detection accuracy than strong existing algorithms. Second, a two-stage 3D point cloud vehicle detection algorithm based on voxel region of interest pooling feature aggregation is proposed. 3D structure is crucial for 3D detectors, and bird’s eye view representations alone do not provide enough information to predict bounding boxes accurately. To reduce the loss of 3D structural information when converting from 3D to 2D, this thesis proposes a voxel-based two-stage detection algorithm. The core of the algorithm is to use the region of interest pooling module in the second stage to extract aggregated voxel features and feed them to the detection head to further refine the detection results. The algorithm achieves 81.68% accuracy in 3D vehicle detection at the moderate level and 94.64% accuracy in vehicle orientation prediction. The experimental results show that the two-stage detection algorithm achieves strong results on the KITTI dataset, exceeding the single-stage algorithm on both metrics at the moderate level. |
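
The voxelization and bird’s eye view conversion described above can be illustrated with a minimal sketch. This is not the thesis implementation: the function names `voxelize` and `to_bev`, the 1 m voxel size, and the three sample points are assumptions made for illustration, and point counts stand in for learned features. The sketch shows why the conversion loses 3D structure: two points at different heights fall into separate voxels but collapse into the same bird’s eye view cell.

```python
from collections import defaultdict


def voxelize(points, voxel_size, origin=(0.0, 0.0, 0.0)):
    """Group (x, y, z) points by the integer index of the voxel containing them."""
    voxels = defaultdict(list)
    for x, y, z in points:
        idx = (
            int((x - origin[0]) // voxel_size[0]),
            int((y - origin[1]) // voxel_size[1]),
            int((z - origin[2]) // voxel_size[2]),
        )
        voxels[idx].append((x, y, z))
    return dict(voxels)


def to_bev(voxels):
    """Collapse the height axis: map each (ix, iy) cell to its total point count."""
    bev = defaultdict(int)
    for (ix, iy, _iz), pts in voxels.items():
        bev[(ix, iy)] += len(pts)
    return dict(bev)


# Hypothetical sample points (not KITTI data): two stacked in height, one elsewhere.
points = [(0.1, 0.1, 0.1), (0.1, 0.1, 1.5), (2.3, 0.4, 0.2)]
voxels = voxelize(points, voxel_size=(1.0, 1.0, 1.0))
bev = to_bev(voxels)

print(len(voxels))  # number of occupied voxels
print(bev)          # per-cell point counts after dropping height
```

In this toy example the three points occupy three voxels but only two bird’s eye view cells, which is the kind of height information a voxel-based region of interest pooling stage would still have access to.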