| In recent years, autonomous driving and computer vision have continued to grow in popularity, and object detection is the top priority in both fields. Autonomous vehicles usually rely on lidar together with multiple optical lenses and cameras to detect objects, because lidar collects depth information in the form of point clouds, which is more accurate and more three-dimensional than the information obtained from two-dimensional images and gives a deeper understanding of the environment. However, current methods that use point cloud data for 3D object detection suffer from large parameter counts, heavy computation and long detection times. They still fall short of real-time detection and therefore cannot be applied to real scenes. To address these problems, this thesis makes lightweight improvements to the 3D object detection network while maintaining detection accuracy. The main research contents of this thesis are as follows: (1) A lightweight 3D object detection algorithm based on a separable convolutional network (Complex-YOLOv4-Deep Separation Ghost, Complex-YOLOv4-DSG) is proposed. First, the efficiency of constructing the bird's-eye view is improved by reducing the size of the two-dimensional grid, so as to make full use of the maximum height, maximum intensity and maximum density channel features of the bird's-eye view. Because the current feature extraction network, YOLOv4-tiny, has a large number of parameters and a large amount of computation, this thesis proposes a lightweight improvement to the YOLOv4-tiny backbone: depthwise separable convolution replaces the standard convolution in the residual unit, the residual unit is changed to an inverted residual unit, and finally channel-wise feature map concatenation is used, which greatly reduces the overall computation of the network and enriches the feature information. (2) To address the low detection accuracy of the improved lightweight 3D object detection method, this thesis proposes adding a dual attention mechanism to the improved lightweight network before performing 3D object detection. After each channel is assigned a different weight, the coordinate attention mechanism embeds position information into the channels, so that the network attends to the more important features while spatial position information is also added to the features, giving the regression network richer feature information. The loss function of the network is also changed: the MSE (Mean Squared Error) loss is replaced with the GIoU (Generalized Intersection over Union) loss to speed up convergence. Experiments on the KITTI dataset show that the improved attention-based algorithm performs better and significantly improves the average accuracy of 3D object detection, especially for cars and bicycles, with gains of 2.16% and 4.09%, respectively. |
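
As an illustration of the GIoU loss mentioned above, the following minimal sketch computes GIoU and the corresponding loss (1 - GIoU) for two boxes. It is not the thesis implementation: it assumes axis-aligned boxes given as `(x1, y1, x2, y2)` with positive area, whereas boxes in a bird's-eye-view detector may be rotated, and the function names `giou` and `giou_loss` are hypothetical.

```python
def giou(box_a, box_b):
    """Generalized IoU of two axis-aligned boxes (x1, y1, x2, y2).

    Assumption: both boxes have positive area (x2 > x1 and y2 > y1).
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    if area_a <= 0 or area_b <= 0:
        raise ValueError("boxes must have positive area")

    # Intersection rectangle (width and height clamp to zero if disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = area_a + area_b - inter

    # Smallest axis-aligned box enclosing both inputs.
    enclose_w = max(ax2, bx2) - min(ax1, bx1)
    enclose_h = max(ay2, by2) - min(ay1, by1)
    enclose = enclose_w * enclose_h

    iou = inter / union
    # GIoU subtracts the fraction of the enclosing box not covered by the union.
    return iou - (enclose - union) / enclose


def giou_loss(box_pred, box_true):
    """GIoU loss: 0 for a perfect match, up to 2 for distant boxes."""
    return 1.0 - giou(box_pred, box_true)
```

Unlike plain IoU, which is zero for every pair of non-overlapping boxes, GIoU becomes more negative as the boxes move apart, so the loss still distinguishes near misses from distant ones when the prediction does not overlap the target.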