The autonomous driving technology of intelligent vehicles has attracted extensive attention from industry and academia. Autonomous perception is a key component of autonomous driving: its task is to provide structured, semantic information to the decision-making and planning modules, including perception of the external environment and of the ego vehicle's states. Neural networks and deep learning have achieved breakthroughs in image processing, providing new ideas and methods for research on autonomous driving perception. However, autonomous driving perception relies heavily on the point cloud scanned by the LiDAR, so the structural design of perception networks must account for the sparse and disordered characteristics of point clouds; new network structures for point cloud processing are therefore needed. In complex traffic environments, exploiting the complementary characteristics of cameras and LiDARs can better handle changeable outdoor lighting and weather conditions and thus further improve the performance of perception algorithms. However, images and point clouds differ greatly in their internal properties and representations, and resolving these differences is a major challenge in designing fusion networks. To support a variety of autonomous driving perception tasks, designing a general structure for bidirectional fusion between images and point clouds is one of the urgent open problems in this field. Neural networks are powerful at information extraction and processing, but their internal computation is a black box; how to divide autonomous perception into multiple subtasks, and how to effectively combine neural networks with interpretable algorithms (such as observer techniques), are also very challenging.

Based on the state-of-the-art, this thesis explores the use of deep learning and observer techniques to address the above problems in autonomous driving perception, mainly covering 3D object detection from LiDAR point clouds, bidirectional fusion between images and point clouds, and estimation of the motions and states of the ego and target vehicles. The main work and contributions are summarized as follows:

· Research on a 3D object detection network based on point clouds. Fully considering the sparse and disordered characteristics of point clouds, this thesis builds a 3D object detection network, 3D-CenterNet, on PointNet++. The network takes the raw point cloud as input, gathers the sparse features of the point cloud by first estimating the center points of objects, and then improves the estimation accuracy and enlarges the receptive fields of the center features through multiple regressions of the center positions. Finally, the aggregated center features are used to regress high-quality bounding boxes. Experiments on the KITTI autonomous driving benchmark show that 3D-CenterNet is a general 3D object detection network, achieving leading performance among networks of the same type on the "Car" and "Cyclist" categories.

· Research on an image and point cloud fusion network. A bidirectional fusion network between images and point clouds, PI-Net, is proposed to solve multiple tasks simultaneously. First, the images and point clouds are aligned through the pre-calibrated camera-LiDAR extrinsic parameters. Unified representations of images and point clouds are then obtained by aggregating points to pixels and interpolating pixels back to points, so that image features and point cloud features can be converted into each other to complete the bidirectional fusion. By designing different sub-networks that share the fused features, PI-Net can complete various autonomous driving perception tasks. Results on the road detection and 3D object detection tasks of the public KITTI benchmark and on the point cloud segmentation task of the SemanticKITTI dataset show that PI-Net effectively realizes bidirectional fusion between images and point clouds, achieving leading performance compared with existing methods on these tasks.

· Research on pose and velocity estimation of the ego and target vehicles. A neural network is proposed to estimate the poses of an external target vehicle and the ego vehicle: the points belonging to foreground objects and the points in the background environment are segmented to estimate the motions of the target vehicle and the ego vehicle, respectively. Reduced-order observers are then designed to estimate the velocities of the two vehicles, and their convergence is proved by Lyapunov's method. Urban traffic simulations in MATLAB and real-vehicle experiments in a campus scene verify the effectiveness of the proposed algorithm.
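The camera-LiDAR alignment that underlies the point-to-pixel aggregation can be sketched as a standard pinhole projection through the calibrated extrinsics and intrinsics. The matrices below are illustrative placeholders, not the calibration actually used in the thesis:

```python
import numpy as np

# Illustrative calibration; real extrinsics/intrinsics come from an
# offline camera-LiDAR calibration procedure.
K = np.array([[700.0,   0.0, 320.0],
              [  0.0, 700.0, 240.0],
              [  0.0,   0.0,   1.0]])   # camera intrinsic matrix
T = np.eye(4)                           # LiDAR -> camera extrinsic transform

def project_points(points_lidar, K, T):
    """Project Nx3 LiDAR points to pixel coordinates (u, v) and depth."""
    n = points_lidar.shape[0]
    pts_h = np.hstack([points_lidar, np.ones((n, 1))])  # homogeneous coords
    pts_cam = (T @ pts_h.T).T[:, :3]                    # into camera frame
    uvw = (K @ pts_cam.T).T                             # pinhole projection
    depth = uvw[:, 2]
    uv = uvw[:, :2] / depth[:, None]                    # perspective division
    return uv, depth

# A point 10 m along the optical axis lands at the principal point (320, 240).
uv, depth = project_points(np.array([[0.0, 0.0, 10.0]]), K, T)
```

Once each point has a pixel coordinate, point features can be scattered (aggregated) into the image grid, and image features can be sampled (interpolated) back to the points, which is the correspondence a bidirectional fusion scheme relies on.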
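The reduced-order observer idea can be illustrated on a toy 1-D double-integrator vehicle model (position measured, velocity estimated); the model and gain here are assumptions for illustration, not the observer designed in the thesis. With estimate v_hat = z + L*x and internal dynamics z_dot = -L*v_hat + a, the error e = v - v_hat obeys e_dot = -L*e, so the Lyapunov function V = e^2/2 gives V_dot = -L*e^2 <= 0 and the estimate converges exponentially:

```python
# Minimal sketch of a reduced-order velocity observer for a 1-D
# double-integrator: x_dot = v, v_dot = a, with only x measured.
def simulate_observer(L=5.0, dt=0.001, steps=5000):
    x, v = 0.0, 2.0        # true position and (unknown) true velocity
    z = 0.0                # observer state; velocity estimate is z + L*x
    a = 0.5                # known acceleration input
    for _ in range(steps):
        v_hat = z + L * x              # current velocity estimate
        z += (-L * v_hat + a) * dt     # observer: z_dot = -L*(z + L*x) + a
        x += v * dt                    # integrate the true plant
        v += a * dt
    return v, z + L * x                # true and estimated final velocity
```

In this discretization the estimation error contracts by a factor (1 - L*dt) per step, so after 5 s the estimate has converged to the true velocity to within numerical precision.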