
Research On Driving Scene Perception And Modeling Using Multi-sensor Fusion For Autonomous Vehicles

Posted on: 2022-11-07
Degree: Doctor
Type: Dissertation
Country: China
Candidate: K W Wang
Full Text: PDF
GTID: 1522306818963259
Subject: Vehicle Engineering
Abstract/Summary:
Accurate and efficient perception and modeling of the surrounding environment of intelligent vehicles is a necessary prerequisite for realizing high-level autonomous driving. It is a concentrated expression of the degree of vehicle intelligence and is one of the topics currently attracting wide attention in the field of intelligent vehicles. Roads of various structures, together with dynamic and static traffic elements of diverse types and uncertain behavior, constitute a regular yet changeable driving environment. To meet the needs of real-time dynamic decision-making and planning of intelligent vehicles, it is necessary to study efficient and accurate comprehensive understanding and hierarchical representation of the highly diverse and uncertain local driving environment, based on decoupled perception of its dynamic and static elements. Aiming at problems such as model under-expression caused by complex occlusion relationships among environmental elements, and the lack of semantic relationships between elements, the goal is a redundant and enhanced comprehensive understanding of local driving scenes. A multi-level local driving scene model comprising a road grid map, a road height grid map, and dynamic and static semantic grid maps is constructed based on deep learning methods. The main research contents of this thesis are as follows:

(1) Aiming at the lack of training data required for local driving environment modeling and the low efficiency of manual labeling, a semi-automatic training data generation method based on public datasets is studied. First, the occlusion-free road segmentation labels of the public KITTI semantic segmentation dataset were expanded to construct an occlusion-free road segmentation dataset for training and testing the road grid map model. Second, a method for constructing a multi-task road perception dataset from the public SemanticKITTI dataset was proposed, using multi-frame aggregation and image registration. Finally, a method for constructing a semantic grid map dataset from the public SemanticKITTI dataset was proposed, using multi-frame aggregation and density clustering.

(2) Aiming at the under-expression caused by occlusion when constructing road grid maps from visual images, a method based on occlusion-free road segmentation is studied. First, an occlusion-free road segmentation model based on a deep convolutional neural network is constructed to solve the problem of roads being occluded by on-road objects, achieving complete road detection in the image perspective. The model was trained and tested on the occlusion-free road segmentation dataset constructed in (1). Experimental results show that the proposed model can effectively learn the occlusion mechanism in driving scenes, reaching a precision of 93.2%. On this basis, an inverse perspective transformation is applied to the occlusion-free road detection results in the perspective view to construct a road grid map in the bird's-eye view, yielding a complete road-area representation suitable for the planning and control of intelligent vehicles.

(3) Aiming at the comprehensiveness, accuracy, and real-time requirements of road perception, a multi-task road perception method based on LiDAR point clouds is studied. To obtain complete road area, road height, and road type simultaneously, a cascaded, shared, lightweight multi-task road perception model is built on a convolutional neural network, realizing joint learning of occlusion-free road segmentation, road height estimation, road type recognition, and other tasks. The proposed method understands roads at different levels, covering road representations from concrete to abstract, and adapts better to bumpy roads. Experiments on the multi-task road perception dataset constructed in (1) show that the occlusion-free road segmentation task achieves an accuracy of 96.8%, the dense road height estimation task obtains an L1 error of 6 cm, and the road topology recognition task reaches an MIoU of 73.6%. Compared with existing related research, the proposed multi-task joint learning architecture is efficient, real-time, and flexible, and is more applicable to the practical deployment of intelligent vehicles.

(4) Aiming at the lack of semantic association between environmental elements when modeling complex driving scenes, a method for constructing dynamic and static semantic grid maps based on LiDAR-camera fusion is studied. An end-to-end convolutional neural network combining early fusion and mid-term fusion is proposed, establishing a complementary and mutually enhancing multi-level perception model at both the raw-data and feature levels. In the early fusion stage, point-by-point feature matching and feature transformation are performed between the point cloud and the image, followed by feature combination. In the mid-term fusion stage, given the sparsity of LiDAR point cloud features, multi-scale image features are fused cell by cell based on the estimation results of candidate grids. The model is trained and tested on the semantic grid map dataset constructed in (1). Experimental results show that, compared with the baseline model, the fusion method improves the dynamic and static semantic grid map construction tasks by 2.3% and 1.2%, respectively. The proposed method effectively uses multi-modal perception information to build a semantic grid map that separates dynamic and static elements, achieving an efficient and accurate representation of local driving scenes.

On the basis of the above research, this thesis uses evidence-based reasoning methods to integrate and optimize the constructed layered local driving scene model. Tests in typical driving scenes such as closed parks, urban roads, and highways were carried out on the constructed intelligent vehicle research platform. Experimental results show that the proposed method can handle multiple types of driving scenes and achieves a redundant and enhanced comprehensive understanding of local driving scenes.
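The multi-frame aggregation used in (1) to build the road perception and semantic grid map datasets is, at its core, a matter of transforming each frame's LiDAR points into a common world frame with the ego pose and concatenating them. The thesis gives no implementation details; the following is a minimal sketch assuming 4x4 homogeneous ego-pose matrices (as SemanticKITTI provides), with all function names chosen for illustration only.

```python
import numpy as np

def aggregate_frames(frames, poses):
    """Fuse per-frame LiDAR point clouds into one world-frame cloud.

    frames: list of (N_i, 3) arrays, points in each frame's sensor coordinates.
    poses:  list of (4, 4) homogeneous ego poses (sensor -> world).
    Returns a single (sum N_i, 3) aggregated cloud.
    """
    clouds = []
    for pts, T in zip(frames, poses):
        # Lift to homogeneous coordinates, apply the pose, drop the w component.
        homo = np.hstack([pts, np.ones((len(pts), 1))])
        clouds.append((homo @ T.T)[:, :3])
    return np.vstack(clouds)
```

Dense labels (for road height or semantic grids) would then be derived from this accumulated cloud, e.g. by the density clustering the abstract mentions, rather than from any single sparse sweep.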
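The inverse perspective transformation in (2) maps the occlusion-free road mask from the image plane onto the ground plane to obtain the bird's-eye-view road grid map. A standard flat-ground formulation back-projects each road pixel through the camera model and intersects the ray with the plane z = 0. The sketch below assumes a known intrinsic matrix K and a camera-to-world rotation R and position t; all names and the grid extents are illustrative, not taken from the thesis.

```python
import numpy as np

def ipm_point(u, v, K, R, t):
    """Back-project pixel (u, v) onto the ground plane z = 0.

    K: (3, 3) camera intrinsics; R: (3, 3) camera-to-world rotation;
    t: (3,) camera centre in world coordinates. Returns ground (x, y).
    """
    ray = np.linalg.inv(K) @ np.array([u, v, 1.0])  # viewing ray, camera frame
    ray_w = R @ ray                                 # viewing ray, world frame
    s = -t[2] / ray_w[2]                            # scale that hits z = 0
    p = t + s * ray_w
    return p[0], p[1]

def mask_to_bev_grid(mask, K, R, t, res=0.1, x_range=(0.0, 40.0), y_range=(-10.0, 10.0)):
    """Rasterise a perspective-view road mask into a BEV occupancy grid."""
    nx = int((x_range[1] - x_range[0]) / res)
    ny = int((y_range[1] - y_range[0]) / res)
    grid = np.zeros((nx, ny), dtype=np.uint8)
    vs, us = np.nonzero(mask)
    for u, v in zip(us, vs):
        x, y = ipm_point(u, v, K, R, t)
        ix = int((x - x_range[0]) / res)
        iy = int((y - y_range[0]) / res)
        if 0 <= ix < nx and 0 <= iy < ny:
            grid[ix, iy] = 1
    return grid
```

Pixels at or above the horizon make `ray_w[2]` non-negative and fall outside the grid bounds; a production version would mask those out before the loop and fill small rasterisation gaps, but the geometry is the same.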
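The point-by-point feature matching in the early fusion stage of (4) amounts to projecting each LiDAR point into the camera image and gathering the image feature at the projected pixel, so that every point carries both geometric and appearance information. The thesis does not specify this step's implementation; the following is a plausible sketch in which `T_cam_lidar` (the LiDAR-to-camera extrinsic) and the nearest-pixel sampling are assumptions.

```python
import numpy as np

def point_image_features(points, feat_map, K, T_cam_lidar):
    """Attach per-pixel image features to LiDAR points (early-fusion step).

    points:      (N, 3) LiDAR points, assumed in front of the camera.
    feat_map:    (H, W, C) image feature map (or the RGB image itself).
    K:           (3, 3) camera intrinsics.
    T_cam_lidar: (4, 4) LiDAR-to-camera transform.
    Returns an (N, 3 + C) array of [x, y, z, feature...] rows.
    """
    homo = np.hstack([points, np.ones((len(points), 1))])
    cam = (homo @ T_cam_lidar.T)[:, :3]          # points in camera frame
    uv = cam @ K.T
    uv = uv[:, :2] / uv[:, 2:3]                  # perspective divide
    h, w = feat_map.shape[:2]
    u = np.clip(uv[:, 0].astype(int), 0, w - 1)  # nearest-pixel sampling
    v = np.clip(uv[:, 1].astype(int), 0, h - 1)
    return np.hstack([points, feat_map[v, u]])
```

The mid-term (feature-level) fusion the abstract describes would instead operate per BEV cell on learned multi-scale features, which requires the full network and is not sketched here.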
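The evidence-based reasoning used to integrate the layered scene model is commonly realised with Dempster-Shafer theory, where each map layer contributes a belief mass over cell states and the masses are combined per cell. The abstract does not name the exact rule, so the sketch below shows the classical Dempster combination over a two-hypothesis frame {occupied, free} with an explicit ignorance mass; treat it as one plausible instantiation, not the thesis's method.

```python
def dempster_combine(m1, m2):
    """Combine two belief masses over {occupied, free} by Dempster's rule.

    Each mass is a dict with keys 'occ', 'free' and 'unk' (ignorance,
    i.e. mass on the whole frame), summing to 1.
    """
    # Mass assigned to contradictory intersections (occ vs. free).
    conflict = m1['occ'] * m2['free'] + m1['free'] * m2['occ']
    k = 1.0 - conflict  # normaliser discarding the conflicting mass
    occ = (m1['occ'] * m2['occ'] + m1['occ'] * m2['unk'] + m1['unk'] * m2['occ']) / k
    free = (m1['free'] * m2['free'] + m1['free'] * m2['unk'] + m1['unk'] * m2['free']) / k
    unk = (m1['unk'] * m2['unk']) / k
    return {'occ': occ, 'free': free, 'unk': unk}
```

Applied cell by cell across the road, height, and semantic layers, agreeing sources reinforce each other while conflicting evidence is renormalised away, which is one way to obtain the redundant and enhanced scene understanding the tests demonstrate.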
Keywords/Search Tags:Intelligent vehicle, environment perception, semantic grid map, multi-sensor fusion, convolutional neural network, multi-task learning