Autonomous driving systems are divided into three major parts: environment perception, planning and decision-making, and vehicle control. Lane detection, as the most fundamental environment perception task, plays a significant role in ensuring the safety of autonomous driving. Traditional lane detection techniques rely on hand-crafted features and geometric mathematical models, which are sensitive to environmental changes and exhibit poor generalization and robustness in real-world scenarios. Deep learning methods, with their powerful representation learning abilities and massive data resources, can cope with complex and variable lane scenarios. However, 2D lane detection still suffers from network structures ill-suited to scenes with weak appearance cues, complex model predictions that incur a high computational load, and the limitation of handling only scenes with fixed lane structures. 3D lane detection faces challenges such as the low accuracy of three-dimensional information recovered through virtual-space transformations, and the inconsistency between the 2D and 3D detection tasks. To address these issues, this thesis conducts research on lane detection algorithms based on deep learning, focusing on both the 2D and 3D detection tasks. The main contributions are as follows:

(1) For the 2D lane detection task, this thesis abstracts lane lines as a series of discrete key points and proposes FPLane, a keypoint-based 2D lane detection method with parallel multi-scale feature aggregation. The core of FPLane is to localize key points precisely across all lanes in the image and then aggregate these global detections into local geometric models of individual lanes using the idea of associative embedding (sketched after this abstract). Moreover, this work proposes a parallel multi-scale feature aggregation network (MFANet) within FPLane to handle complex scenes with weak appearance cues. MFANet integrates global spatial information from multi-scale feature maps, fully exploits prior information from neighboring lanes, and captures lane-related spatial relationships, enriching the feature representation even under occlusion or unclear markings. The proposed method achieves state-of-the-art performance on two mainstream datasets, with scores of 96.82% and 75.2%, respectively, while maintaining real-time inference at 28 ms.

(2) For the 3D lane detection task, this thesis introduces a Transformer to focus on local lane representations and uses the camera intrinsic and extrinsic parameters to apply an inverse perspective transformation that generates fine-grained bird's-eye-view spatial features (a geometric sketch of this transformation also follows the abstract), yielding 3D Lane-ViT, a multi-scale Transformer method for 3D lane detection. The method constructs a multi-scale Vision Transformer that models the feature-space transformation as a learnable process; its internal attention mechanisms capture salient features of local regions in the front view as well as the interactions between the front and top views. Additionally, a new geometric anchor representation for 3D lanes is designed to align better with the anchors used in 2D detection, unifying 2D and 3D lane detection within a single framework. The proposed method was validated on two large-scale 3D lane detection datasets, achieving scores of 51.9% and 72.09% on the 3D detection task, respectively, outperforming other models; on the 2D detection task, it surpasses the baseline method by 12%.
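To make the associative-embedding grouping in contribution (1) concrete, the following is a minimal sketch of how detected key points could be clustered into lane instances by the distance between their learned embedding vectors. The function name, the greedy matching scheme, and the threshold value are illustrative assumptions, not FPLane's actual implementation.

```python
import numpy as np

def group_keypoints_by_embedding(keypoints, embeddings, threshold=0.5):
    """Greedy associative-embedding grouping (illustrative, not FPLane's code).

    keypoints:  (N, 2) array of (x, y) key-point locations.
    embeddings: (N, D) array of per-point embedding vectors predicted by the
                network; points on the same lane should lie close together.
    threshold:  maximum embedding distance to an existing lane's mean.
    Returns a list of lanes, each a list of key-point indices.
    """
    lanes, lane_means = [], []
    for i, emb in enumerate(embeddings):
        # Distance from this point's embedding to each lane's running mean.
        dists = [np.linalg.norm(emb - m) for m in lane_means]
        if dists and min(dists) < threshold:
            j = int(np.argmin(dists))
            lanes[j].append(i)
            # Update the running mean embedding of the matched lane.
            lane_means[j] = embeddings[lanes[j]].mean(axis=0)
        else:
            lanes.append([i])            # start a new lane instance
            lane_means.append(emb.copy())
    return lanes
```

During training, such embeddings are typically learned with a pull term that draws embeddings on the same lane together and a push term that separates different lanes, which is what makes the simple distance test above meaningful at inference time.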
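The inverse perspective transformation in contribution (2) can be understood geometrically as a homography from the image plane to a flat ground plane (z = 0), built from the camera intrinsics K and extrinsics [R | t]. The sketch below warps a front-view feature map onto a bird's-eye-view grid under this flat-ground assumption; the grid ranges, resolution, and nearest-neighbour sampling are illustrative choices, not the exact design of 3D Lane-ViT.

```python
import numpy as np

def ipm_to_bev(feat, K, R, t, x_range=(-10, 10), y_range=(3, 103), bev_shape=(200, 48)):
    """Warp front-view features to a BEV grid via a ground-plane homography.

    Illustrative sketch: assumes a flat ground plane z = 0 in world
    coordinates; feat is an (H, W, C) front-view feature map, K the 3x3
    intrinsics, and R / t the world-to-camera rotation and translation.
    """
    # Homography mapping ground points (X, Y, 1) to homogeneous image pixels.
    H = K @ np.column_stack([R[:, 0], R[:, 1], t])

    h_bev, w_bev = bev_shape
    xs = np.linspace(*x_range, w_bev)       # lateral ground coordinates
    ys = np.linspace(*y_range, h_bev)       # longitudinal ground coordinates
    gx, gy = np.meshgrid(xs, ys)
    ground = np.stack([gx, gy, np.ones_like(gx)], axis=-1)  # (h_bev, w_bev, 3)

    # Project every BEV cell centre into the image and normalise.
    pix = ground @ H.T
    u = pix[..., 0] / pix[..., 2]
    v = pix[..., 1] / pix[..., 2]

    # Nearest-neighbour sampling, keeping only cells that land in the image.
    h_img, w_img, c = feat.shape
    ui = np.clip(np.round(u).astype(int), 0, w_img - 1)
    vi = np.clip(np.round(v).astype(int), 0, h_img - 1)
    valid = (u >= 0) & (u < w_img) & (v >= 0) & (v < h_img)
    return np.where(valid[..., None], feat[vi, ui], 0.0)   # (h_bev, w_bev, C)
```

In 3D Lane-ViT this fixed geometric mapping is only a starting point: the abstract describes the feature-space transformation as a learnable process, with attention modelling the interactions between the front and top views rather than relying on the flat-ground assumption alone.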