| Autonomous driving is an important application of deep learning and has attracted widespread attention in both industry and academia. The ability to perceive road elements is a cornerstone of autonomous driving, and lane markings are an important component of road traffic signs. Because road scenes are complex and lane markings deform to varying degrees under different perspectives, accurate lane detection still faces many challenges. Early lane detection methods mainly evolved from bottom-up semantic segmentation, but because the number of lane markings varies widely across scenes, such cluster-based methods often require complex post-processing modules and perform poorly overall. More recently, methods based on row classification have simplified the lane detection task by relying on human-specified prior assumptions, achieving good results on public datasets. However, once the application scenario violates those assumptions, the model no longer works properly. To address these problems, this thesis proposes new algorithms and models based on multi-keypoint representation and multi-scale feature fusion. The main research content is as follows: 1. A keypoint detection method under curved guide line constraints is proposed to improve the model's localization and recall of keypoints. The curved guide line reduces the degrees of freedom of the keypoints, increases the grazing angle between the lane and the guide line, and improves the response intensity and aggregation of the keypoints; it is an important foundation for the model's recall of lane markings. 2. A new lane description method based on multiple keypoints (Chain-anchor) is proposed. It addresses the problem that existing lane description methods rely on a large number of human-specified prior assumptions, which leads to poor generalization in complex scenes, and it can further improve network performance by using the
Multi-Reference Point Deformable Cross Attention Module (MRDA) to incorporate position prior information into the network. 3. Two lane IoU calculation methods are proposed: a point-to-point IoU algorithm and a dense sampling IoU algorithm. Point-to-point IoU is a fast algorithm designed specifically for point-set lane representations; it has a small computational cost and supports parallel computation. The dense sampling IoU algorithm is based on discrete sampling and is suitable for any lane representation. It is more accurate than the point-to-point IoU algorithm but has a higher computational cost. Applying an IoU loss in the training phase introduces the idea of optimizing each lane as a whole, which improves the model's training efficiency and inference performance. 4. FPN is used to fuse the multi-scale features extracted by the backbone network, and the above techniques are integrated to design two top-down, end-to-end lane detection networks for different application scenarios: CANet, based on convolutional neural networks, for scenarios that demand high precision and real-time performance, and LDTR, based on Transformers, for scenarios with ample computing power that demand high recall. This thesis implements the CANet and LDTR prototype systems with PyTorch and MMDetection and evaluates them on three well-known large-scale lane detection datasets. The experimental results show that the minimal version of CANet surpasses the previous SOTA model on multiple metrics with only 29% of its computational cost. LDTR achieves higher recall than CANet by using more computing power. |
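
The point-to-point IoU mentioned in item 3 can be illustrated with a minimal sketch. The abstract does not give the formula, so everything below is an assumption made for illustration: both lanes are taken to be sampled at the same image rows, each point is widened into a horizontal segment of a hypothetical `half_width`, and the IoU is the ratio of summed overlaps to summed unions. The function names and the default width are likewise hypothetical and are not taken from the thesis.

```python
from typing import Sequence


def point_to_point_iou(
    xs_a: Sequence[float],
    xs_b: Sequence[float],
    half_width: float = 7.5,  # hypothetical lane half-width in pixels
) -> float:
    """Illustrative IoU between two lanes given as x-coordinates per row.

    Assumption: xs_a[i] and xs_b[i] lie on the same image row, and each
    point stands for the segment [x - half_width, x + half_width].
    """
    if len(xs_a) != len(xs_b):
        raise ValueError("both lanes must be sampled at the same rows")

    total_overlap = 0.0
    total_union = 0.0
    # Each row is independent of the others, so this loop could be
    # replaced by a vectorized (parallel) tensor operation.
    for xa, xb in zip(xs_a, xs_b):
        left = max(xa - half_width, xb - half_width)
        right = min(xa + half_width, xb + half_width)
        overlap = max(0.0, right - left)
        # Two segments of length 2 * half_width, minus the shared part.
        union = 4.0 * half_width - overlap
        total_overlap += overlap
        total_union += union

    return total_overlap / total_union if total_union > 0 else 0.0


def lane_iou_loss(xs_pred: Sequence[float], xs_gt: Sequence[float]) -> float:
    """Loss that decreases as the predicted lane matches the target."""
    return 1.0 - point_to_point_iou(xs_pred, xs_gt)
```

Because the score aggregates every row before dividing, a loss of this form penalizes the lane as a whole instead of each keypoint separately, which is one way to read the "overall optimization" idea in the abstract.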