Lane detection, as one of the important and indispensable functions in environment perception, plays a key role in autonomous driving. It has been applied in various advanced driver assistance systems (ADAS), where it supports functions such as lane departure warning (LDW) and lane keeping (LK). Research on lane detection is therefore of great significance for autonomous driving. Existing lane detection models suffer from two main shortcomings: the large amount of memory occupied by their weight parameters, and low detection accuracy in challenging driving scenes. In addition, a real-time lane detection model must balance the competing demands of weight parameter size, detection accuracy, and running speed. To address these problems, this thesis constructs lane detection models from the perspectives of spatial information, spatio-temporal information, and the fusion of spatio-temporal information with attention. The contributions of this thesis comprise the following five parts:

(1) In the first part (chapter two), we propose a lane detection model based on vertical spatial convolutions. In the encoder, a pair of convolutions increases the number of channels of the feature maps while reducing the number of network parameters of the proposed model. A combination module then compresses redundant spatial information into a valid and compact feature representation. Finally, a group of vertical spatial convolution blocks and efficient residual modules helps the model obtain more effective global context information about lane lines, which the subsequent network layers use to detect lanes more accurately in challenging scenarios. We verify the performance of the proposed model on two popular and diverse lane detection benchmarks, TuSimple and CULane. Extensive experimental results show that our model outperforms state-of-the-art lane detection models.

(2) In the second part (chapter two), considering the possibility of deploying lane detection models on mobile platforms, a lane detection model must strike a balance among running speed, detection accuracy, weight parameter size, and robustness. To this end, we combine the strengths of residual blocks and U-Net to build a lightweight and robust network (LRNet) for lane detection. A comprehensive set of experiments on the public large-scale TuSimple lane marking challenge dataset and the Unsupervised Labeled Lane Markers (LLAMAS) dataset shows that (a) the resulting algorithm achieves impressive results in complex and challenging scenes, with high accuracy, fast speed, and good robustness, outperforming other state-of-the-art lane detection methods; (b) the proposed model achieves its best lane detection results only when a single residual block replaces one standard convolution in the last downsampling stage of the encoder in the backbone network; and (c) while maintaining high accuracy, the weight parameters of the proposed model occupy only 3.15 MB of memory.

(3) Lane detection models based on spatial information have achieved promising results, but attaining high lane detection accuracy in challenging scenarios remains an open research question. The temporal continuity of an image sequence can be associated with the lane line features distributed in the spatial information, so the importance of temporal information cannot be ignored when designing lane detection models. The third part (in chapter three) builds a lane detection model based on global spatio-temporal network information. The proposed model uses dilated convolutions to enlarge its receptive fields while extracting lane information, capturing richer lane line features, and its skip connections supply necessary complements for lane line information that might otherwise be lost. In addition, spatio-temporal networks further enhance the model's ability to extract efficient features by processing spatial and temporal information with convolutional gated recurrent units (ConvGRUs). Extensive experiments verify that our model outperforms state-of-the-art algorithms while improving robustness and reducing weight parameter size, achieving 81.35% on the Dynamic Vision Sensor Dataset (DET) and 73.0% on CULane.

(4) Global features contain the high-level semantic information of lane lines, while local features contain more detailed low-level information; before the final lane lines are predicted, local features should also be remembered and used. To improve lane detection in challenging scenes, the fourth part (chapter four) proposes a spatio-temporal network with double ConvGRUs. The two ConvGRUs have the same structure but different locations and functions in the network: one extracts the most likely low-level features of lane markings, which are concatenated with the outputs of several blocks and fed into the next layer of the end-to-end network, while the other takes continuous frames as input to process spatio-temporal driving information. Extensive experiments on the large-scale TuSimple lane marking challenge dataset and the Unsupervised LLAMAS dataset demonstrate that the proposed model detects lanes in challenging driving scenes more effectively and outperforms state-of-the-art lane detection models.

(5) In the fifth part (chapter five), we note that ConvGRUs can learn spatio-temporal feature information from video or continuous frames, but how to make them consciously learn lane feature information has rarely been explored. To address this problem, we apply triplet attention to ConvGRUs both internally and externally to build a novel module named TARConvGRU, and then propose a lane detection model based on it. As a variant of the ConvGRU, TARConvGRU helps the proposed model extract and memorize lane features more effectively and efficiently in several challenging scenes. We validate the effectiveness of the proposed model on three popular lane detection benchmarks: TuSimple, Unsupervised LLAMAS, and DET. The experimental results demonstrate that our model achieves competitive results compared with other state-of-the-art models.

In summary, this thesis studies the visual image features of lane lines in driving scenes from the perspectives of spatial information, spatio-temporal information, and attention. A series of lane detection models is presented using multi-dimensional, multi-scale, and multi-feature analysis methods, providing technical support for the environmental perception of autonomous vehicles. Moreover, the high-precision, lightweight, real-time, and robust lane detection models proposed in this work can be effectively transplanted to mobile development platforms to improve the driving safety of self-driving vehicles.
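The ConvGRU update that recurs in parts (3)–(5) can be illustrated with a minimal sketch. The toy below is not the thesis code: it uses a 1-D single-channel signal and hand-set hypothetical kernels purely to make the gate equations concrete, whereas the actual models apply 2-D multi-channel convolutions to feature maps.

```python
import math

# Minimal 1-D toy of one ConvGRU step (illustrative sketch only).
# Gate equations, with * denoting convolution and (.) element-wise product:
#   z_t = sigmoid(Wzx * x_t + Wzh * h_{t-1})      update gate
#   r_t = sigmoid(Wrx * x_t + Wrh * h_{t-1})      reset gate
#   c_t = tanh(Whx * x_t + Whh * (r_t (.) h_{t-1}))  candidate state
#   h_t = (1 - z_t) (.) h_{t-1} + z_t (.) c_t

def conv1d(signal, kernel):
    """'Same' 1-D sliding-window filter with zero padding."""
    k = len(kernel)
    pad = k // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [sum(kernel[j] * padded[i + j] for j in range(k))
            for i in range(len(signal))]

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def convgru_step(x, h, wz_x, wz_h, wr_x, wr_h, wh_x, wh_h):
    """One ConvGRU update; returns the new hidden state h_t."""
    z = [sigmoid(a + b) for a, b in zip(conv1d(x, wz_x), conv1d(h, wz_h))]
    r = [sigmoid(a + b) for a, b in zip(conv1d(x, wr_x), conv1d(h, wr_h))]
    rh = [ri * hi for ri, hi in zip(r, h)]
    c = [math.tanh(a + b) for a, b in zip(conv1d(x, wh_x), conv1d(rh, wh_h))]
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, c)]

# Hypothetical 3-tap kernels (hand-set for illustration, not learned).
k = [0.1, 0.5, 0.1]
h = [0.0] * 8                        # initial hidden state
frames = [[0, 0, 1, 1, 1, 0, 0, 0],  # "lane" evidence drifting rightwards
          [0, 0, 0, 1, 1, 1, 0, 0],
          [0, 0, 0, 0, 1, 1, 1, 0]]
for x in frames:                     # accumulate state over the sequence
    h = convgru_step(x, h, k, k, k, k, k, k)
print([round(v, 3) for v in h])
```

After the three frames, the hidden state is largest where evidence appeared repeatedly, which is the behaviour the thesis exploits: the recurrent state carries lane evidence across frames so that detection in a challenging frame can draw on its temporal neighbours.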