As vehicles become more intelligent, intelligent driving technology represented by advanced driver assistance systems has developed further, and both academia and industry have made great progress in theory and technology. However, the complexity of fully autonomous driving makes it difficult to achieve in the short term, so the transition from traditional driving to human-machine co-driving is an inevitable trend. The driver is the core of the "driver-vehicle-road" closed loop and is also the link with the greatest uncertainty. Relevant studies have shown that most traffic accidents are caused by improper driver operation. If the driving intention and the driving dynamics of surrounding vehicles can be known in advance, the intelligent driving system can provide early warning or cooperate with the driver to control the vehicle, thus avoiding traffic accidents to a large extent.

This study focused on the recognition of driving intention in human-machine co-driving. To address the shortcomings of existing research, an intention recognition model based on the fusion of spatiotemporal features was proposed. The model took videos of the cockpit and of the forward traffic scene as inputs. Furthermore, vehicle trajectory prediction incorporating driving intention recognition was investigated. The proposed methods were compared and validated on the Brain4Cars and Next Generation Simulation (NGSIM) datasets. The main contents of this paper are as follows:

(1) A two-stream network was used to extract the spatiotemporal features of the driver’s behavior in the cockpit for driving intention recognition. Driver behavior is the most direct manifestation of driving intention, but it is difficult for convolutional neural networks (CNNs) to capture the temporal dimension of video. Therefore, this paper extracted spatiotemporal features with a 3D convolution structure. The slow branch in the dual-stream network
was mainly used to extract spatial semantic features, the fast branch was used to capture motion features, and the two branches were fused at each stage. For data processing, a variety of data augmentation methods were used to expand the sample data, and the network was initialized by transfer learning with weights pre-trained on the Kinetics dataset. The results show that the model can effectively identify the driver’s intention, with a recognition accuracy of 81.69% and an F1 score of 80.31%.

(2) Traffic scene data were introduced to enhance the spatiotemporal features extracted by the dual-stream network and further improve the recognition accuracy of driving intention. Driving behavior and traffic scene data were processed by the slow and fast branches, respectively. Considering the difference between the video data inside and outside the cockpit, the temporal resolution of the two branches was adjusted to find a suitable frame rate ratio. A synchronization module was added to generate a new input frame sequence and assign it to the two branches, and a Gated Recurrent Unit (GRU) recognition module was introduced to improve the recognition accuracy. The model was also evaluated with optical flow images as input and compared with a variety of video action analysis algorithms and baseline methods. The results show that the information inside and outside the cockpit was complementary, and the accuracy and F1 score of the model improved greatly, to 91.61% and 91.65%, respectively.

(3) Vehicle trajectory prediction was performed with the assistance of driving intention. Based on the Convolutional Social Long Short-Term Memory (LSTM) network, a self-attention mechanism was introduced to extract the temporal correlation of historical trajectories. A mean pooling layer was added to optimize the convolutional interaction module and further extract spatial interaction features. The target vehicle trajectory was predicted based on the NGSIM dataset
and compared with other models, and the prediction accuracy was improved. In addition, relevant data were collected with a driving simulator. Multi-modal trajectory prediction was then performed in combination with the intention recognition probability vector output by the dual-stream network, so as to further improve the prediction accuracy.
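
The frame rate ratio between the slow and fast branches mentioned in (1) and (2) can be illustrated with a minimal sketch. It only shows how two branches might sample the same clip at different temporal resolutions; the function name `sample_branch_indices` and the parameters `slow_stride` and `ratio` are hypothetical, and the values used are illustrative assumptions, not the settings reported in this paper.

```python
def sample_branch_indices(num_frames, slow_stride, ratio):
    """Return (slow, fast) frame indices for one clip.

    slow_stride: assumed temporal stride of the slow branch (hypothetical).
    ratio: assumed frame rate ratio, i.e. how many frames the fast branch
           reads for every frame read by the slow branch (hypothetical).
    """
    if slow_stride % ratio != 0:
        raise ValueError("slow_stride must be a multiple of ratio")
    fast_stride = slow_stride // ratio
    # Slow branch: sparse sampling, intended for spatial semantic features.
    slow = list(range(0, num_frames, slow_stride))
    # Fast branch: dense sampling, intended for motion features.
    fast = list(range(0, num_frames, fast_stride))
    return slow, fast


slow_idx, fast_idx = sample_branch_indices(num_frames=32, slow_stride=8, ratio=4)
print(len(slow_idx), len(fast_idx))  # prints: 4 16
```

With these assumed values, the fast branch sees four times as many frames as the slow branch. Searching for a suitable frame rate ratio then amounts to varying `ratio` and comparing the recognition accuracy.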