| With rapid socioeconomic development, the pressure on urban road networks increases year by year, and it is difficult to balance road resources, which cannot be expanded without limit, against growing traffic demand. In the context of traffic big data, identifying and predicting the traffic flow state of the urban road network and analyzing the formation mechanism and characteristics of traffic events can provide traffic managers with optimization objectives and travelers with accurate and reliable travel information. Supported by the National Key Research and Development Program project "Key Technologies of 'One Ticket' Comprehensive Transportation Travel for the Beijing Winter Olympic Games" and drawing on traffic big data, this paper uses statistical analysis, clustering methods and machine learning theory to study in depth the methods for identifying and predicting the traffic flow state. Building on the improved discrimination and prediction performance, it proposes a traffic event portrait generation method based on traffic flow state discrimination and prediction, so as to accurately discriminate and predict abnormal traffic flow states on road sections and comprehensively characterize traffic events. The main contents and achievements are as follows.

(1) Outlier detection and data interpolation of traffic flow based on cluster analysis and random forest

Based on the collected traffic flow data, an outlier detection method combining threshold discrimination and K-means is proposed to accurately identify outliers in the traffic flow data. Compared with the outlier detection of SPSS, the proposed method effectively identifies outlying traffic flow data without distorting the underlying traffic flow pattern. Next, for the detected abnormal data and the data missing from the data set, a missing value interpolation method based on random forest is proposed. Under different data
missing rates, the accuracy of the proposed random forest interpolation method exceeds 85%, and it is more stable and accurate than the other interpolation methods compared.

(2) Traffic flow state automatic discrimination model based on improved SC-Stacking

Combining the advantages of unsupervised and supervised learning algorithms, an automatic traffic flow state discrimination model based on improved SC-Stacking is constructed. First, the traffic flow state is divided into five categories by a spectral clustering algorithm. Second, to further improve the operating efficiency and overall accuracy of the stacking model, the model is improved as a whole: k-fold cross-validation is adopted to prevent overfitting, and a multi-layer stacking structure is adopted to improve accuracy. The accuracy and kappa coefficient of the model are 92.49% and 0.89 on working days, and 91.87% and 0.87 on non-working days. Compared with the naive Bayes, random forest, gradient boosting decision tree and support vector machine models, it substantially reduces error and improves accuracy, and its confusion matrix indicates better robustness and generalization ability.

(3) Traffic flow state prediction method considering spatiotemporal correlation

Starting from the spatiotemporal correlation of traffic flow parameters, and combining wavelet theory (WT-DTW) with deep learning theory (CNN-LSTM), a traffic flow state prediction method considering spatiotemporal correlation is proposed. The proposed WT-DTW method effectively measures the spatial correlation of traffic flow, so that the sample set built for the machine learning algorithm reflects the spatial characteristics of the traffic flow data; the proposed CNN-LSTM traffic flow parameter prediction model reflects the temporal characteristics of traffic flow
data and can therefore predict traffic flow parameters accurately. A case study shows that the accuracy of the proposed traffic flow state prediction method is 78.68%, 85.39%, 88.14% and 82.14% for traffic flow parameter prediction at 5-minute, 15-minute, 30-minute and 60-minute intervals, respectively, which is better than the ARIMA and LSTM methods.

(4) Traffic event portrait generation method based on traffic flow state discrimination and prediction

Based on the proposed automatic traffic flow state discrimination model and the traffic flow state prediction method considering spatiotemporal correlation, traffic events are characterized by the degree of instability of the traffic flow state and by the influence range and duration of the event. A traffic event portrait generation method based on traffic flow state discrimination and prediction is established to depict the spatiotemporal characteristics of the road network traffic flow state. The degree and intensity of instability are used to predict the evolution trend of the road network's traffic flow parameters and to reproduce the traffic event picture. |
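
The discrimination model in part (2) is evaluated by accuracy and by the kappa coefficient. As a reminder of how those two figures relate, the following is a minimal sketch of Cohen's kappa computed from true and predicted state labels. The function name `accuracy_and_kappa` and the ten sample labels are illustrative assumptions, not data or code from this work; the five integer labels simply stand in for the five traffic flow state categories.

```python
from collections import Counter


def accuracy_and_kappa(y_true, y_pred):
    """Return (accuracy, Cohen's kappa) for two equal-length label sequences.

    Hypothetical helper for illustration only; it is not the evaluation
    code used in the study.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(y_true)

    # Observed agreement p_o: share of samples whose predicted state
    # equals the true state (this is the plain accuracy).
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n

    # Chance agreement p_e: for each class, the product of its marginal
    # frequency among true labels and among predicted labels, summed.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)

    if p_e == 1.0:
        # Degenerate case: both sequences contain one identical class.
        return p_o, 1.0

    kappa = (p_o - p_e) / (1.0 - p_e)
    return p_o, kappa


# Made-up labels for five traffic flow states (1 to 5), for illustration.
true_states = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
pred_states = [1, 1, 2, 3, 3, 3, 4, 4, 5, 4]
acc, kappa = accuracy_and_kappa(true_states, pred_states)
print(f"accuracy={acc:.2f} kappa={kappa:.2f}")  # accuracy=0.80 kappa=0.75
```

Kappa discounts the agreement expected by chance, so it is a coefficient on a scale whose upper bound is 1 rather than a percentage, and it is lower than the accuracy whenever the classifier makes errors and some agreement would occur by chance.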