| With accelerating urbanization, a growing urban population, and an increasing number of motor vehicles, travel has become an important issue. A growing number of Internet companies and research institutions, such as Baidu Maps, Gaode Maps, and Didi Chuxing, have invested substantial human, material, and financial resources in studying route planning and road congestion, and there is now a large body of research on the spatiotemporal prediction of road traffic state. The dataset used in this paper is the urban road traffic state dataset for Xi’an in July 2019 provided by Didi. The dataset is large and high-dimensional, and its class distribution is imbalanced. Using it directly leads to long training times and to models that do not fit in the memory of an ordinary machine. Moreover, because we care most about whether minority-class samples can be predicted accurately, the effect of class imbalance on the results is especially pronounced. If a model is trained on such data without any preprocessing, it tends to predict the majority class, and minority-class samples have a high probability of being misclassified. We therefore resample the dataset so that the classes are as balanced as possible. To reflect how accurately minority-class samples are predicted, we use the F1 score to measure model performance. This paper studies spatiotemporal prediction algorithms for road traffic state from the following three aspects. First, new feature combinations are selected for experiments. Experiments were carried out on both the original dataset without any feature extraction and the dataset after feature extraction, and the results were analyzed. It was found that the model's prediction accuracy was higher when the
dataset after feature extraction was used to train the model under the same conditions. Second, different algorithms are combined through model fusion. Using the preprocessed data, we train traditional machine learning models (XGBoost and LightGBM) and deep learning models (a CNN and a ResNet) with 5-fold cross-validation. After each fold is trained, the model predicts on both the validation fold and the test set. Finally, a gradient boosting decision tree (GBDT) model is trained on the validation-set predictions from the previous stage and used to predict the test set. The experimental results were measured with two evaluation metrics, F1 and accuracy. The results showed that model fusion predicted better than any single model under both F1 and accuracy. Third, the fruit fly optimization algorithm is used to tune the learning_rate parameter of LightGBM and XGBoost. To show that the fruit fly optimization algorithm tunes this parameter more effectively, we run iterative experiments with both the fruit fly optimization algorithm and the particle swarm optimization algorithm on the learning_rate parameter of the XGBoost and LightGBM models to find the best value. The experimental comparison shows that the parameters found by the fruit fly optimization algorithm yield higher prediction accuracy than those found by particle swarm optimization. |
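
The resampling and evaluation steps described above can be illustrated with a minimal sketch. The abstract does not state which sampling scheme or which F1 averaging was used, so random oversampling and macro-averaged F1 are assumptions here, and the function names `random_oversample` and `macro_f1` are hypothetical.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until every class matches the largest one.

    Assumption: the paper only says the data are resampled; oversampling is one option.
    """
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        out_rows.extend(group + extra)
        out_labels.extend([label] * target)
    return out_rows, out_labels


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so each class counts equally (assumed averaging)."""
    scores = []
    for cls in sorted(set(y_true)):
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# A predictor that always outputs the majority class:
# accuracy is 0.8, but macro F1 is only 4/9 (about 0.44).
y_true = [0, 0, 0, 0, 1]
y_pred = [0, 0, 0, 0, 0]
print(macro_f1(y_true, y_pred))
```

The toy example shows why accuracy alone can hide poor minority-class performance, which is the motivation the abstract gives for reporting F1.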
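
The model fusion procedure (base models trained with 5-fold cross-validation, then a second-stage model trained on their validation-set predictions) can be sketched in plain Python. This is only an illustration of the out-of-fold mechanics under stated assumptions: the toy learners `fit_centroid` and `fit_majority` stand in for XGBoost, LightGBM, the CNN, and the ResNet; the lookup-table learner `fit_lookup` stands in for the GBDT; and combining the five fold models' test predictions by majority vote is an assumption, since the abstract does not say how they are combined.

```python
from collections import Counter, defaultdict


def kfold_indices(n, k=5):
    """Split range(n) into k contiguous validation folds (no shuffling, for brevity)."""
    bounds = [i * n // k for i in range(k + 1)]
    return [list(range(bounds[i], bounds[i + 1])) for i in range(k)]


def out_of_fold(fit, X, y, X_test, k=5):
    """Return validation-fold predictions for every training row, plus the
    majority vote of the k fold models on each test row."""
    oof = [None] * len(X)
    test_votes = [[] for _ in X_test]
    for val_idx in kfold_indices(len(X), k):
        held_out = set(val_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        predict = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        for i in val_idx:
            oof[i] = predict(X[i])
        for votes, row in zip(test_votes, X_test):
            votes.append(predict(row))
    return oof, [Counter(v).most_common(1)[0][0] for v in test_votes]


def fit_centroid(feature):
    """Toy base learner factory: nearest class mean on a single feature."""
    def fit(X, y):
        sums, counts = defaultdict(float), Counter(y)
        for row, label in zip(X, y):
            sums[label] += row[feature]
        means = {c: sums[c] / counts[c] for c in counts}
        return lambda row: min(means, key=lambda c: abs(row[feature] - means[c]))
    return fit


def fit_majority(X, y):
    """Toy base learner: always predict the most common training label."""
    label = Counter(y).most_common(1)[0][0]
    return lambda row: label


def fit_lookup(Z, y):
    """Toy second-stage learner: majority label per combination of base predictions."""
    table = defaultdict(Counter)
    for z, label in zip(Z, y):
        table[tuple(z)][label] += 1
    default = Counter(y).most_common(1)[0][0]
    return lambda z: table[z].most_common(1)[0][0] if z in table else default


def stack(base_fits, X, y, X_test, k=5):
    """Train the second-stage learner on out-of-fold predictions, then predict the test set."""
    columns = [out_of_fold(fit, X, y, X_test, k) for fit in base_fits]
    Z_train = list(zip(*(oof for oof, _ in columns)))
    Z_test = list(zip(*(test for _, test in columns)))
    meta = fit_lookup(Z_train, y)
    return [meta(z) for z in Z_test]


# Imbalanced toy data: every fourth row is the minority class.
y = [int(i % 4 == 3) for i in range(20)]
X = [[10 * label + i % 3] for i, label in enumerate(y)]
print(stack([fit_centroid(0), fit_majority], X, y, [[1], [11]]))
```

Because the second-stage learner only ever sees predictions made on rows the base models were not trained on, it can learn which base model to trust without being misled by their training-set fit.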
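
The learning_rate search can likewise be sketched as a basic fruit fly optimization loop. This is a minimal sketch and not the authors' implementation: the swarm size, step size, iteration count, and initial range are assumptions, and `toy_validation_score` is a hypothetical stand-in for training XGBoost or LightGBM with a given learning_rate and returning its validation score.

```python
import math
import random


def fruit_fly_optimize(score, n_flies=20, n_iter=100, step=1.0, seed=0):
    """Maximize score(s) over s > 0 with a basic fruit fly optimization loop."""
    rng = random.Random(seed)
    # Swarm location in the 2-D search plane (the initial range is an assumption).
    x_axis, y_axis = rng.uniform(0, 10), rng.uniform(0, 10)
    best_x, best_y = x_axis, y_axis
    best_value, best_score = None, -math.inf
    for _ in range(n_iter):
        for _ in range(n_flies):
            # Each fly searches in a random direction around the swarm location.
            x = x_axis + rng.uniform(-step, step)
            y = y_axis + rng.uniform(-step, step)
            # Candidate value: reciprocal of the fly's distance to the origin.
            s = 1.0 / max(math.hypot(x, y), 1e-12)
            fitness = score(s)
            if fitness > best_score:
                best_value, best_score = s, fitness
                best_x, best_y = x, y
        # The swarm moves to the best position found so far.
        x_axis, y_axis = best_x, best_y
    return best_value, best_score


def toy_validation_score(learning_rate):
    """Hypothetical objective with its peak at learning_rate = 0.1."""
    return -(learning_rate - 0.1) ** 2


best_lr, best_score = fruit_fly_optimize(toy_validation_score)
print(best_lr, best_score)
```

Since each candidate is the reciprocal of a distance, it is always positive, which suits a parameter such as learning_rate. In a real experiment each call to `score` would be a full model training run, so the number of flies and iterations directly sets the tuning cost.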