| With the development of the national economy, car ownership in China has reached 360 million. This rapid growth in traffic volume has reduced the operating capacity of the highway network, raised the incidence of traffic accidents, increased urban environmental pollution, and complicated traffic operation management. Timely and accurate traffic flow forecasting is one of the core topics in Intelligent Transportation Systems (ITS); it helps traffic control departments restrict and guide outbound traffic flow in advance and thereby improve travel efficiency. Existing research suffers from incomplete extraction of the temporal and spatial features of traffic flow, low prediction accuracy, low computational efficiency on massive data samples, and complex parameter training. This paper therefore proposes a highway traffic flow prediction model based on distributed in-memory computing. The main work and innovations are as follows. First, we analyze the raw highway traffic flow data and complete the data preprocessing, including data filling, data conversion, and data reduction. Second, to analyze how the temporal and spatial characteristics of highway traffic flow influence the prediction results, we propose a method for constructing feature vectors of the spatio-temporal relationships of traffic flow and use it to generate the spatio-temporal feature vectors of highway traffic flow. Third, to predict highway traffic flow accurately, we present an extreme gradient boosting model optimized by Bayesian optimization (Bayesian Optimization eXtreme Gradient Boosting, BO-XGBoost) that takes temporal and spatial characteristics into account. The spatio-temporal feature vectors are input to the model, and multiple trees are generated by repeatedly splitting on the features to obtain the optimal solution of the model. To avoid overfitting caused by inappropriate parameter values, the Bayesian Optimization (BO) algorithm, a global optimization method, is used to tune the model's important parameters. Finally, to reduce the computation time of the BO-XGBoost model during parameter optimization and construction of the optimal tree model, we propose a Spark-based optimized extreme gradient boosting model for highway traffic flow prediction (Spark-BO-XGBoost). By deploying the BO-XGBoost model on the Spark distributed computing platform, it parallelizes parameter optimization and optimal tree model generation. Experimental results show that the BO-XGBoost model that considers spatio-temporal relationships has higher prediction accuracy and better overall performance, and that the Spark-parallelized BO-XGBoost model achieves good speedup and scalability with significantly improved computational efficiency. The proposed method thus improves the computational efficiency of the model while maintaining the prediction accuracy of highway traffic flow. |
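
The abstract does not spell out how the spatio-temporal feature vectors are built, so the following is only a minimal sketch of one plausible construction, not the paper's method. It assumes that each sample concatenates the most recent flow values of the target station (temporal part) with the same window from neighboring stations (spatial part). The function name `build_spatiotemporal_samples`, the station names, the lag count, and all flow values are hypothetical.

```python
from typing import Dict, List, Tuple


def build_spatiotemporal_samples(
    flows: Dict[str, List[float]],
    target: str,
    neighbors: List[str],
    n_lags: int,
) -> Tuple[List[List[float]], List[float]]:
    """Build (feature vector, label) pairs for one target station.

    Assumption: all stations share the same time axis and series length.
    Each feature vector holds the last `n_lags` flows of the target station
    followed by the last `n_lags` flows of every neighbor, oldest first.
    The label is the target station's flow in the next interval.
    """
    series = flows[target]
    X: List[List[float]] = []
    y: List[float] = []
    for t in range(n_lags, len(series)):
        row = list(series[t - n_lags:t])            # temporal lags of the target
        for name in neighbors:
            row.extend(flows[name][t - n_lags:t])   # spatial lags of a neighbor
        X.append(row)
        y.append(series[t])
    return X, y


# Made-up flow counts for three hypothetical stations over five intervals.
flows = {
    "S1": [10.0, 12.0, 15.0, 14.0, 18.0],
    "S2": [7.0, 9.0, 11.0, 13.0, 12.0],
    "S3": [20.0, 21.0, 19.0, 22.0, 25.0],
}
X, y = build_spatiotemporal_samples(
    flows, target="S2", neighbors=["S1", "S3"], n_lags=2
)
```

Under these assumptions, each row of `X` has `n_lags * (1 + len(neighbors))` entries, and `X` and `y` could serve as the training input for a gradient boosting regressor whose hyperparameters are then tuned by Bayesian optimization.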