| Big data analysis based on artificial intelligence, machine learning, data mining, and other intelligent computing techniques has become a research hotspot, and using big data technology to improve traffic management is a new concept and practice in building smart cities. Traffic flow forecasting is key to building intelligent transportation systems and providing real-time traffic applications; in particular, accurate forecasts help in developing reliable traffic management and control strategies. Based on the Spark distributed computing platform, this paper uses large-scale mobile trajectory data (taxi GPS trajectory data) to study traffic flow forecasting methods and their applications. The aim is to improve the accuracy, robustness, and scalability of traffic flow forecasting and to provide a theoretical basis and technical support for dynamic monitoring and early-warning control of complex transportation networks. The main contributions of this paper are summarized as follows: 1. Data preprocessing. To address the computation and storage problems that arise when large-scale mobile trajectory data are processed on a centralized mining platform, distributed storage and parallel computing of the trajectory data are implemented on the Spark distributed computing platform. Next, to reduce the effect on prediction accuracy of differences between the trajectory data and the original data, resilient distributed datasets (RDDs) are used for data reading, sorting, statistics, integration, and storage, which together realize the traffic flow data preprocessing. Finally, a Kalman filter (KF) is used to smooth the large-scale mobile trajectory data. 2. Traffic flow forecasting based on a distributed SW-BiLSTM model. To address the computation and storage problems of the traditional bidirectional long short-term memory (BiLSTM) neural network model when processing traffic big data on a centralized mining platform, a Spark-based weighted bidirectional long short-term memory (SW-BiLSTM) model is presented to improve the robustness and accuracy of traffic flow forecasting. Specifically, a distributed SW-BiLSTM model is built on Spark, using the normal distribution to weight the degree of mutual influence between adjacent road segments. Next, the weighted traffic flow is fed into the BiLSTM model for training by time window, which captures both past and future traffic flow information. Finally, the distributed SW-BiLSTM model is used to forecast traffic flow from real-world taxi GPS trajectories. 3. Traffic flow forecasting based on the parallel NAW-DBLSTM algorithm. To address the failure to account effectively for the spatial correlation among road segments and the difficulty of capturing the nonlinear characteristics of traffic flow, a parallel normal distribution and attention mechanism weighted deep bidirectional long short-term memory (NAW-DBLSTM) algorithm on Spark is presented. Specifically, the parallel NAW-DBLSTM algorithm is proposed on the Spark distributed computing platform to enhance the accuracy and scalability of traffic flow forecasting, combining the attention mechanism with the normal distribution to optimize the DBLSTM. Next, a time window is used for traffic flow forecasting. Finally, the parallel NAW-DBLSTM algorithm is used on the Spark framework to predict traffic flow from the trajectory data. |
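
The Kalman filter smoothing step in the data preprocessing stage can be illustrated with a minimal single-machine sketch. The abstract does not give the state model or noise settings, so the random-walk state model, the parameter names `process_var` and `measurement_var`, their default values, and the sample counts below are all assumptions made for illustration; the distributed Spark implementation is not reproduced here.

```python
from typing import List


def kalman_smooth(measurements: List[float],
                  process_var: float = 1.0,
                  measurement_var: float = 25.0) -> List[float]:
    """Scalar Kalman filter over a traffic flow series.

    Assumes a random-walk state model (the true flow changes slowly
    between intervals); both variances are hypothetical tuning values.
    """
    if not measurements:
        return []
    estimate = measurements[0]    # initial state: first observation
    error_var = measurement_var   # initial uncertainty of that state
    smoothed = [estimate]
    for z in measurements[1:]:
        # Predict: state carried over unchanged, uncertainty grows.
        error_var += process_var
        # Update: blend prediction and observation using the Kalman gain.
        gain = error_var / (error_var + measurement_var)
        estimate += gain * (z - estimate)
        error_var *= (1.0 - gain)
        smoothed.append(estimate)
    return smoothed


# Hypothetical vehicle counts per interval on one road segment,
# with one outlier at index 2.
counts = [50.0, 52.0, 90.0, 51.0, 49.0, 53.0]
filtered = kalman_smooth(counts)
```

With a larger `measurement_var` the filter trusts individual observations less, so isolated spikes such as the one at index 2 are damped more strongly.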
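
The normal-distribution weighting of adjacent road segments and the time-window step can be sketched in the same spirit. The abstract does not specify how the weights are computed, so this sketch assumes each segment is weighted by a Gaussian function of its hop distance from the target segment, with weights normalized to sum to 1. The helper names (`normal_weights`, `weighted_flow`, `time_windows`), the value of `sigma`, the window length, and the sample flows are hypothetical; the BiLSTM training itself is omitted.

```python
import math
from typing import List, Tuple


def normal_weights(distances: List[float], sigma: float = 1.0) -> List[float]:
    """Gaussian weight per segment by distance from the target, normalized."""
    raw = [math.exp(-(d * d) / (2.0 * sigma * sigma)) for d in distances]
    total = sum(raw)
    return [w / total for w in raw]


def weighted_flow(flows: List[List[float]], weights: List[float]) -> List[float]:
    """Combine per-segment series into one weighted series.

    flows[i][t] is the flow of segment i at time step t.
    """
    steps = len(flows[0])
    return [sum(w * seg[t] for w, seg in zip(weights, flows))
            for t in range(steps)]


def time_windows(series: List[float],
                 window: int) -> List[Tuple[List[float], float]]:
    """Slice a series into (input window, next value) training pairs."""
    return [(series[i:i + window], series[i + window])
            for i in range(len(series) - window)]


# Hypothetical example: upstream neighbor, target segment, downstream neighbor.
distances = [1.0, 0.0, 1.0]
flows = [
    [40.0, 42.0, 45.0, 47.0, 44.0, 41.0],  # upstream
    [50.0, 53.0, 55.0, 58.0, 54.0, 51.0],  # target
    [30.0, 31.0, 33.0, 36.0, 34.0, 32.0],  # downstream
]
weights = normal_weights(distances)
series = weighted_flow(flows, weights)
samples = time_windows(series, window=3)
```

Each element of `samples` pairs three consecutive weighted flow values with the value that follows them, which is the form a sequence model would consume for one-step-ahead forecasting.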