| Time series data in the field of transportation are data collected sequentially over a period of time. They are typically used to objectively describe and record how a thing or phenomenon evolves over time while a vehicle is being driven. Analyzing time series data in this field can uncover the patterns of change hidden in the data, thereby contributing significantly to harmonious traffic, congestion avoidance, and safe driving. Time series forecasting is the main method for analyzing time series data. It predicts the development trend of a phenomenon or thing by mining the patterns of change hidden in the data and constructing a forecasting model. Therefore, how to construct a time series prediction model for the transportation field has important practical significance and research value. When constructing a time series prediction model, two aspects usually need to be considered. First, because time series data are large in volume, varied in type, and reflect real-world changes, predictive models usually must be able to train on large-scale datasets, and online learning is one of the techniques for learning from large amounts of data. Second, data collected in a non-stationary environment are affected by the external environment, so the distribution of the data may change over time; that is, concept drift occurs. Ensemble learning, as an auxiliary framework, offers an important avenue for solving the concept drift problem. To address the problems of massive data volume and concept drift in time series prediction, this paper takes time series data collected in the field of transportation as its basis and studies how to build a time series prediction model based on online ensemble learning by combining online learning with ensemble learning. The main work and contributions of 
this paper are as follows: 1. An online ensemble model based on nonparametric kernel smoothing is proposed. Online learning is a technique that effectively improves the space efficiency of machine learning algorithms, and ensemble learning, as one of the techniques for combining algorithms, is widely used to optimize model performance. Because online learning does not permit parameter selection through retraining, an online ensemble regression framework based on nonparametric kernel smoothing is proposed. First, a topology learning neural network is introduced, and the topological neural network is converted into a feedforward neural network by improving the kernel density regression method; the corresponding regression expression is derived. Then, a maximum likelihood procedure is designed for adaptive parameter selection in the regression model. Finally, by incorporating the weighted training strategy of ensemble learning, the performance of the regression prediction model with topology learning is improved. Experimental results on UCI datasets and traffic flow datasets show that the proposed method can increase prediction accuracy by 45.27% and 54.29%, respectively. 2. An incremental regression prediction model based on a classification-type loss function is proposed. Data collected in a non-stationary environment are affected by the external environment and change in distribution; that is, concept drift occurs. Different non-stationary environments exhibit different kinds of concept drift, including sudden, rapid, gradual, or periodic changes, possibly with different rates of change. Consequently, a fixed model type and fixed parameter settings cause the performance of traditional time series prediction methods to decline gradually. To address the prediction difficulty caused by concept drift, an incremental regression model for concept drift environments is proposed, which handles data distribution changes in 
non-stationary environments. The model first converts the regression task of time series prediction into a binary classification task; second, it constructs a classification-type loss function for incremental learning and ensemble learning based on this transformation; finally, the incremental regression model is obtained by formulating a gradually updated separating hyperplane. Experimental results show that this method performs more stably than existing incremental and ensemble regression methods and can improve prediction accuracy by 2.89% to 53.41%. 3. An ensemble location prediction model based on online transfer regression is proposed. In time series data, consecutive points do not necessarily correspond to adjacent time instants; that is, data may be missing. Especially in GPS-based acquisition, because of differences in acquisition frequency, accuracy of the acquisition device, base station signal strength, and GPS signal strength, the observed values of position sequence data on the time scale may be inaccurate or even missing. To address the loss of position time series data caused by GPS signal outages, a transfer regression model for non-stationary environments is proposed. The model first fuses GPS data with auxiliary vehicle driving data before training; then, during periods of missing data, transfer learning is used to reduce the weights of training samples that are unhelpful for the current situation; finally, an online transfer regression model is obtained by establishing a classification-type loss function for ensemble regression learning. Experiments on a real vehicle position dataset show that the proposed method improves prediction accuracy by 13.47% to 61.51% compared with existing methods. 4. An online ensemble LSTM prediction model based on an adaptive classification-type weighting strategy is proposed. A single long short-term memory (LSTM) neural network, due to its 
special network structure, has many network parameters to be determined during training. Moreover, within each time step, the LSTM memory cell is quickly modified and updated to dynamically adjust the internal network. To address the large number of parameters to optimize for a single LSTM and the difficulty of capturing its internal changes, an online ensemble LSTM prediction model based on an adaptive weighting strategy is proposed. The model first constructs a classification-type loss function for regression tasks based on the idea of virtual binary classification; then, to capture the changes of the base LSTM model at each time step, an adaptive classification-type weighting strategy is designed to obtain an LSTM-based online ensemble prediction model. Experimental results on traffic speed prediction show that the proposed model improves on a single LSTM model by 35.13% to 48.02% and on the existing ensemble LSTM method by 14.5% to 28.45%. |
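
The abstract does not state the formulas behind these contributions, so the following Python sketch only illustrates the general idea of combining kernel smoothing with a classification-type ensemble weighting. Everything in it is an assumption made for illustration: the base learner is a standard Nadaraya-Watson smoother with a Gaussian kernel over a sliding window (not the topology learning network of the paper), a prediction counts as "misclassified" when its absolute error exceeds a tolerance `tol`, and misclassified learners have their weight multiplied by `beta`. The names `KernelSmoother`, `OnlineEnsemble`, `window`, `tol`, and `beta` are hypothetical.

```python
import numpy as np


class KernelSmoother:
    """Nadaraya-Watson regressor over a sliding window (illustrative base learner)."""

    def __init__(self, bandwidth, window=50):
        self.bandwidth = bandwidth
        self.window = window
        self.X, self.y = [], []

    def predict(self, x):
        if not self.y:
            return 0.0  # nothing observed yet
        X = np.asarray(self.X)
        y = np.asarray(self.y)
        d2 = np.sum((X - np.asarray(x, dtype=float)) ** 2, axis=1)
        k = np.exp(-0.5 * d2 / self.bandwidth ** 2)  # Gaussian kernel weights
        total = k.sum()
        # Fall back to the window mean if every kernel weight underflows to zero.
        return float(k @ y / total) if total > 0 else float(y.mean())

    def update(self, x, y):
        # Online update: store the sample and keep only the most recent ones.
        self.X.append(np.asarray(x, dtype=float))
        self.y.append(float(y))
        self.X = self.X[-self.window:]
        self.y = self.y[-self.window:]


class OnlineEnsemble:
    """Weighted ensemble with an assumed classification-type weight update."""

    def __init__(self, learners, tol=0.1, beta=0.5):
        self.learners = learners
        self.tol = tol    # error tolerance that defines the virtual binary label
        self.beta = beta  # multiplicative penalty for a "misclassified" learner
        self.weights = np.full(len(learners), 1.0 / len(learners))

    def predict(self, x):
        preds = np.array([m.predict(x) for m in self.learners])
        return float(self.weights @ preds)

    def update(self, x, y):
        preds = np.array([m.predict(x) for m in self.learners])
        wrong = np.abs(preds - y) > self.tol  # regression error -> binary outcome
        self.weights = self.weights * np.where(wrong, self.beta, 1.0)
        self.weights /= self.weights.sum()
        for m in self.learners:
            m.update(x, y)


# Toy usage on a synthetic series (not the traffic data used in the paper).
series = np.sin(0.1 * np.arange(200))
ensemble = OnlineEnsemble([KernelSmoother(b) for b in (0.1, 1.0, 10.0)])
for t in range(1, len(series)):
    x = [series[t - 1]]             # previous value as the only feature
    forecast = ensemble.predict(x)  # predict before the true value arrives
    ensemble.update(x, series[t])   # then learn from it
```

Under concept drift, a learner that starts missing the tolerance loses weight within a few steps, while the sliding window lets each base learner forget outdated samples. Both mechanisms are stand-ins for the adaptive strategies the abstract describes, not reproductions of them.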