
Multi-dimension Time Series Modeling and Forecasting Analysis

Posted on: 2015-02-17  Degree: Master  Type: Thesis
Country: China  Candidate: Y Q Yang  Full Text: PDF
GTID: 2268330425488900  Subject: Computer Science and Technology
Abstract/Summary:
Real production processes generate large volumes of time series data, and time series analysis can provide a basis for decisions in production activities. This paper presents solutions to the following problems:

1. Learning and forecasting for high-dimensional time series data with missing values. A forecasting method named WFDynaMMo is proposed, based on frequency-feature clustering and the DynaMMo algorithm. First, the high-dimensional time series data are denoised with a wavelet transform. Second, the frequency features of the denoised data are extracted with the Discrete Fourier Transform (DFT) and a window method. Third, the original input time series are clustered according to the frequency correlation among the dimensions of the denoised data. Finally, each cluster is analyzed independently with DynaMMo. Experiments on simulated and real-world time series data indicate that a missing-value rate of no more than 10% has little effect on the accuracy of frequency-correlation clustering, and that WFDynaMMo achieves higher precision than the DynaMMo algorithm.

2. Analysis of massive distributed time series. Two analysis algorithms, MR-LDS and MSAX-MR-LM, are proposed.

MR-LDS is an improved parallel analysis algorithm for multiple time series, based on the MapReduce computing model. First, the Mapper nodes cut the massive time series into independent subsequences, and the Reducer nodes analyze all the subsequences independently. Second, the outputs of all Reducers are combined into new global model parameters. Finally, real-world data sets are used to validate the effectiveness of the method.

MSAX-MR-LM is another method for massive distributed time series analysis, based on MapReduce and a statistical language model. First, the real-world multiple time series data are converted into a symbolic representation with the multiple time series symbolic method MSAX. Second, the symbolic representation is modeled with a statistical language model. Experiments on real data sets show that the method meets the needs of massive time series analysis and supports real-time analysis of the sequences well.

Modeling experiments with MR-LDS and MSAX-MR-LM on real data sets lead to the following conclusions:

1) Accuracy on large data sets
(a) MR-LDS obtains higher accuracy on smaller data sets (time < 10^4).
(b) MSAX-MR-LM obtains higher accuracy on larger data sets (time > 10^7).
2) Analysis speed on large data sets
(a) The analysis time of MR-LDS grows exponentially with data set size.
(b) The analysis time of MSAX-MR-LM grows linearly with data set size.
3) Real-time analysis
(a) The model parameters learned by MR-LDS cannot be reused on new (real-time) data, which leads to repeated computation, so real-time learning, modeling, and analysis cannot be achieved.
(b) The statistical language model learned by MSAX-MR-LM can be reused on new (real-time) data, so real-time learning, modeling, and analysis can be achieved.
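
The abstract does not give the details of MSAX or of the statistical language model, so the following is only a minimal sketch of the general idea, under stated assumptions: plain univariate SAX (z-normalization, piecewise aggregate approximation, Gaussian breakpoints for a 4-symbol alphabet) stands in for MSAX, and a bigram model with add-one smoothing stands in for the language model. The names `sax`, `BigramModel`, `update`, `merge`, and `predict` are hypothetical and do not come from the thesis. The sketch illustrates why a count-based model can be reused when new data arrive and why per-chunk counts can be combined in a MapReduce-style reduce step.

```python
from bisect import bisect_right
from collections import Counter
from statistics import mean, pstdev

# Standard SAX breakpoints for a 4-symbol alphabet (quartiles of N(0, 1)).
# Assumption: plain SAX is used here as a stand-in for the thesis's MSAX.
BREAKPOINTS = [-0.6745, 0.0, 0.6745]
ALPHABET = "abcd"


def sax(series, n_segments):
    """Univariate SAX: z-normalize, average each segment (PAA), discretize."""
    mu, sigma = mean(series), pstdev(series)
    z = [(x - mu) / sigma if sigma > 0 else 0.0 for x in series]
    seg = len(z) // n_segments  # assumes len(series) is a multiple of n_segments
    paa = [mean(z[i * seg:(i + 1) * seg]) for i in range(n_segments)]
    return "".join(ALPHABET[bisect_right(BREAKPOINTS, v)] for v in paa)


class BigramModel:
    """Hypothetical bigram model with add-one smoothing over SAX symbols."""

    def __init__(self):
        self.pair_counts = Counter()     # counts of (previous, next) symbols
        self.context_counts = Counter()  # counts of previous symbols

    def update(self, word):
        """Add the bigrams of one symbolic word; earlier counts are kept."""
        for prev, nxt in zip(word, word[1:]):
            self.pair_counts[prev, nxt] += 1
            self.context_counts[prev] += 1

    def merge(self, other):
        """Reduce step: sum the counts learned on another chunk."""
        self.pair_counts.update(other.pair_counts)
        self.context_counts.update(other.context_counts)

    def prob(self, prev, nxt):
        """Smoothed estimate of P(next symbol | previous symbol)."""
        return (self.pair_counts[prev, nxt] + 1) / (
            self.context_counts[prev] + len(ALPHABET)
        )

    def predict(self, prev):
        """Most probable next symbol after `prev`."""
        return max(ALPHABET, key=lambda s: self.prob(prev, s))


# Usage: learn from a historical chunk, then fold in a newer chunk.
history = [0, 0, 5, 5, 0, 0, 5, 5, 0, 0, 5, 5]
model = BigramModel()
model.update(sax(history, 6))

new_chunk = [0, 0, 5, 5, 0, 0, 5, 5]
model.update(sax(new_chunk, 4))  # no retraining on `history` is needed
print(sax(history, 6), model.predict("a"))
```

Because the model is just a table of counts, updating it with new data or merging tables from independent chunks is an addition, which matches the reuse property described in conclusion 3(b). Bigrams that span chunk boundaries are ignored in this sketch.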
Keywords/Search Tags: High Dimensional and Incomplete Time Series, Distributed Time Series Analysis, Frequency Correlation, Clustering, Multidimensional Time Series Symbolic Method