Analysis Of Multivariate Time Series Under Big Data Environment

Posted on: 2018-08-29
Degree: Master
Type: Thesis
Country: China
Candidate: T Li
Full Text: PDF
GTID: 2348330518996193
Subject: Information and Communication Engineering
Abstract/Summary:
Hadoop is the de facto standard for big data research. As the study of big data has deepened, the deficiencies of Hadoop's MapReduce (MR) model have become more obvious. For iterative computation, Spark has attracted attention. In the field of data mining, Spark has been at the forefront of academic research on underlying algorithms and models. For these reasons, Spark has been widely used in production.

In this paper, based on an analysis of the problems of Hadoop and Spark, a Spark cluster and a Hadoop cluster are built for the experiments. The paper then introduces the four modules of Spark, analyzes the basic algorithms of the Spark machine learning library, and summarizes the related knowledge of time series analysis. On this basis, we focus on multivariate time series models and algorithms on the Spark platform.

Similarity measurement is a basic and important topic in time series analysis. Building on similarity measures for time series, we analyze the similarity of multivariate time series. On the basis of the multidimensional dynamic time warping (DTW) algorithm, we propose a normalization method to eliminate the influence of the data's dimensions (units and scales) on the result, and implement it on the Spark platform.

The vector autoregressive (VAR) model is used to study the relationships among the variables of a multivariate time series. It is also one of the easiest models to apply for the analysis and prediction of multivariate time series. We designed and implemented the VAR and structural VAR (SVAR) models on the Spark framework. To verify the performance of the program, we tested datasets of different sizes on two platforms, R and Spark. The experimental results show that this scheme is effective when the data is large.

Finally, this paper summarizes the full text, in particular its shortcomings and future research work.
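The abstract does not specify the normalization method or the Spark implementation, so the following is only a minimal single-machine sketch of the general idea: normalize each variable before computing a multidimensional DTW distance, so that variables measured in large units do not dominate. The per-variable z-score normalization, the Euclidean point distance, and the function names `z_normalize` and `multivariate_dtw` are assumptions for illustration, not the thesis's method.

```python
import numpy as np


def z_normalize(series):
    """Scale each variable (column) to zero mean and unit variance.

    Assumption: z-score normalization stands in for the thesis's
    normalization method, which the abstract does not describe.
    """
    series = np.asarray(series, dtype=float)
    mean = series.mean(axis=0)
    std = series.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant columns
    return (series - mean) / std


def multivariate_dtw(a, b):
    """DTW distance between two series of shape (length, n_variables).

    Both series are normalized per variable first; the local cost is the
    Euclidean distance between the two observation vectors.
    """
    a, b = z_normalize(a), z_normalize(b)
    n, m = len(a), len(b)
    # cost[i, j] is the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],      # step in a only
                                 cost[i, j - 1],      # step in b only
                                 cost[i - 1, j - 1])  # step in both
    return cost[n, m]
```

With this sketch, re-expressing one variable in a different unit (for example multiplying a column by 0.001) leaves the distance essentially unchanged, which is the effect a normalization step is meant to achieve. A distributed version would typically compute many such pairwise distances in parallel across a cluster; that part is not shown here.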
Keywords/Search Tags: Big Data, Spark, multivariate time series, Dynamic time warping, Vector autoregressive model