
Research On Method Of Time Series Data Processing Based On Hadoop Platform

Posted on: 2016-10-18
Degree: Master
Type: Thesis
Country: China
Candidate: D Wang
GTID: 2308330470957738
Subject: Computer software and theory
Abstract/Summary:
With the rapid development of information technology, information systems of many kinds have become widely used in daily life. These systems produce massive amounts of time series data, and how to use this data efficiently and mine useful information from it is a hot topic in data processing.

We first analyze the key technologies of existing distributed systems, then study time series data analysis, and finally propose a method and system for incrementally processing time series data on the Hadoop platform. The main contributions of this dissertation are as follows:

1. We analyze the key technologies of distributed systems and the Hadoop platform. We first analyze the key technologies of distributed systems, then build a Hadoop platform and carry out a practical data processing experiment based on the MapReduce computation model, and finally analyze the experimental results in detail.

2. We analyze and improve algorithms for time series prediction and time series similarity measurement. We first study the real data, model it, and use the model to predict the time series. Then, building on an analysis of existing time series similarity measurement algorithms, we propose Inc-DTW, an algorithm that supports similarity measurement over incrementally arriving time series data. We demonstrate the efficiency of the algorithm both experimentally and theoretically.

3. Data grows continuously over time during processing, and how to compute over incremental data efficiently is currently a hot research topic in data processing. Combining the characteristics of the Hadoop platform and of time series data, we propose TSI-Hadoop, a method and system for incremental time series data processing based on the Hadoop platform.
TSI-Hadoop has the following features: (1) it provides common support for temporal data processing algorithms; (2) it proposes a segmented incremental computation mode for time series data based on the MapReduce computation model; (3) it proposes a stateful sliding-window incremental computation method suited to the characteristics of time series data. Finally, we conduct experiments to verify the effectiveness of the method and system.
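
The abstract does not give the details of Inc-DTW or of the stateful incremental computation in TSI-Hadoop. As a rough illustration of the general idea only, the sketch below assumes that DTW refers to dynamic time warping and shows one generic way to update a DTW distance as new points arrive: the last column of the accumulated-cost matrix is kept as state, so each new point costs O(m) work for a query of length m instead of a full recomputation. The names `IncrementalDTW` and `dtw_batch` are hypothetical and are not taken from the thesis.

```python
import math


def dtw_batch(query, series):
    """Reference DTW: recompute the full accumulated-cost matrix from scratch."""
    m, n = len(query), len(series)
    d = [[math.inf] * (n + 1) for _ in range(m + 1)]
    d[0][0] = 0.0
    for j in range(1, n + 1):
        for i in range(1, m + 1):
            cost = abs(query[i - 1] - series[j - 1])
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[m][n]


class IncrementalDTW:
    """Hypothetical helper: DTW between a fixed query and a growing series.

    Only the most recent column of the accumulated-cost matrix is stored,
    which is the state carried from one update to the next.
    """

    def __init__(self, query):
        self.query = list(query)
        # Boundary column (no series points yet): D[0][0] = 0, D[i][0] = inf.
        self.column = [0.0] + [math.inf] * len(self.query)

    def append(self, value):
        """Consume one new series point and return the updated DTW distance."""
        prev = self.column
        cur = [math.inf] * (len(self.query) + 1)  # cur[0] stays inf
        for i, q in enumerate(self.query, start=1):
            cost = abs(q - value)
            cur[i] = cost + min(prev[i], cur[i - 1], prev[i - 1])
        self.column = cur
        return cur[-1]


if __name__ == "__main__":
    inc = IncrementalDTW([1.0, 2.0, 3.0])
    for point in [1.0, 2.0, 2.0, 3.0]:
        print(point, inc.append(point))
```

Each call to `append` returns the same value that `dtw_batch` would give for the series seen so far, which makes the batch function a convenient check for the incremental one. A sliding-window variant would additionally need to discard old points, which this sketch does not attempt.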
Keywords/Search Tags: time series data, Hadoop, time series data analysis, incremental computation