A Research And Application Of Mining Frequent Patterns Based On Time Series

Posted on: 2017-04-23
Degree: Master
Type: Thesis
Country: China
Candidate: B Q Zheng
GTID: 2308330485486069
Subject: Computer software and theory
Abstract/Summary:
With the continued growth of the Internet in the Big Data era, an increasing number of users engage in social networking, online shopping, and web browsing. Meanwhile, web backends generate large volumes of data recording user actions such as browsing, buying, and clicking. These data include structured, semi-structured, and unstructured data, and they have in turn promoted the development of big data technology. Within that field, data mining has become a popular and important technology, driven by the practical need to process massive user-behavior data and discover the deeper patterns in it. Frequent pattern mining is an important research direction within data mining.

In this thesis, building on traditional research in time series data mining, we take time series analysis in the meteorological field as the practical application background and study four aspects of frequent pattern mining on time series: time series symbolization, frequent itemset mining, frequent sequence mining, and frequent pattern mining on the Hadoop platform. We propose improvements to the key algorithms for time series symbolization and for frequent itemset mining on time series, and obtain some results.

Time series data have inherent structural characteristics, such as high dimensionality, continuity, and noise of various kinds introduced by real-world observation equipment. For this reason, time series processing usually first transforms the series into a discrete, ordered string and performs the subsequent mining tasks on the resulting character sequence. In this thesis, to better identify local trend changes when mining meteorological time series, we put forward a symbolization algorithm based on incremental error.
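To make the symbolization step described above concrete, here is a minimal Python sketch of one generic way to turn a numeric series into a discrete string. It is not the incremental-error algorithm proposed in the thesis, whose details the abstract does not give. The function name `symbolize_trend`, the alphabet `u`/`d`/`f`, and the `window` and `tol` parameters are all assumptions made for illustration.

```python
from typing import List


def symbolize_trend(series: List[float], window: int = 3, tol: float = 0.5) -> str:
    """Map a numeric series to a string over {'u', 'd', 'f'} (up, down, flat).

    Hypothetical illustration only: a plain windowed-mean trend labelling,
    not the thesis's incremental-error symbolization.
    """
    if window < 1:
        raise ValueError("window must be >= 1")
    # Reduce dimensionality: mean of each non-overlapping window.
    # A trailing partial window is dropped.
    means = [
        sum(series[i:i + window]) / window
        for i in range(0, len(series) - window + 1, window)
    ]
    symbols = []
    # Label the change between consecutive window means.
    for prev, cur in zip(means, means[1:]):
        delta = cur - prev
        if delta > tol:
            symbols.append("u")
        elif delta < -tol:
            symbols.append("d")
        else:
            symbols.append("f")
    return "".join(symbols)


# Made-up temperature readings, four windows of three samples each.
temps = [10.0, 10.2, 10.1, 12.0, 12.5, 13.0, 12.6, 12.7, 12.5, 9.0, 8.5, 8.0]
word = symbolize_trend(temps)
print(word)
```

The resulting string can then be fed to any frequent itemset or frequent sequence miner, which is the reason for symbolizing first: the miner works on a small alphabet instead of noisy real values.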
Secondly, to better handle huge volumes of time series data, this thesis uses the Map-Reduce model of Hadoop to implement a load-balanced distributed version of the FP-growth algorithm. Finally, a time series data mining system with a graphical visual interface was implemented in Python, combining the algorithms and solutions above.
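As a rough illustration of how the Map-Reduce model fits frequent pattern mining, the sketch below simulates in a single Python process the support-counting pass that FP-growth performs before building its tree. It does not use the Hadoop API and does not reproduce the thesis's load-balancing scheme. The function names `map_phase`, `shuffle`, and `reduce_phase` and the sample transactions are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import chain
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(transaction: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Emit (item, 1) once per distinct item in a transaction."""
    for item in set(transaction):
        yield item, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group emitted values by key, as the framework does between map and reduce."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped: Dict[str, List[int]], min_support: int) -> Dict[str, int]:
    """Sum the counts per item and keep only items meeting the support threshold."""
    totals = {item: sum(counts) for item, counts in grouped.items()}
    return {item: n for item, n in totals.items() if n >= min_support}


# Made-up transactions over a symbol alphabet (hypothetical data).
transactions = [["u", "f", "d"], ["u", "d"], ["u", "f"], ["d"]]
pairs = chain.from_iterable(map_phase(t) for t in transactions)
frequent = reduce_phase(shuffle(pairs), min_support=3)
print(frequent)
```

In a real cluster the map calls run on separate data splits and the shuffle is done by the framework; how the later tree-building work is partitioned across reducers is where load balancing matters.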
Keywords/Search Tags: time-series, frequent patterns, time-series symbolization, Map-Reduce