Research And Implementation Of Frequent Pattern Mining Algorithms Over Data Streams

Posted on: 2008-05-14
Degree: Master
Type: Thesis
Country: China
Candidate: J X Jiang
Full Text: PDF
GTID: 2178360242467359
Subject: Software engineering
Abstract/Summary:
Data mining is a field of database research with high application value. It aims to extract implicit, previously unknown, and potentially useful information and patterns from large databases, and it has matured over more than ten years of development. However, with the rise of applications such as e-commerce, sensor networks, and stock data analysis, a new data model, the data stream, has been proposed. Data in these applications is huge in volume and must be processed in the order in which it arrives. Mining data streams is therefore a challenging task of high value in database research.

This thesis studies frequent pattern mining over data streams. It analyzes the differences between the data stream model and the traditional data model, the main data processing techniques, and the current mining tasks. To address the problem of mining frequent patterns over data streams, the thesis examines the classical FP-stream algorithm. A batch-processing algorithm based on the sliding window and on the theory of data stream segmentation, called DSFP-SW, is applied to the problem of mining long frequent item sets.

DSFP-SW is based on a batching approach. The data stream is partitioned, and every partition is regarded as a sliding window. Each sliding window is then divided into several basic windows. The frequent item sets of every basic window are mined with existing frequent pattern algorithms and stored in a new prefix-tree data structure called the DSFP-SW-tree. The algorithm also adopts a pruning technique. As the sliding window is updated, the frequent item sets in the window can be found rapidly from this tree. Experiments on data produced by the IBM test data generator show the feasibility and effectiveness of the algorithm.
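
The batching idea described above can be illustrated with a minimal Python sketch. It is not the thesis's DSFP-SW implementation: the class and function names are hypothetical, a flat counter stands in for the DSFP-SW-tree prefix tree, a brute-force enumerator stands in for "existing frequent pattern algorithms", and pruning is reduced to dropping item sets whose count falls to zero when a basic window expires.

```python
from collections import Counter, deque
from itertools import combinations


def mine_basic_window(transactions, min_count, max_len=3):
    """Brute-force stand-in for an existing frequent pattern miner.

    Counts every item set of up to max_len items and keeps those that
    occur in at least min_count transactions of this basic window.
    """
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for k in range(1, min(max_len, len(items)) + 1):
            for itemset in combinations(items, k):
                counts[itemset] += 1
    return {s: c for s, c in counts.items() if c >= min_count}


class SlidingWindowSketch:
    """Hypothetical sliding window made of a fixed number of basic windows."""

    def __init__(self, num_basic_windows, min_support, local_min_count=1):
        self.num_basic_windows = num_basic_windows
        self.min_support = min_support          # fraction of window transactions
        self.local_min_count = local_min_count  # threshold inside a basic window
        self.windows = deque()                  # (n_transactions, {itemset: count})
        self.totals = Counter()                 # flat stand-in for the prefix tree
        self.n_transactions = 0

    def add_basic_window(self, transactions):
        """Slide the window: expire the oldest basic window, then add a new one."""
        if len(self.windows) == self.num_basic_windows:
            old_n, old_counts = self.windows.popleft()
            self.n_transactions -= old_n
            for itemset, count in old_counts.items():
                self.totals[itemset] -= count
                if self.totals[itemset] <= 0:
                    del self.totals[itemset]    # simplified pruning step
        counts = mine_basic_window(transactions, self.local_min_count)
        self.windows.append((len(transactions), counts))
        self.n_transactions += len(transactions)
        self.totals.update(counts)              # Counter.update adds counts

    def frequent_itemsets(self):
        """Item sets whose stored count reaches min_support over the window."""
        threshold = self.min_support * self.n_transactions
        return {s: c for s, c in self.totals.items() if c >= threshold}
```

With `local_min_count=1` the stored counts are exact, at the cost of keeping every item set up to `max_len`. A higher local threshold saves memory but makes the window-level counts underestimates, which is the kind of trade-off a pruning strategy has to manage.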
Keywords/Search Tags: Data Mining, Data Stream Mining, Frequent Patterns, Sliding Window