Mining Frequent Closed Patterns In Data Streams

Posted on: 2008-03-14
Degree: Master
Type: Thesis
Country: China
Candidate: Z L Cheng
Full Text: PDF
GTID: 2178360242960582
Subject: Computer application technology
Abstract/Summary:
A data stream is an unbounded sequence of data that arrives at high speed and changes continuously over time. Frequent pattern mining in data streams has been studied extensively in data mining, and many algorithms have been proposed and implemented successfully. However, frequent pattern mining often generates a huge number of frequent itemsets and corresponding association rules, many of which are redundant or useless and are difficult to comprehend and manipulate. Adopting frequent closed patterns can dramatically reduce the size of the frequent pattern representation without loss of information, so mining them has become an important research issue for data streams.

This dissertation studies frequent closed pattern mining in data streams. The main contents are as follows:

(1) The research background and the primary tasks of data mining are outlined, and the definitions, methods, and main algorithms for association rules in the data mining domain are described.

(2) The characteristics of data streams and the data stream management model are described, and several classical algorithms for mining frequent patterns in data streams are discussed in detail.

(3) The concepts and properties of frequent closed patterns are introduced, together with the mathematical foundations on which they are based, and the relationship between frequent closed patterns and frequent patterns is described. CLOSET, a classical algorithm for mining frequent closed patterns in finite stored datasets, is then explained in detail.

(4) Building on existing research on frequent pattern mining in data streams, a new algorithm (AMFCIDS) is proposed for mining frequent closed patterns in data streams. The data stream is partitioned into a set of segments, and a DSFCI-tree is used to store the potentially frequent closed patterns dynamically. When each batch of data arrives, the algorithm first builds a corresponding local DSFCI-tree, then updates and prunes the global DSFCI-tree to mine the frequent closed patterns over the entire data stream. Experiments and analysis show that the algorithm performs well.
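The claim that closed patterns shrink the representation "without loss of information" can be illustrated with a small Python sketch. This is not AMFCIDS, CLOSET, or the DSFCI-tree, none of which are specified in this abstract; it is a brute-force enumeration over a made-up four-transaction dataset, and the function names and the threshold of 2 are assumptions chosen for illustration. An itemset is treated as closed when no proper superset has the same support.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_count):
    """Brute-force support counting; only suitable for toy data."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            count = sum(1 for t in transactions if itemset <= t)
            if count >= min_count:
                result[itemset] = count
    return result


def closed_itemsets(frequent):
    """Keep itemsets that have no proper superset with the same support."""
    return {
        s: c
        for s, c in frequent.items()
        if not any(s < other and c == oc for other, oc in frequent.items())
    }


def support_from_closed(itemset, closed):
    """Recover a frequent itemset's support from its closed supersets."""
    return max(c for s, c in closed.items() if itemset <= s)


# Hypothetical toy data, not taken from the thesis.
transactions = [frozenset(t) for t in (
    {"a", "b", "c"},
    {"a", "b", "c"},
    {"a", "b"},
    {"b", "d"},
)]

frequent = frequent_itemsets(transactions, min_count=2)
closed = closed_itemsets(frequent)

for itemset, count in sorted(closed.items(), key=lambda kv: len(kv[0])):
    print(sorted(itemset), count)
```

On this toy data, seven frequent itemsets reduce to three closed ones ({b}, {a, b}, and {a, b, c}), and `support_from_closed` returns the original support of every frequent itemset. A streaming algorithm would avoid this exhaustive enumeration, which grows exponentially with the number of items.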
Keywords/Search Tags: data stream, data mining, association rule, frequent itemsets, frequent closed patterns