Fp-tree-based Frequent Pattern And Long Pattern Mining

Posted on: 2004-09-21
Degree: Master
Type: Thesis
Country: China
Candidate: B Z Wang
Full Text: PDF
GTID: 2208360095450173
Subject: Computer software and theory
Abstract/Summary:
Data mining arose from a situation often described as "data rich but information poor". As an important data mining task, frequent pattern mining is used in many applications, such as mining associations, correlations, causality, sequential patterns, episodes, multi-dimensional patterns, max-patterns, partial periodicity, and emerging patterns.

For a long time, Apriori-like algorithms have been the standard approach to mining frequent patterns. However, they require many database scans to check the occurrence frequencies of a huge number of candidate patterns. The FP-Growth algorithm adopts a pattern fragment growth method and scans the database only twice. It is about an order of magnitude faster than the Apriori algorithm. However, it still encounters performance bottlenecks when it recursively creates conditional FP-trees during the mining process. In addition, the algorithm does not adapt well to databases with different characteristics.

We propose two novel frequent pattern mining algorithms, LIFPG and MIFPG, both based on the pattern fragment growth method. The two algorithms adopt suitable search strategies. Mining is performed directly on the compressed data structure that is created initially, and no additional data structure is needed, which improves mining efficiency. The performance study shows that they are about four times faster than the FP-Growth algorithm and need about half the space. They also scale better on dense data sets.

An efficient algorithm for mining max-patterns is proposed based on LIFPG. As a depth-first algorithm, it mines frequent patterns that may be long, and it adopts a divide-and-conquer filtering method that greatly reduces the search space. The performance study shows that it is several times faster than recently proposed algorithms. It also performs well on dense data sets.
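As background for the two-scan property of FP-Growth mentioned in the abstract, the following is a minimal sketch of standard FP-tree construction. It is not the thesis's LIFPG or MIFPG algorithm, whose details are not given here. The names `FPNode`, `build_fp_tree`, and `count_nodes` are hypothetical and chosen for illustration, and the tie-breaking rule (alphabetical order among items of equal support) is an assumption.

```python
from collections import Counter


class FPNode:
    """One node of an FP-tree: an item, its count, and child links."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_support):
    """Build an FP-tree in two passes over `transactions` (illustrative sketch)."""
    # Scan 1: count the support of every single item.
    support = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in support.items() if c >= min_support}

    # Scan 2: insert each transaction, keeping only frequent items,
    # ordered by descending support (ties broken by item name; an assumption).
    root = FPNode(None)
    for t in transactions:
        items = sorted((i for i in set(t) if i in frequent),
                       key=lambda i: (-frequent[i], i))
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, parent=node)
                node.children[item] = child
            child.count += 1
            node = child
    return root, frequent


def count_nodes(node):
    """Number of item nodes below `node` (the root itself is excluded)."""
    return sum(1 + count_nodes(c) for c in node.children.values())
```

For example, with the transactions `[["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "d"]]` and `min_support=2`, item `d` is dropped after the first scan, and the eight remaining item occurrences share prefixes so that the tree holds only five nodes. This prefix sharing is the compression that lets mining proceed on the tree rather than on the raw database.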
Keywords/Search Tags: data mining, frequent pattern, pattern fragment growth, max-pattern