The Techniques Research On Frequent Pattern Mining

Posted on: 2006-06-14
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H B Ma
GTID: 1118360155460403
Subject: Computer software and theory
Abstract/Summary:
With the spread of computers and information technology, and with the rapid development of high-capacity storage, large amounts of data accumulate in daily work and in scientific research. How to extract, or "mine", useful information from these data is a major challenge for researchers in information science. Frequent pattern mining is a basic problem of data mining; it covers the mining of transactions, sequences, trees and graphs. Algorithms for it are widely used in many other data mining tasks, such as association analysis, periodicity analysis, maximal and closed pattern mining, querying, classification and indexing. Because the problem lays the groundwork for other problems and is intrinsically complex, algorithms for frequent pattern mining have become the focus of many researchers.

This thesis addresses several techniques related to frequent pattern mining: the introduction of Inter-Relevant Successive Trees into frequent pattern mining algorithms, mining frequent itemsets and frequent closed itemsets with a static IS-tree, mining embedded frequent trees in a forest of ordered trees by the pattern growth method, mining induced frequent trees in a forest of unordered trees, and the relevant implementation techniques. Major contributions of this thesis include:

1) Frequent Pattern Mining Based on IS+-Tree

The IS-tree is a recently presented mathematical model that has been successfully applied to full-text indexing and storage in text databases. In this thesis, its application is extended to data mining, and an algorithm is presented for mining frequent patterns based on the IS+-tree. The algorithm scans the transaction database only once. The mining process is associated with only one root tree. In addition, the IS+-tree can be updated dynamically and incrementally.
A performance comparison study shows that the algorithm is efficient.

2) Mining Frequent Patterns Based on Static IS-tree

The IS+-tree emphasizes generality at the cost of efficiency. A specific static IS-tree is therefore proposed for mining frequent patterns efficiently. The algorithm builds frequent patterns directly, instead of using the high-cost candidate set generate-and-test method used by Apriori. It generates frequent patterns with a depth-first, pattern growth approach, and works on a static IS-tree rather than the costly dynamic trees adopted by FP-growth. In order to reduce search space, it uses different...
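
To make the contrast between candidate generate-and-test and depth-first pattern growth concrete, the following is a minimal Python sketch of pattern growth over a fixed, read-only index. It is not the thesis's IS-tree or IS+-tree algorithm, whose structure the abstract does not specify: plain per-item sets of transaction ids stand in for the static tree, and the function name `mine_frequent_itemsets` and its parameters are hypothetical.

```python
from typing import Dict, Hashable, List, Sequence, Set, Tuple

Pattern = Tuple[Hashable, ...]


def mine_frequent_itemsets(
    transactions: Sequence[Sequence[Hashable]], min_support: int
) -> Dict[Pattern, int]:
    """Depth-first pattern growth over a static index (illustrative stand-in)."""
    # Single scan of the database: map each item to the ids of the
    # transactions that contain it. The index is never modified afterwards.
    tidsets: Dict[Hashable, Set[int]] = {}
    for tid, transaction in enumerate(transactions):
        for item in set(transaction):
            tidsets.setdefault(item, set()).add(tid)

    # Keep only frequent single items, in a fixed order so that every
    # itemset is enumerated exactly once.
    frequent_items: List[Tuple[Hashable, Set[int]]] = sorted(
        ((item, tids) for item, tids in tidsets.items() if len(tids) >= min_support),
        key=lambda pair: pair[0],
    )

    result: Dict[Pattern, int] = {}

    def grow(prefix: Pattern, prefix_tids: Set[int], start: int) -> None:
        # Extend the current pattern by one item at a time. Support is
        # obtained by intersecting id sets, so no candidate set is
        # generated and then tested against the database.
        for index in range(start, len(frequent_items)):
            item, tids = frequent_items[index]
            new_tids = tids if not prefix else prefix_tids & tids
            if len(new_tids) < min_support:
                continue  # no superset of an infrequent pattern is frequent
            pattern = prefix + (item,)
            result[pattern] = len(new_tids)
            grow(pattern, new_tids, index + 1)

    grow((), set(), 0)
    return result


if __name__ == "__main__":
    db = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]
    for pattern, support in sorted(mine_frequent_itemsets(db, 3).items()):
        print(pattern, support)
```

The sketch reads the database once and then works only on the prebuilt index, which mirrors the static-structure idea in spirit; it builds no conditional trees, unlike FP-growth.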
Keywords/Search Tags: Data Mining, Frequent Patterns, Frequent Closed Patterns, Frequent Subtrees, Inter-Relevant Successive Tree Model