The Techniques Research On Frequent Pattern Mining

Posted on:2006-06-14

Degree:Doctor

Type:Dissertation

Country:China

Candidate:H B Ma

Full Text:PDF

GTID:1118360155460403

Subject:Computer software and theory

Abstract/Summary:

With the spread of computer and information technology and the rapid development of high-capacity storage, large amounts of data accumulate in daily work and in scientific research. How to extract, or "mine", useful information from these data is a major challenge for researchers in information science. Frequent pattern mining is a basic problem of data mining; it covers the mining of transactions, sequences, trees, and graphs. Algorithms for it are widely used in many other data mining tasks, such as association analysis, periodicity analysis, maximal and closed pattern mining, querying, classification, and indexing. Because it lays the groundwork for other problems and is intrinsically complex, frequent pattern mining has become a focus for many researchers.

This thesis addresses several techniques related to frequent pattern mining: the introduction of Inter-Relevant Successive Trees into frequent pattern mining algorithms, mining frequent itemsets and frequent closed itemsets using a static IS-tree, mining embedded frequent trees in a forest of ordered trees by the pattern growth method, mining induced frequent trees in a forest of unordered trees, and the relevant implementation techniques. The major contributions of this thesis include:

1) Frequent Pattern Mining Based on IS+-Tree

The IS-tree is a recently presented mathematical model that has been successfully applied to full-text indexing and storage in text databases. This thesis extends its application to data mining and presents an algorithm for mining frequent patterns based on the IS+-tree. The algorithm scans the transaction database only once, the mining process involves only one root tree, and the IS+-tree can be updated incrementally. A performance comparison study shows that the algorithm is efficient.

2) Mining Frequent Patterns Based on Static IS-tree

The IS+-tree emphasizes generality at the cost of efficiency. A specific static IS-tree is therefore proposed for mining frequent patterns efficiently. The algorithm builds frequent patterns directly instead of using the high-cost candidate generation-and-test method used by Apriori. It generates frequent patterns by a depth-first, pattern growth approach and works on a static IS-tree rather than the costly dynamic trees adopted by FP-growth. In order to reduce the search space, it uses different...
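The depth-first, pattern growth idea that the abstract contrasts with Apriori's candidate generation-and-test can be illustrated with a minimal sketch. This is not the thesis's IS-tree algorithm, whose details are not given here: it swaps in plain projected transaction lists in place of any tree structure, and the function name `mine_frequent_itemsets` and its parameters are hypothetical, chosen only for illustration.

```python
from collections import Counter


def mine_frequent_itemsets(transactions, min_support):
    """Return {frozenset(itemset): support} for every itemset whose
    support (number of containing transactions) is >= min_support.

    Illustrative only: uses projected transaction lists, not an IS-tree.
    """
    results = {}

    def grow(prefix, projected):
        # Count how many projected transactions contain each item.
        counts = Counter(item for t in projected for item in t)
        for item in sorted(counts):
            support = counts[item]
            if support < min_support:
                continue  # infrequent extension: prune this branch
            pattern = prefix + (item,)
            results[frozenset(pattern)] = support
            # Project: keep transactions containing `item`, restricted to
            # items ordered after it, so each itemset is visited once.
            next_projected = [
                tuple(i for i in t if i > item)
                for t in projected
                if item in t
            ]
            grow(pattern, next_projected)  # depth-first growth

    # Deduplicate and sort the items inside each transaction.
    grow((), [tuple(sorted(set(t))) for t in transactions])
    return results
```

Calling `mine_frequent_itemsets([["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"]], 2)` yields the three single items and the three pairs, while `{a, b, c}` is dropped because only one transaction contains it. No candidate set is generated and tested against the full database; each frequent pattern is extended directly from the transactions that already contain its prefix.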

Keywords/Search Tags:

Data Mining, Frequent Patterns, Frequent Closed Patterns, Frequent Subtrees, Inter-Relevant Successive Tree Model

Related items

1	Study On Frequent Pattern Mining Algorithms And Pruning Strategies
2	The Techniques Research On Frequent Pattern Mining
3	Research On Mining And Querying Frequent Patterns Based On Simplified Frequent Pattern Tree
4	Research On Mining Maximal Frequent Subtrees Quickly And Efficiently
5	Mining Frequent Closed Patterns In Data Streams
6	Research On Data Mining Technology For Very Large Databases
7	The Research On The Distributed Algorithm Of Mining Frequent Closed Patterns
8	Research On Key Algorithms For Mining Frequent Patterns In Data Streams And Their Application
9	The Research And Realization Of Mining Frequent Patterns On Business Data Streams
10	The Research On Frequent Subtrees Mining And Corresponding Techniques