Research On The Method Of Condensing Association Rules

Posted on:2008-03-25

Degree:Master

Type:Thesis

Country:China

Candidate:H Jiang

Full Text:PDF

GTID:2178360242471990

Subject:Computer software and theory

Abstract/Summary:

PDF Full Text Request

Data mining refers to extracting or "mining" knowledge from large amounts of data. As one of the main patterns in the field of data mining, association rules are used to determine the relationships among the attributes or objects, to find out valuable dependencies among the fields. Discovery of frequent itemsets is the most fundamental and important problem in mining association rules. Since the pioneering work of Agrawal, there has been an increase of researching on this problem. When the minimum support is lower or there are too many long patterns, frequent itemsets mining may produce a large quantity of frequent patterns. It is difficult to understand the meaning of these patterns and discover interesting patterns from them. To reduce the set of frequent patterns generated in data mining, recent studies have been working on mining compressed set of frequent patterns, which include maximal patterns and closed patterns.Most of current algorithms of mining maximum frequent itemsets need to protect lots of conditional itemsets and make superset checking. While the amount of the existing maximum frequent itemsets is too large, superset checking will become bottleneck of the algorithm. In this paper, a new algorithm BF-DMFI is proposed, which makes use of adding a mark of each node and signing the node efficiently to reduce the number of maximum conditional frequent itemsets and the cost of superset checking. Comparing with the. current algorithms, the performance of this algorithm is improved in some degree.Width-first algorithm and depth-first algorithm are two kinds of algorithm for mining maximum frequent itemsets. Most of the width-first algorithms need to protect lots of conditional itemsets and scan the database or search FP-Tree many times. And most of the depth-first algorithms need to construct condition pattern tree and mine maximum frequent itemsets recursively, which will consumes more execution time and memory space. In this paper, a new depth-first algorithm BF-DMFI is put forward, which is also based on FP-Tree, but the recursion is not included. We can make use of constructing frequent son-sets and prequent prefix-sets of each frequent node and linking togher to generate maximum frequent itemsets. Making use of MFI-Tree we can make superset checking and prune the FP-Tree effectively, so the function of this algotithm is improved.Most of current algorithms of mining maximum frequent itemsets and frequent closed itemsets have high time and space complexity, because they mine maximum frequent itemsets and frequent closed itemsets from transactional database directly. It designs a new kind of algorithms BFI-DMFI and BFI-DCFI in this paper. We can determine whether a frequent itemsets is maximal frequent itemsets through detecting whether exiting their superset itemsets in frequent itemsets. Also we can make using of mining maximum frequent itemsets from frequent itemsets which have equal support in order to generate frequent closed itemsets. They make better performance than the current algorithms.On the foundation of above-mentioned research, a tool model of mining association rules is designed at the end of this paper. The tool can mine association rules based on frequent itemsets, closed frequent itemsets and maximal frequent itemsets. In order to condense the result set of minging rules, it also can mine consraint-based association rules, which are difined by users.

Keywords/Search Tags:

data mining, association rules, frequent itemsets, maximal frequent itemsets, closed frequent itemse

PDF Full Text Request

Related items

1	Research On Key Algorithms For Mining Frequent Patterns In Data Streams And Their Application In Simulation System
2	Research On Algorithms For Mining Maximal Frequent Itemsets
3	An Algorithm And Context Analysis Of Mining Frequent Closet Itemsets
4	Research On Frequent Closed Itemsets Mining Algorithms
5	Research On Key Algorithms For Mining Frequent Patterns In Data Streams And Their Application
6	Research On Mining Algorithms Of Maximal Frequent Itemsets And Opened Frequent Itemsets
7	Research On Top-K Frequent Itemsets Datamining Algorithm
8	Association Rules Algorithm And Its Applications In Medical Data Mining
9	FP-Tree Based Mining Frequent Itemsets Over Data Streams
10	Study On Frequent Pattern Mining Algorithms And Pruning Strategies