FP-tree-based Association Rule Mining Algorithm Design and Implementation

Posted on: 2006-09-12 | Degree: Master | Type: Thesis
Country: China | Candidate: N L Liu | Full Text: PDF
GTID: 2208360155966198 | Subject: Computer software and theory
Abstract/Summary:
Data mining is an effective approach to the problem of abundant data but scant information. It is currently at the research frontier of information science, and related research and applications have greatly improved decision support capabilities. It is regarded as a field with broad application prospects in database research. This thesis describes the concepts, functions, and patterns of data mining. Among the many data mining tasks, association rule mining is an important one, and mining frequent itemsets is the key problem within it. Because the maximal frequent itemsets cover all frequent itemsets (every frequent itemset is a subset of some maximal frequent itemset), the problem of mining frequent itemsets can be converted into the problem of mining maximal frequent itemsets, which makes mining maximal frequent itemsets very important in data mining. Many previous algorithms mine maximal frequent itemsets by first generating candidate itemsets and then pruning them, but generating candidate itemsets is very costly, especially when long patterns exist. This thesis mainly studies the mining of maximal frequent patterns based on the FP-tree. First, we study the definition and construction of the FP-tree and its improved algorithms, and analyze the feasibility and completeness of the FP-tree. We then propose Max-FI, an algorithm for mining maximal frequent patterns that does not need to generate maximal candidate itemsets. The improved FP-tree is a one-way tree: its nodes hold no pointers to their children, so at least one third of the memory is saved. Our experimental results show that Max-FI is more effective than the FP-tree-based algorithm DMFIA for mining maximal frequent patterns. Second, we study the problem of mining valid and non-redundant association rules.
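To make the "one-way tree" idea concrete, the following is a minimal Python sketch of an FP-tree whose nodes store only a parent pointer. It is not the Max-FI implementation from the thesis. The names `Node`, `build_fp_tree`, and `prefix_path`, the temporary `index` dictionary used to locate children during construction, and the sample transactions are all assumptions made for illustration.

```python
from collections import Counter


class Node:
    """FP-tree node with a parent pointer only (no child pointers)."""
    __slots__ = ("item", "count", "parent")

    def __init__(self, item, parent):
        self.item = item
        self.count = 0
        self.parent = parent


def build_fp_tree(transactions, min_support):
    """Build a parent-pointer-only FP-tree; min_support is an absolute count."""
    # First scan: count each item once per transaction.
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in counts.items() if c >= min_support}
    # Order items by descending support, breaking ties by item name.
    ordered = sorted(frequent, key=lambda i: (-frequent[i], i))
    rank = {item: r for r, item in enumerate(ordered)}

    root = Node(None, None)
    header = {item: [] for item in rank}  # item -> all nodes carrying it
    # Assumption: a construction-time lookup stands in for child pointers
    # and is discarded once the tree is built.
    index = {}
    # Second scan: insert each transaction's frequent items in rank order.
    for t in transactions:
        node = root
        items = sorted((i for i in set(t) if i in rank), key=rank.__getitem__)
        for item in items:
            key = (id(node), item)
            child = index.get(key)
            if child is None:
                child = Node(item, node)
                index[key] = child
                header[item].append(child)
            child.count += 1
            node = child
    return header, frequent


def prefix_path(node):
    """Walk parent pointers upward and return the items above `node`."""
    path = []
    node = node.parent
    while node is not None and node.item is not None:
        path.append(node.item)
        node = node.parent
    return path[::-1]


# Hypothetical sample data, used only to exercise the sketch.
transactions = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]
header, frequent = build_fp_tree(transactions, min_support=2)
for node in header["c"]:
    print(prefix_path(node), node.count)
```

Mining proceeds bottom-up from the header lists, so each node only ever needs to reach its ancestors; that is why dropping child pointers is possible in a design like this.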
Traditional association rule mining algorithms either generate rules slowly, generate too many redundant rules, or may find rules that have high support and confidence but are uninteresting or even false. Furthermore, they cannot produce rules containing negative items. This thesis proposes a new algorithm, MVNR (Mining Valid and Non-Redundant Association Rules). In this algorithm, we first examine the frequent patterns and delete those that would produce only redundant association rules. Then, for each frequent itemset that remains, we generate its minimal subsets and, using the properties of minimal subsets, delete any minimal subset that is a superset of another existing minimal subset. Finally, we generate association rules according to user-defined conditions. Lastly, we study the application of the maximal frequent pattern mining algorithm (Max-FI) and the valid and non-redundant association rule mining algorithm (MVNR) in a decision support system for Gong An (public security).
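The point that a rule can have high support and confidence yet still be uninteresting can be checked with a small Python sketch. The abstract does not say which correlation measure MVNR uses. Lift is used here purely as an assumed stand-in, and the functions `support` and `rule_metrics` and the tea/coffee data are hypothetical.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def rule_metrics(antecedent, consequent, transactions):
    """Return (support, confidence, lift) for antecedent -> consequent.

    Assumes both sides occur at least once, so no division by zero.
    """
    s_a = support(antecedent, transactions)
    s_c = support(consequent, transactions)
    s_ac = support(set(antecedent) | set(consequent), transactions)
    confidence = s_ac / s_a
    lift = confidence / s_c  # assumed correlation measure, not from the thesis
    return s_ac, confidence, lift


# Hypothetical data: 10 baskets, coffee in 9 of them, tea in 5, both in 4.
transactions = [frozenset(t) for t in (
    [["tea", "coffee"]] * 4 + [["tea"]] * 1 + [["coffee"]] * 5
)]

sup, conf, lift = rule_metrics({"tea"}, {"coffee"}, transactions)
print(f"support={sup:.2f} confidence={conf:.2f} lift={lift:.2f}")
```

Here the rule tea -> coffee passes a typical confidence threshold, yet its lift is below 1: buying tea makes coffee slightly less likely than its baseline frequency, so the rule is misleading despite its high scores.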
Keywords/Search Tags: Association rule, Maximum Frequent Itemset, Frequent Pattern Tree, Correlation, Redundancy