Research On Algorithms For Mining Maximal Frequent Itemsets

Posted on: 2008-06-23
Degree: Master
Type: Thesis
Country: China
Candidate: L S Ma
GTID: 2178360215965726
Subject: Computer application technology
Abstract/Summary:
Data mining is one of the most active research fields, especially in artificial intelligence and database research. It is the process of revealing potentially useful knowledge from massive data. Mining maximal frequent itemsets is one of the most important and fundamental data mining problems. The efficiency of mining maximal frequent itemsets depends on the search strategy, the data-set representation, and superset checking. This thesis analyzes existing algorithms for mining maximal frequent itemsets and finds that they can still be improved in all three respects. Based on this work, the thesis presents a new algorithm for mining maximal frequent itemsets, called NDMFIA. The algorithm develops and integrates the following three techniques to improve mining efficiency: 1. A pruning strategy that reduces the search space. 2. A strategy similar to PEP in the MAFIA algorithm, which reduces not only the search space but also the size of the FP-tree; in addition, a novel concept called the frequent path is introduced, which reduces the size of the FP-tree and discovers maximal frequent itemsets as early as possible. 3. The MFI-tree from the FpMAX algorithm, which stores all maximal frequent itemsets, combined with a projection method that saves comparison time in superset checking. Moreover, most algorithms for mining maximal frequent itemsets do not consider any domain knowledge, and as a result they generate many irrelevant patterns. This thesis therefore introduces a domain-knowledge constraint into NDMFIA, yielding an algorithm for mining constrained maximal frequent itemsets, called NDCMFIA. Finally, comparative experiments show that NDMFIA outperforms previously developed algorithms such as MAFIA and FpMAX.
Keywords/Search Tags:data mining, association rules, maximal frequent itemsets, frequent pattern tree, constrained maximal frequent itemsets
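
To make the terms in the abstract concrete, the following is a minimal Python sketch of what a maximal frequent itemset is and of what superset checking does. It is not NDMFIA, MAFIA, or FpMAX: it uses brute-force enumeration with no FP-tree or MFI-tree, and it assumes that support is an absolute transaction count. The function names and the sample data are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Enumerate all frequent itemsets by brute force (illustration only)."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted({item for t in transactions for item in t})
    frequent = []
    for k in range(1, len(items) + 1):
        level = []
        for candidate in combinations(items, k):
            candidate = frozenset(candidate)
            # Support = number of transactions that contain the candidate.
            support = sum(1 for t in transactions if candidate <= t)
            if support >= min_support:
                level.append(candidate)
        if not level:
            # No frequent k-itemset means no larger itemset can be frequent.
            break
        frequent.extend(level)
    return frequent


def maximal_frequent_itemsets(transactions, min_support):
    """Keep only frequent itemsets that have no frequent proper superset."""
    frequent = frequent_itemsets(transactions, min_support)
    maximal = []
    # Visiting larger itemsets first means each candidate only has to be
    # compared against the maximal itemsets accepted so far.
    for itemset in sorted(frequent, key=len, reverse=True):
        # Superset checking: discard the itemset if an accepted maximal
        # itemset strictly contains it.
        if not any(itemset < m for m in maximal):
            maximal.append(itemset)
    return maximal


if __name__ == "__main__":
    sample = [
        {"a", "b", "c"},
        {"a", "b", "c"},
        {"a", "b", "d"},
        {"b", "d"},
        {"c", "e"},
    ]
    for itemset in maximal_frequent_itemsets(sample, min_support=2):
        print(sorted(itemset))
```

In this sketch the superset check is a linear scan over the accepted maximal itemsets, and that scan is the comparison cost the abstract says the projection method on the MFI-tree is meant to reduce.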